Isearch
Isearch is open-source text retrieval software first developed in 1994 as part of the Isite Z39.50 information framework. The project started at the Clearinghouse for Networked Information Discovery and Retrieval (CNIDR) at MCNC, the North Carolina supercomputing center, and was funded by the National Science Foundation to follow in the track of WAIS and to develop prototype systems for distributed information networks encompassing Internet applications, library catalogs and other information resources.

The main features of Isearch include full text and field searching, relevance ranking, Boolean queries, and support for many document types such as HTML, mail folders, list digests, MEDLINE, BibTeX, SGML/XML, FGDC Metadata, NASA DIF, ANZLIC metadata, ISO 19115 metadata and many other resource types and document formats.

It was the first search engine to be designed from the ground up to support SGML and ISO 23950 search and retrieval. Its innovations included the "document type" model, an object-oriented method of associating each document with a class of functions that provides a standard interface for accessing the document. It was one of the first engines (if not the first) to support XML.
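The document type model described above can be illustrated with a minimal sketch. This is not Isearch's own code (Isearch has its own class hierarchy); the names DocType, PlainText, MailMessage, DOCTYPES and index_record are hypothetical and exist only for this illustration. The sketch assumes that each document type is a class exposing the same small interface, so that the indexing code never needs to know which format it is handling.

```python
from abc import ABC, abstractmethod


class DocType(ABC):
    """Hypothetical base class: the standard interface every document type offers."""

    @abstractmethod
    def fields(self, raw):
        """Split a raw document into a dict of named, searchable fields."""

    def headline(self, raw):
        """Short display line for a result list; subclasses may override."""
        stripped = raw.strip()
        return stripped.splitlines()[0] if stripped else ""


class PlainText(DocType):
    """Treats the whole document as a single 'body' field."""

    def fields(self, raw):
        return {"body": raw}


class MailMessage(DocType):
    """Very simplified mail handling: 'Name: value' headers, blank line, body."""

    def fields(self, raw):
        head, _, body = raw.partition("\n\n")
        out = {"body": body}
        for line in head.splitlines():
            name, sep, value = line.partition(":")
            if sep:  # keep only lines that look like headers
                out[name.strip().lower()] = value.strip()
        return out

    def headline(self, raw):
        return self.fields(raw).get("subject", "(no subject)")


# Hypothetical registry associating a doctype name with its handler class.
DOCTYPES = {"text": PlainText, "mail": MailMessage}


def index_record(doctype_name, raw):
    """Format-agnostic caller: it only uses the shared DocType interface."""
    handler = DOCTYPES[doctype_name]()
    return {"headline": handler.headline(raw), "fields": handler.fields(raw)}
```

Under these assumptions, supporting a new format means adding one subclass and one registry entry; the code that calls index_record stays unchanged.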

The Isearch search and indexing algorithms were based on Gaston Gonnet's seminal work on PAT arrays and trees for text retrieval. These ideas were developed for the New Oxford English Dictionary Project at the University of Waterloo and provided the seeds for Tim Bray's PAT SGML engine, which formed the basis of Open Text. One limiting factor of the Isearch design, however, was that it was not well suited to handling the extremely large data sets that became common in the mid to late 1990s. In many cases Isearch was adapted or modified to use different algorithms, but it usually retained the document type model and the architectural relationship with Isite.
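As a rough illustration of the PAT array idea mentioned above, the following sketch indexes a text by the offsets of its word starts, sorted by the text that begins at each offset, and then answers prefix and phrase queries with a binary search. It is a simplified teaching example, not the Isearch implementation: the function names build_pat_array and find are hypothetical, matching is case-sensitive, and the naive sort compares whole suffixes, which is only practical for small inputs.

```python
import re


def build_pat_array(text):
    """Return the offsets of word starts, sorted by the text beginning at each one."""
    starts = [m.start() for m in re.finditer(r"\w+", text)]
    return sorted(starts, key=lambda i: text[i:])


def find(text, pat_array, query):
    """Return sorted offsets where a word-aligned position begins with `query`."""
    n = len(query)

    def prefix(k):
        # First n characters of the k-th suffix in sorted order.
        return text[pat_array[k]:pat_array[k] + n]

    # Binary search for the first suffix whose prefix is >= query.
    lo, hi = 0, len(pat_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < query:
            lo = mid + 1
        else:
            hi = mid

    # Matching suffixes are contiguous in the sorted array; scan to the end of the run.
    end = lo
    while end < len(pat_array) and prefix(end) == query:
        end += 1
    return sorted(pat_array[lo:end])


if __name__ == "__main__":
    sample = "the cat sat on the mat with the other cat"
    pat = build_pat_array(sample)
    print(find(sample, pat, "cat"))
    print(find(sample, pat, "the mat"))
```

Because every word-aligned suffix is indexed, a multi-word phrase is found the same way as a single word. The cost is an index entry for every word position in the collection, so the index grows in step with the text itself.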

Isearch was widely adopted and used in hundreds of public search sites, including many high-profile projects such as the U.S. Patent and Trademark Office (USPTO) patent search, the Federal Geographic Data Clearinghouse (FGDC), the NASA Global Change Master Directory, the NASA EOS Guide System, the NASA Catalog Interoperability Project, the astronomical preprint service based at the Space Telescope Science Institute, the PCT Electronic Gazette at the World Intellectual Property Organization (WIPO), Linsearch (a search engine for open-source software designed by Miles Efron), the SAGE Project of the Special Collections Department at Emory University, Eco Companion Australasia (an environmental geospatial resources catalog), the Australian National Genomic Information Service (ANGIS), the Open Directory Project, and numerous governmental portals in the context of the Government Information Locator Service (GILS) GPO mandate (ended in 2005?).

From 1994 to 1998 most of the development was centered on the Clearinghouse for Networked Information Discovery and Retrieval (CNIDR) in North Carolina (engine core) and BSn in Germany (doctypes). By 1998 many of the open-source Isearch core developers had refocused their efforts on several spin-offs. In 1998 Isearch became part of the Advanced Search Facility reference software platform funded by the U.S. Department of Commerce.

A/WWW Enterprises now maintains the open-source version for public use. This work is supported by paying government clients such as the U.S. Patent and Trademark Office, NASA, and the FGDC, which have funded enhancements to the functionality and reliability of the software. The software suite is considered a reference implementation of catalog service software.

As of 2010, the open-source version of Isearch is still used on 250+ nodes of the FGDC, by ANZLIC in Australia, and by selected Geospatial One-Stop (GOS) contributors to facilitate harvesting by GOS, including NOAA, the Census Bureau and the Tennessee Field Office of the US Fish and Wildlife Service, among others.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 