Recoll
Recoll is a desktop search tool that provides efficient full-text search (from single-word to arbitrarily complex boolean searches) in a friendly GUI, with minimal technical sophistication required and few mandatory external dependencies. It runs under many Unix-like operating systems, and is mostly independent of the desktop environment.

Recoll was designed not to require a permanent daemon. It updates its index at designated intervals (for example through cron tasks). If desired, the indexing task can instead run as a file-system monitoring daemon for real-time index updates.
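The batch mode described above can be illustrated with a minimal Python sketch of one indexing pass, of the kind a cron job could trigger at each interval. This is not Recoll's code: the function name `batch_index_pass`, the in-memory `index` dictionary, and the use of modification times to detect changed files are all assumptions made for the example.

```python
import os


def batch_index_pass(root, index):
    """Run one batch pass over `root` (hypothetical helper, not Recoll's API).

    `index` maps path -> (mtime_ns, text) and stands in for a real on-disk index.
    Returns the list of paths that were (re)indexed during this pass.
    """
    seen = set()
    updated = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            seen.add(path)
            mtime = os.stat(path).st_mtime_ns
            cached = index.get(path)
            if cached is not None and cached[0] == mtime:
                continue  # unchanged since the previous pass, skip it
            with open(path, "r", encoding="utf-8", errors="replace") as fh:
                index[path] = (mtime, fh.read())
            updated.append(path)
    # Forget files that disappeared since the previous pass.
    for path in list(index):
        if path not in seen:
            del index[path]
    return updated
```

Because each pass only touches files whose modification time changed, nothing needs to stay resident between runs; a monitoring daemon would instead react to file-system events as they happen.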

The Recoll document conversion and text extraction architecture makes it extremely easy to write new filters, and many document types are supported.

Features

  • Qt GUI.
  • Xapian backend.
  • Indexes the contents of many document types: text, HTML, e-mail stores of all kinds, OpenOffice, Microsoft Office and Office Open XML, AbiWord, KWord, Gaim, LyX, Scribus, PDF, WordPerfect, PostScript, RTF, TeX, DVI, DjVu, MP3 and other audio file formats, JPEG and other image file formats.
  • Recursively processes embedded documents (e-mail attachments, Zip archives) to arbitrary depths.
  • Powerful query facilities: boolean searches, wildcards, phrases, proximity, and filtering by file type and directory tree. Includes a GUI tool for building boolean searches.
  • Xesam query language support.
  • Word stemming is performed at query time (the stemming language can be switched after indexing).
  • Multiple indexes selectable at query time (e.g. personal and system indexes).
  • Natively based on Unicode. Supports many languages and input character sets, including good support for East Asian texts (CJK).
  • MD5 document hashes for the elimination of duplicates in result lists.
  • Batch and real-time indexing modes.
  • Python API.
  • Kicker (KDE) applet for easy launching.
  • Easy installation. No database daemon, web server or exotic language necessary.
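
As an illustration of the duplicate elimination mentioned in the feature list, the following Python sketch removes results whose content has the same MD5 digest. It is only a sketch under stated assumptions: the function name `dedupe_results` and the `(path, text)` result shape are hypothetical, and the digest is computed on the fly here, whereas an indexer would more plausibly store it when the document is indexed.

```python
import hashlib


def dedupe_results(results):
    """Keep the first result for each distinct MD5 digest of the document text.

    `results` is a list of (path, text) pairs, a hypothetical result shape
    chosen for this example.
    """
    seen = set()
    unique = []
    for path, text in results:
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # identical content already listed under another path
        seen.add(digest)
        unique.append((path, text))
    return unique
```

Two copies of the same file stored in different directories would thus appear only once in the result list, under the path that was encountered first.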

See also

  • Desktop search
  • List of desktop search engines
  • Xesam

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.