Recoll
Recoll is a desktop search tool that provides efficient full text search (from single-word to arbitrarily complex boolean searches) in a friendly GUI, with minimum technical sophistication and few mandatory external dependencies. It runs under many Unix-like operating systems, and is mostly independent of the desktop environment.
Recoll was designed not to require a permanent daemon. It updates its index at designated intervals (for example through Cron tasks). If desired, the indexing task can instead run as a file-system monitoring daemon for real-time index updates.
The Recoll document conversion and text extraction architecture makes it extremely easy to write new filters, and many document types are supported.
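As a rough illustration of what a simple filter could look like, the sketch below wraps extracted plain text in a minimal HTML document. It assumes a filter's job is to turn a source file into an HTML rendering that the indexer can read. That interface is an assumption made for this example and is not stated in the text above. The names `to_html` and `convert_file` are hypothetical.

```python
import html


def to_html(text, title=""):
    """Wrap extracted plain text in a minimal HTML document.

    Hypothetical helper: the exact output format a real filter must
    produce is an assumption here.
    """
    return (
        "<html><head>"
        "<title>" + html.escape(title) + "</title>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        "</head><body><pre>"
        + html.escape(text)  # escape <, > and & so the text stays literal
        + "</pre></body></html>"
    )


def convert_file(path):
    """Read a text file and return its HTML rendering (hypothetical)."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return to_html(fh.read(), title=path)
```

A filter built this way would typically write the result of `convert_file` to standard output. Supporting a new document type would then mostly mean replacing the plain `read()` with a format-specific text extraction step.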
Features
- Qt GUI.
- Xapian backend.
- Indexes the contents of many document types: text, HTML, e-mail stores of all kinds, OpenOffice, Microsoft Office and Office Open XML, AbiWord, KWord, Gaim, LyX, Scribus, PDF, WordPerfect, PostScript, RTF, TeX, DVI, DjVu, MP3 and other audio file formats, JPEG and other image file formats.
- Recursively processes embedded documents (e-mail attachments, Zip archives) to arbitrary depths.
- Powerful query facilities, with boolean searches, wildcards, phrases, proximity, filter on file types and directory tree. GUI Boolean search build tool.
- Xesam query language support.
- Word stemming is performed at query time (can switch stemming language after indexing).
- Multiple indexes selectable at query time (e.g. personal + system indexes).
- Natively based on Unicode. Supports many languages and input character sets, including good support for East Asian texts (CJK).
- MD5 document hashes for the elimination of duplicates in result lists.
- Batch and real-time indexing modes.
- Python API.
- Kicker (KDE) applet for easy launching.
- Easy installation. No database daemon, web server or exotic language necessary.
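The duplicate elimination mentioned in the list above can be pictured with a short sketch. This is not Recoll's own code, only a minimal illustration of the general idea of using MD5 content hashes to drop repeated documents from a result list. The function name `dedupe_results` and the `(path, content)` result shape are assumptions made for the example.

```python
import hashlib


def dedupe_results(results):
    """Keep the first result for each distinct document content.

    `results` is an iterable of (path, content_bytes) pairs; this shape
    is hypothetical and chosen only to keep the example self-contained.
    """
    seen = set()
    unique_paths = []
    for path, content in results:
        digest = hashlib.md5(content).hexdigest()
        if digest in seen:
            continue  # same bytes already listed under another path
        seen.add(digest)
        unique_paths.append(path)
    return unique_paths
```

In practice a hash would more plausibly be computed once at indexing time and stored with the document, so that filtering a result list only requires comparing stored digests.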
See also
- Desktop search
- List of desktop search engines
- Xesam