LanguageWare
LanguageWare is a natural language processing (NLP) technology developed by IBM that allows applications to process natural language text. It comprises a set of Java libraries that provide a range of NLP functions: language identification, text segmentation/tokenization, normalization, entity and relationship extraction, and semantic analysis and disambiguation. The analysis engine uses a finite-state machine approach at multiple levels, which aids its performance while maintaining a reasonably small footprint.
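
As a rough illustration of finite-state text segmentation in general, the sketch below tokenizes a string with a three-state automaton. It is not LanguageWare code: the class name FsmTokenizer, its states, and the tokenize method are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of finite-state tokenization; not the LanguageWare API.
class FsmTokenizer {
    // START: between tokens; WORD: inside a run of letters; NUMBER: inside a run of digits.
    enum State { START, WORD, NUMBER }

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        State state = State.START;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);

            // The transition depends only on the class of the input character.
            State next;
            if (Character.isLetter(c)) {
                next = State.WORD;
            } else if (Character.isDigit(c)) {
                next = State.NUMBER;
            } else {
                next = State.START;
            }

            // Leaving WORD or NUMBER for a different state closes the current token.
            if (state != State.START && next != state) {
                tokens.add(current.toString());
                current.setLength(0);
            }

            if (next == State.START) {
                // Whitespace is dropped; any other symbol becomes a one-character token.
                if (!Character.isWhitespace(c)) {
                    tokens.add(String.valueOf(c));
                }
            } else {
                current.append(c);
            }
            state = next;
        }

        // Flush a token that runs to the end of the input.
        if (state != State.START) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Analyze 42 words, fast."));
    }
}
```

Each character is examined once and the automaton keeps only its current state and a small buffer, which is the general reason finite-state techniques tend to combine speed with a small memory footprint.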

The behaviour of the system is driven by a set of configurable lexico-semantic resources that describe the characteristics and domain of the processed language. A default set of resources ships with LanguageWare and describes native language characteristics, such as morphology, and the basic vocabulary of the language. Supplemental resources capture additional vocabularies, terminologies, rules and grammars, which may be generic to the language or specific to one or more domains.
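
One way to picture default and supplemental resources working together is a layered lookup. The sketch below is an assumption-laden illustration, not LanguageWare's resource format or API: the class LayeredLexicon, its methods, and the rule that a later (supplemental) layer overrides an earlier (default) one are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of layered lexical resources; not the LanguageWare API.
class LayeredLexicon {
    // Each layer maps a lower-cased term to a category label.
    private final List<Map<String, String>> layers = new ArrayList<>();

    // Assumption for this sketch: layers added later take precedence over earlier ones.
    void addLayer(Map<String, String> entries) {
        layers.add(entries);
    }

    Optional<String> lookup(String term) {
        String key = term.toLowerCase(Locale.ROOT);
        // Search from the most recently added layer back to the default layer.
        for (int i = layers.size() - 1; i >= 0; i--) {
            String category = layers.get(i).get(key);
            if (category != null) {
                return Optional.of(category);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        LayeredLexicon lexicon = new LayeredLexicon();
        lexicon.addLayer(Map.of("aspirin", "noun", "run", "verb")); // default vocabulary
        lexicon.addLayer(Map.of("aspirin", "drug"));                // domain-specific supplement
        System.out.println(lexicon.lookup("Aspirin"));
        System.out.println(lexicon.lookup("run"));
    }
}
```

In this sketch a domain layer can refine an entry from the general vocabulary without modifying the default layer, while terms it does not mention fall through to the default.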

Eclipse-based customization tooling, the LanguageWare Resource Workbench, is available on IBM's alphaWorks site; it allows domain knowledge to be compiled into these resources and thereby incorporated into the analysis process.

LanguageWare can be deployed as a set of UIMA-compliant annotators, Eclipse plug-ins or Web Services.

See also

  • UIMA
  • Linguistics
  • Semantics
  • Semantic Web
  • Web services
  • Service-oriented architecture
  • Formal language
  • Finite state machine
  • IBM Omnifind
  • Data Discovery and Query Builder


The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 