Enterprise Information Integration
Enterprise Information Integration (EII) is a process of information integration that uses data abstraction to provide a unified interface (known as uniform data access) for viewing all the data within an organization, and a single set of structures and naming conventions (known as uniform information representation) to represent this data. The goal of EII is to get a large set of heterogeneous data sources to appear to a user or system as a single, homogeneous data source.

Overview

Data within an enterprise can be stored in various formats, including relational databases (which themselves come in a large number of varieties), text files, XML files, spreadsheets and a variety of proprietary storage methods, each with its own indexing and data access methods.
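
As a minimal sketch of what this heterogeneity means in practice, the Python fragment below reads customer records from three differently shaped stores (a relational table, a CSV text file and an XML document) and exposes them under one set of field names. The sample data, the field names and every function name are hypothetical and chosen only for illustration; a real EII product would do far more (query pushdown, security, caching).

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for two file-based enterprise stores.
CSV_TEXT = "cust_id,full_name\n2,Grace Hopper\n"
XML_TEXT = "<customers><customer id='3'><name>Edgar Codd</name></customer></customers>"


def from_sqlite(conn):
    """Yield unified records from a relational table."""
    for row in conn.execute("SELECT id, name FROM customer ORDER BY id"):
        yield {"customer_id": int(row[0]), "name": row[1], "source": "sqlite"}


def from_csv(text):
    """Yield unified records from CSV text with its own column names."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"customer_id": int(row["cust_id"]), "name": row["full_name"], "source": "csv"}


def from_xml(text):
    """Yield unified records from an XML document."""
    for node in ET.fromstring(text).findall("customer"):
        yield {"customer_id": int(node.get("id")), "name": node.findtext("name"), "source": "xml"}


def unified_customers(conn, csv_text, xml_text):
    """Present all three stores as one list with a single naming convention."""
    records = []
    records.extend(from_sqlite(conn))
    records.extend(from_csv(csv_text))
    records.extend(from_xml(xml_text))
    return sorted(records, key=lambda r: r["customer_id"])


def demo():
    # An in-memory SQLite database plays the role of the relational source.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customer VALUES (1, 'Ada Lovelace')")
    try:
        return unified_customers(conn, CSV_TEXT, XML_TEXT)
    finally:
        conn.close()
```

Calling `demo()` returns one list of dictionaries with the keys `customer_id`, `name` and `source`, regardless of where each record was stored.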

Standardized data access APIs have emerged that offer a specific set of commands to retrieve and modify data from a generic data source. Many applications exist that implement these APIs' commands across various data sources, most notably relational databases. Such APIs include ODBC, JDBC, OLE DB, and more recently ADO.NET.
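
The APIs listed above are not shown here. As a loosely analogous illustration, Python's DB-API 2.0 (PEP 249) plays a similar role for Python programs: drivers expose the same `connect`/`cursor`/`execute`/`fetchall` calls. The sketch below uses the standard-library `sqlite3` driver; the `orders` table and the `fetch_all` helper are hypothetical. Note that the placeholder style (`?` here) varies between drivers, so the SQL text is not always portable unchanged.

```python
import sqlite3


def fetch_all(conn, sql, params=()):
    """Run a query through a DB-API connection and return rows as dicts."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        # cursor.description holds one tuple per column; item 0 is the name.
        columns = [d[0] for d in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


def demo():
    # Hypothetical table used only to exercise the helper.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    cur.executemany(
        "INSERT INTO orders (id, total) VALUES (?, ?)",
        [(1, 9.5), (2, 20.0)],
    )
    conn.commit()
    try:
        return fetch_all(
            conn,
            "SELECT id, total FROM orders WHERE total > ? ORDER BY id",
            (10,),
        )
    finally:
        conn.close()
```

Because `fetch_all` touches only the generic cursor interface, the calling code does not need to know which database product sits behind the connection.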

There are also standard formats for representing data within a file that are very important to information integration. The best-known of these is XML, which has emerged as a standard universal representation format. There are also more specific XML "grammars" defined for specific types of data, such as Geography Markup Language for expressing geographical features, and Directory Services Markup Language for holding directory-style information. In addition, non-XML standard formats exist, such as iCalendar for representing calendar information, and vCard for business card information.
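
To illustrate why standard file formats help integration, the sketch below maps a vCard and an XML document to the same contact record. It is a deliberately simplified assumption-laden example: the vCard reader handles only unfolded `KEY:value` lines and discards parameters, the XML element names (`contact`, `name`, `email`) are hypothetical and not taken from any standard grammar, and the helper names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample inputs.
VCARD_TEXT = """BEGIN:VCARD
VERSION:3.0
FN:Ada Lovelace
EMAIL:ada@example.com
END:VCARD
"""

XML_TEXT = """<contact>
  <name>Grace Hopper</name>
  <email>grace@example.com</email>
</contact>"""


def contact_from_vcard(text):
    """Extract the formatted name (FN) and EMAIL from a simple vCard."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        # Drop any parameters, e.g. "EMAIL;TYPE=work" becomes "EMAIL".
        fields[key.split(";", 1)[0].upper()] = value.strip()
    return {"name": fields.get("FN"), "email": fields.get("EMAIL")}


def contact_from_xml(text):
    """Extract the same two fields from the hypothetical XML layout."""
    root = ET.fromstring(text)
    return {"name": root.findtext("name"), "email": root.findtext("email")}
```

Both functions return a dictionary with the keys `name` and `email`, so downstream code can treat contacts from either format identically.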

Enterprise Information Integration (EII) applies data integration commercially. Whatever the theoretical problems of data integration, the private sector is more concerned with making data integration a viable product. EII emphasizes neither correctness nor tractability, but speed and simplicity. An EII industry has emerged, but many professionals believe it does not perform to its full potential. Practitioners cite the following major issues that EII must address for the industry to mature:

simplicity of understanding : Answering queries with views is of interest from a theoretical standpoint, but practitioners have difficulty understanding how to incorporate it as an "enterprise solution". Some developers believe it should be merged with EAI. Others believe it should be incorporated with ETL systems, citing customers' confusion over the differences between the two services.
simplicity of deployment : Even when recognized as a solution to a problem, EII currently takes time to apply and is complex to deploy. People have proposed a variety of schema-less solutions such as "Lean Middleware", but ease-of-use and speed of employment appear inversely proportional to the generality of such systems. Others cite the need for standard data interfaces to speed and simplify the integration process in practice.
handling higher-order information : Even with a functioning information integration system, analysts have difficulty determining whether the sources in the database will satisfy a given application. Answering these kinds of questions about a set of repositories requires semantic information such as metadata and/or ontologies. The few commercial tools that leverage this information remain in their infancy.
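
The last issue, deciding whether the available sources can satisfy an application, can be pictured with a very small metadata catalog. The sketch below is an assumption made for illustration, not a description of any commercial tool: the catalog contents, source names and function names are all hypothetical, and real metadata or ontologies carry much richer semantics than a set of field names.

```python
# Hypothetical catalog: each source is described by the fields it can supply.
CATALOG = {
    "crm_db": {"customer_id", "name", "email"},
    "billing_csv": {"customer_id", "invoice_total"},
    "support_xml": {"customer_id", "ticket_count"},
}


def missing_fields(required, catalog):
    """Return the required fields that no registered source provides."""
    available = set().union(*catalog.values())
    return set(required) - available


def sources_for(field, catalog):
    """Return the names of all sources that provide a given field."""
    return sorted(name for name, fields in catalog.items() if field in fields)
```

An empty result from `missing_fields` means every requested field is described by at least one source; it says nothing about data quality or whether the sources agree with each other.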

Applications

EII products enable loose coupling between homogeneous-data consuming client applications and services and heterogeneous-data stores. Such client applications and services include desktop productivity tools (spreadsheets, word processors, presentation software, etc.), development environments and frameworks (Java EE, .NET, Mono, SOAP or RESTful Web services, etc.), business intelligence (BI), business activity monitoring (BAM) software, enterprise resource planning (ERP), customer relationship management (CRM), business process management (BPM and/or BPEL) software, and web content management (CMS).

Data access technologies

  • ADO.NET
  • JDBC
  • ODBC
  • OLE DB
  • XQuery
  • Service Data Objects (SDO) for Java, C++ and .NET clients and any type of data source

See also

  • Business Intelligence 2.0 (BI 2.0)
  • Data access
  • Data integration
  • Data virtualization
  • Data Warehouse
  • Enterprise Application Integration
  • Enterprise integration
  • Federated database system
  • Resource Description Framework
  • Semantic integration
  • Semantic Web
  • Web 2.0
  • Web services
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 