Open Market For Internet Content Accessibility
Open Market For Internet Content Accessibility (OMFICA) is a non-profit organization with the mission of developing a competitive market for web search. OMFICA has created a data repository of World Wide Web content and set up its governance based on democratic principles. The data repository is updated by distributed crawlers, which use the free resources of volunteers' PCs.
The World Wide Web is a huge source of information generated by the people who use it. However, this data is scattered chaotically across the web. For the data to be available to Web users, it has to be collected and centralized in storage. Centralization itself carries a threat of monopolization of the information, and once information is monopolized, it stops being openly available to the public. A remedy would be to centralize the data while keeping access to it democratic: the data is centralized but not monopolized, and it remains available to and controllable by all Web users interested in it. This is the idea behind OMFICA.
Start-up
Incorporated in February 2008, OMFICA began its activities by participating in major Web events, such as the Web 2.0 Expo in San Francisco (April 22-25, 2008), and by joining the World Wide Web Consortium (May 24, 2008).
Activities
OMFICA implements a strategy of developing its own technologies, or deploying third-party ones, to create the OMFICA Data Repository, which integrates data about publicly available websites’ structure, pages’ actual content, visit statistics, and the results of pages’ semantic analysis.
The activities carried out by OMFICA can be logically separated into the following subgroups:
- Continuously expanding the Website Parse Templates Repository and keeping it up to date.
- Managing OMFICA’s distributed web crawling process and storing parsed web page data in the OMFICA Data Repository.
- Storing and processing website page visit statistics.
- Generating digital library snapshots and daily updates as files available for FTP download.
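The crawling and update activities listed above can be illustrated with a minimal Python sketch. It is not OMFICA's implementation: the function names, the host-hashing scheme for splitting work among volunteer crawlers, and the dictionary layout of a snapshot are all assumptions made for illustration only.

```python
import hashlib
from urllib.parse import urlsplit


def assign_crawler(url: str, crawler_count: int) -> int:
    """Pick a volunteer crawler for a URL by hashing its host name.

    Hypothetical scheme: all pages of one site go to the same crawler.
    """
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % crawler_count


def apply_daily_update(snapshot: dict, update: dict) -> dict:
    """Return a new snapshot with one day's changes applied.

    Assumed layout: both arguments map a page URL to its parsed data.
    A value of None in the update marks the page as removed.
    """
    merged = dict(snapshot)  # leave the original snapshot untouched
    for url, parsed in update.items():
        if parsed is None:
            merged.pop(url, None)
        else:
            merged[url] = parsed
    return merged
```

A consumer who has downloaded a snapshot and a series of daily update files would, under these assumptions, call `apply_daily_update` once per update file, in date order, to bring a local copy up to date.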
OMFICA makes it possible for Web users, Web companies, webmasters, website publishers and text analysis service providers to get involved in the creation of an integrated Global World Wide Web Intelligent Data Repository.
Company Structure
OMFICA is governed by its members: Internet users and companies. OMFICA’s activities are carried out by its four committees: the Business Committee, the Trustees Committee, the Technical Committee, and the Content Committee. Each of these committees delegates three Directors, elected online from its members, to the Board of Directors, OMFICA's key administrative body.
See also
- Search engine
- ICDL crawling
- Website Parse Template
- Web indexing
- Web crawler