CLEVER project
The CLEVER project was a research project in Web search led by Jon Kleinberg at IBM's Almaden Research Center. Techniques developed in CLEVER included various forms of link analysis, including the HITS algorithm.
The CLEVER search engine incorporates several algorithms that use the Web's hyperlink structure to discover high-quality information. It can be exceedingly difficult to locate resources on the World Wide Web that are both high-quality and relevant to a user's informational needs. Traditional automated search methods for locating information on the Web are easily overwhelmed by low-quality and unrelated content. Second-generation search engines need effective methods for focusing on the most authoritative documents. The rich structure implicit in hyperlinks among Web documents offers a simple and effective means of dealing with many of these problems.
Members of the CLEVER project developed a mathematical algorithm that views the Web simply as pages pointing at one another. It also takes into account the notion of hubs, which point to quality content and link information together, and the idea of authority pages, which are often written by specialists in their fields.
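The hub and authority idea can be illustrated with a minimal sketch of the basic HITS iteration, the link-analysis technique named above. This is an illustration under stated assumptions, not the CLEVER engine's actual implementation: the function names `hits` and `normalize`, the fixed iteration count, and the toy link graph are all hypothetical. In each round, a page's authority score becomes the sum of the hub scores of the pages linking to it, and a page's hub score becomes the sum of the authority scores of the pages it links to; both score vectors are then rescaled to unit length.

```python
from math import sqrt


def normalize(scores):
    """Rescale scores so that the sum of their squares is 1."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Basic HITS sketch. `links` maps each page to the pages it links to."""
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages pointing at each page.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub update: sum the (new) authority scores of the pages each page points to.
        new_hub = {p: sum(new_auth[dst] for dst in links.get(p, ())) for p in pages}

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Hypothetical toy graph: three link collections pointing at three content pages.
example_links = {
    "hub1": ["a", "b"],
    "hub2": ["a", "b", "c"],
    "hub3": ["a"],
}

hub, auth = hits(example_links)
best_hub = max(hub, key=hub.get)
best_authority = max(auth, key=auth.get)
```

In this toy graph the page linked from every collection receives the highest authority score, and the collection that links to the most authoritative pages receives the highest hub score. A real system would first have to choose which pages to include in the graph for a given query, a step this sketch leaves out.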
Bill Cody, senior manager of exploratory data management research at IBM's Almaden Research Center, said: "Web searches provide a lot of information, some good, some bad. But the people providing good hubs usually point to authority pages and authority pages generally know of good hubs. The algorithm enables us to find them and so provide users with quality information rather than the regular list of irrelevant web pages."
He added that the algorithm had also been used to find Internet-based communities, using the same principle of finding links between like and like.
Some 18 months ago, it was used to discover 300,000 communities worldwide, only four per cent of which turned out to be spurious. About two thirds of these still existed, he claimed, with about half now appearing on Yahoo as mature communities.
Cody added that such a tool could potentially be used for targeted advertising or for enabling users to find out more about incipient communities, but declined to say whether IBM had plans to turn CLEVER into a commercial product.