Page hijacking
Page hijacking is a form of search engine index spamming. It is achieved by creating a rogue copy of a popular website that shows a web crawler content similar to the original but redirects web surfers to unrelated or malicious websites. Spammers can use this technique to achieve high rankings in result pages for certain keywords.
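The crawler-versus-visitor split described above can be illustrated with a small, network-free sketch. It assumes the rogue page decides what to serve from the User-Agent header, which is one common way of cloaking; the marker strings, the `Response` type, and the functions `rogue_site`, `honest_site`, and `looks_cloaked` are all hypothetical names introduced for illustration, and `competitor.example` is a placeholder address.

```python
from dataclasses import dataclass

# Assumption: crawlers are recognised by these substrings in the User-Agent.
CRAWLER_MARKERS = ("bot", "crawler", "spider")


@dataclass
class Response:
    status: int
    body: str = ""
    location: str = ""


def is_crawler(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(marker in ua for marker in CRAWLER_MARKERS)


def rogue_site(user_agent: str) -> Response:
    """Toy model of a hijacking page: a copy for crawlers, a redirect for people."""
    if is_crawler(user_agent):
        return Response(200, body="Offering clothes in sizes you cannot find elsewhere.")
    return Response(302, location="http://competitor.example/")


def honest_site(user_agent: str) -> Response:
    """Toy model of a normal page: the same answer for every client."""
    return Response(200, body="Offering clothes in sizes you cannot find elsewhere.")


def looks_cloaked(site) -> bool:
    """Flag a site whose crawler view and browser view differ."""
    as_crawler = site("ExampleBot/1.0")
    as_browser = site("Mozilla/5.0 (X11; Linux x86_64)")
    return as_crawler != as_browser
```

Comparing the two views is also how such a page might be spotted: `looks_cloaked(rogue_site)` is true while `looks_cloaked(honest_site)` is false. A real check would have to fetch pages over the network and tolerate legitimate differences between responses, which this sketch does not attempt.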
Page hijacking is a form of cloaking, made possible because some web crawlers detect duplicates while indexing web pages. If two pages have the same content, only one of the URLs will be kept. A spammer will try to ensure that the rogue website is the one shown on the result pages.
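The duplicate handling described above can be sketched as follows. This is a minimal illustration, not how any particular search engine works: it treats pages as duplicates only when their normalized text hashes to the same value, and it assumes the kept URL is simply the one with the highest value in a hypothetical `score` table. The function names `fingerprint` and `deduplicate` and the score values are made up for the example.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash of the page text after collapsing case and whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(pages: dict[str, str], score: dict[str, float]) -> dict[str, list[str]]:
    """Map each kept URL to the duplicate URLs folded under it."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    kept = {}
    for urls in groups.values():
        # Assumption: the highest-scoring URL wins; ties go to the URL that sorts first.
        ranked = sorted(urls, key=lambda u: (-score.get(u, 0.0), u))
        kept[ranked[0]] = ranked[1:]
    return kept


# Two URLs whose crawler-visible text is the same apart from case and spacing.
pages = {
    "www.example.com/": "Offering clothes in sizes you cannot find elsewhere.",
    "www.example.net/": "Offering  clothes in sizes you cannot find ELSEWHERE.",
}
score = {"www.example.com/": 0.6, "www.example.net/": 0.9}  # hypothetical values
result = deduplicate(pages, score)
```

With these made-up scores, `result` keeps `www.example.net/` and folds `www.example.com/` under it. That is the outcome a spammer aims for: once the rogue copy outranks the original within a duplicate group, the original URL is the one that gets dropped.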
In some cases, legitimate web pages can be altered by external advertisers through cross-site scripting (XSS) so that visitors are redirected to the website being promoted.
Example of page hijacking
Suppose that a website offers difficult-to-find sizes of clothes. A common search used to reach this website is "really big t-shirts", which, when entered on popular search engines, returns this website as the first result:
- SpecialClothes
- Offering clothes in sizes you cannot find elsewhere.
- www.example.com/
A spammer working for a competing company then creates a website that, when visited by a web crawler, looks extremely similar to the one listed. However, it includes a redirection script that sends regular web surfers to the competitor's site. After several weeks, a web search for "really big t-shirts" shows the following result:
- SpecialClothes
- Offering clothes in sizes you cannot find elsewhere... at better prices!
- www.example.net/
- —Show Similar Pages—
Notice how .com changed to .net, as well as the new "Show Similar Pages" link.
When web surfers click on this result, they are redirected to the competing website. The original result is now hidden in the "Show Similar Pages" section.
See also
- Google bomb
- Homepage hijacking
- Link farm
- Mousetrapping
- Spamdexing
- TrustRank
External links
- AIRWeb '05: First Workshop on Adversarial Information Retrieval on the Web - Research on search engine spamming