Web bug
A web bug is an object that is embedded in a web page or e-mail and is usually invisible to the user but allows checking that a user has viewed the page or e-mail. One common use is in e-mail tracking. Alternative names are web beacon, tracking bug, and tag or page tag. Common names for web bugs implemented through an embedded image include tracking pixel, pixel tag, 1×1 gif, and clear gif.

Overview

A web bug is any one of a number of techniques used to track who is reading a web page or e-mail, when, and from what computer. Web bugs can also be used to see whether an e-mail was read or forwarded to someone else, or whether a web page was copied to another website. The first web bugs were small images.

Some e-mails and web pages are not wholly self-contained. They may refer to content on another server, rather than including the content directly. When an e-mail client or web browser prepares such an e-mail or web page for display, it ordinarily sends a request to the server to send the additional content.

These requests typically include the IP address of the requesting computer, the time the content was requested, the type of web browser that made the request, and the existence of cookies previously set by that server. The server can store all of this information, and associate it with a unique tracking token attached to the content request.
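As a rough illustration of the paragraph above, the following Python sketch builds the kind of record a server could keep for one content request. The function name, the record fields, and the `token` query parameter are assumptions made for this example; real servers differ in what they log and how.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs


def record_request(client_ip, request_url, headers, now=None):
    """Build the record a server could store for one content request."""
    now = now or datetime.now(timezone.utc)
    query = parse_qs(urlsplit(request_url).query)
    return {
        "ip": client_ip,                               # address of the requesting computer
        "time": now.isoformat(),                       # when the content was requested
        "user_agent": headers.get("User-Agent", ""),   # type of browser or mail client
        "has_cookie": "Cookie" in headers,             # cookies previously set by this server
        # "token" is a hypothetical parameter name for the unique tracking token.
        "token": query.get("token", [None])[0],
    }
```

Calling `record_request("203.0.113.7", "http://tracker.example/pixel.gif?token=abc123", {"User-Agent": "ExampleBrowser/1.0"})` would yield a record tying that address and time to the token `abc123`.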

On web pages

Web bugs are typically used by third parties to monitor the activity of customers at a site.

As an example of the way web bugs can make user logging easier, consider a company that owns a network of sites. This company may have a network that requires all images to be stored on one host computer while the pages themselves are stored elsewhere. They could use web bugs in order to count and recognize users traveling around the different servers on the network. Rather than gathering statistics and managing cookies on all their servers separately, they can use web bugs to keep them all together.

Tracking on web pages can be disabled using a number of techniques.
  • Turning off a browser's cookies can prevent some web bugs from tracking a customer's specific activity. The web site logs will still record a page request from the customer's IP address, but unique information associated with a cookie cannot be recorded. However, web site server techniques that do not use cookies can be employed to help track a site's cookie-blocking users. For example, a web site can identify a request from a new visitor and send that visitor links that pass a unique ID as a GET parameter.

  • Browser add-ons and extensions can be used. For example, the Ghostery add-on analyzes JavaScript to detect trackers, web bugs, pixels, and beacons.
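The cookie-free technique mentioned in the list above, sending a visitor links that pass a unique ID as a GET parameter, might look roughly like the following Python sketch. The function name and the parameter name `vid` are hypothetical; a site could use any name.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def add_visitor_id(url, visitor_id, param="vid"):
    """Return url with visitor_id appended as a GET parameter."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier copy of the ID parameter.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, visitor_id))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Every link on a page served to the new visitor would be rewritten this way, so that each later request carries the same ID even though no cookie is stored.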

In e-mail

Web bugs are frequently used in spamming (sending unsolicited commercial e-mail) as a way of "pinging" to find out which recipients open (and presumably read) a spam message before deleting it.

Tracking in e-mail can be limited in the following ways:
  • Turning off HTML display and displaying only the text avoids many web bugs.
  • Turning off the display of images while still using HTML blocks image-based bugs, but may still allow other techniques to be used.

Implementation

Originally, a web bug was a small (usually 1×1 pixel) transparent GIF or PNG image (or an image of the same color as the background) that was embedded in an HTML page, usually a page on the web or the content of an e-mail. Modern web bugs also use the HTML IFrame, style, script, input, link, embed, object, and other tags to track usage. Whenever the user opens the page with a graphical browser or e-mail reader, the image or other information is downloaded. This download requires the browser to request the image from the server storing it, allowing the server to take notice of the download. As a result, the organization running the server is informed when the HTML page has been viewed.
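To make the mechanism concrete, here is a minimal sketch of a server endpoint that returns a 1×1 transparent GIF and takes notice of each download. It is written as a plain WSGI application in Python; the names `PIXEL`, `VIEWS`, and `pixel_app` are invented for this example, and keeping the notes in an in-memory list is a simplification.

```python
import base64

# A commonly used 1x1 transparent GIF, stored base64-encoded (42 bytes decoded).
PIXEL = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

VIEWS = []  # in-memory list of noticed downloads (illustrative only)


def pixel_app(environ, start_response):
    """WSGI application: note the request, then send the invisible image."""
    VIEWS.append({
        "ip": environ.get("REMOTE_ADDR"),
        "path": environ.get("PATH_INFO", ""),
        "query": environ.get("QUERY_STRING", ""),
    })
    start_response("200 OK", [
        ("Content-Type", "image/gif"),
        ("Content-Length", str(len(PIXEL))),
        # Discourage caching so that repeat views produce new requests.
        ("Cache-Control", "no-store"),
    ])
    return [PIXEL]
```

For local experimentation the application could be served with the standard library, for example `wsgiref.simple_server.make_server("", 8000, pixel_app).serve_forever()`.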

The image or other content does not have to be invisible: any element requested from a third party can be used for tracking. Typically, advertisements, banners and buttons are fetched from the site to which they are connected, not from the servers hosting the main content. This gives the external site information about visitors to every site that includes these elements on its pages. Companies or organisations whose buttons or images are included on many sites can thus track (part of) the browsing habits of a significant share of web users. Earlier, these were mainly ad-serving or counter-serving companies; nowadays, buttons of social media sites are becoming common.

While web bugs are used in the same way in web pages or e-mails, they have different purposes:
  1. If the bug is embedded in an e-mail, the image is requested when the user reads the e-mail for the first time, and can also be requested every time that the user subsequently loads the e-mail;
  2. Whenever a web page (with or without bugs) is downloaded, the server holding the page knows and can store the IP address of the computer requesting the page; this information can therefore be retrieved from the server log files without the need of using bugs. Bugs are used when the monitoring party does not have easy access to the logs of the main web server. This may be, for example, because the web site owner does not control the web servers (web hotels) or because monitoring is done by a third party.


As with all files transferred using the Hypertext Transfer Protocol, web bugs are requested by sending the server their URL, and possibly the URL of the page containing them. Both URLs contain information that can be useful for the server:
  1. The URL of the page containing the bug allows the server to determine which particular web page the user has accessed;
  2. An arbitrary string can be appended to the URL of the bug in various ways while still identifying the same object; this extra information can be used to better identify the conditions under which the bug has been loaded, and it can be added while sending the page or by JavaScript scripts after the download.
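The two sources of information listed above can be separated with a few lines of Python. This is only a sketch: the function name and the returned field names are made up for the example, and it assumes the containing page's URL arrives in the HTTP Referer header, which clients may omit.

```python
from urllib.parse import urlsplit, parse_qsl


def describe_hit(bug_url, referer=None):
    """Split a bug request into the object fetched, its extra data, and the page."""
    parts = urlsplit(bug_url)
    return {
        "object": parts.path,                   # same file whatever the query string says
        "extra": dict(parse_qsl(parts.query)),  # the arbitrary appended information
        "page": referer,                        # URL of the containing page, if sent
    }
```

For a request to the hypothetical URL `http://tracker.example/bug.gif?campaign=spring&slot=3` made from `http://shop.example/cart`, the object is `/bug.gif`, the extra data is the campaign and slot, and the page is the cart page.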


For example, an e-mail sent to the address somebody@example.org can contain the embedded image of URL http://example.com/bug.gif?somebody@example.org. Whenever the user reads the e-mail, the image at this URL is requested. The part of the URL after the question mark is ignored by the server for the purpose of determining which file to send, but the complete URL is stored in the server's log file. As a result, the file bug.gif is sent and shown in the e-mail reader; at the same time, the server stores the fact that the particular e-mail sent to somebody@example.org has been read. Using this system, a spammer or e-mail marketer can send similar e-mails to a large number of addresses to check which ones are valid and read by the users.
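Continuing the bug.gif example, the sender could recover the list of addresses that opened the message by scanning the server's log. The Python sketch below assumes log lines that contain the request in the usual quoted form (`"GET /bug.gif?... HTTP/1.1"`); the exact log layout and the function name are assumptions, not something the example above specifies.

```python
import re
from urllib.parse import unquote

# Matches the quoted request line for bug.gif and captures the text after "?".
REQUEST = re.compile(r'"GET /bug\.gif\?(\S+) HTTP/[\d.]+"')


def addresses_that_opened(log_lines):
    """Return the set of e-mail addresses found in bug.gif requests."""
    opened = set()
    for line in log_lines:
        match = REQUEST.search(line)
        if match:
            # The address may be percent-encoded in the URL (for example %40 for @).
            opened.add(unquote(match.group(1)))
    return opened
```

Addresses that never appear in the result were either invalid, unread, or read in a client that did not fetch the image.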

Web bugs can be used in combination with HTTP cookies like any other object transferred using HTTP.
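A sketch of how a cookie might accompany a web bug response follows, using Python's standard `http.cookies` module. The cookie name `visitor` and the helper name are hypothetical: if the request already carries the cookie, its value identifies a returning visitor; otherwise a fresh identifier is issued with the image.

```python
import uuid
from http.cookies import SimpleCookie


def cookie_headers_for(request_cookie_header, new_id=None):
    """Return (visitor_id, extra_response_headers) for one bug request."""
    jar = SimpleCookie()
    jar.load(request_cookie_header or "")
    if "visitor" in jar:
        # Returning visitor: reuse the identifier, nothing new to set.
        return jar["visitor"].value, []
    visitor_id = new_id or uuid.uuid4().hex
    out = SimpleCookie()
    out["visitor"] = visitor_id
    out["visitor"]["path"] = "/"
    return visitor_id, [("Set-Cookie", out["visitor"].OutputString())]
```

The returned header list would be added to the response that delivers the bug, so that later requests for any bug on the same server carry the identifier back.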

E-mail web bugs

Web bugs embedded in e-mails have greater privacy implications than bugs embedded in web pages. Through the use of unique identifiers contained in the URL of the web bugs, the sender of an e-mail containing a web bug is able to record the exact time that a message was read, as well as the IP address of the computer used to read the mail or the proxy server that the user went through. In this way, the sender can gather detailed information about when and where each particular recipient reads e-mail. Each subsequent display of the e-mail message can also send information back to the sender.

Web bugs are used by e-mail marketers, spammers, and phishers to verify that e-mail addresses are valid, that the content of e-mails has made it past the spam filters, and that the e-mail is actually viewed by users. When the user reads the e-mail, the e-mail client requests the image, letting the sender know that the e-mail address is valid and that e-mail was viewed. The e-mail need not contain an advertisement or anything else related to the commercial activity of the sender. This makes detection of such e-mails harder for mail filters and users.

Tracking via web bugs can be prevented by using e-mail clients that do not download images whose URLs are embedded in HTML e-mails. Many graphical e-mail clients can be configured to avoid accessing remote images. Examples include the Gmail, Yahoo!, and SpamCop/Horde webmail clients, and the Mozilla Thunderbird, Opera, Pegasus Mail, IncrediMail, later versions of Microsoft Outlook, and KMail mail readers. Other HTML techniques (such as IFrames) can still be used to track e-mail viewing.

Text-based mail readers (such as Pine or Mutt) and graphical e-mail clients with purely text-based HTML capabilities (such as Mulberry) do not interpret HTML or display images, so their users are not subject to tracking by e-mail web bugs. Plain-text e-mail messages cannot contain web bugs because their contents are interpreted as display characters instead of embedded HTML code, so opening messages does not initiate communication. Some e-mail clients offer the option to disable all HTML in every message (thus rendering all messages as plain text), which prevents any web bugs from loading.

Many modern e-mail readers and web-based e-mail services will not load images when opening an HTML e-mail that comes from an unknown sender or that is suspected to be spam. The user must explicitly choose to load images. Web bugs can also be filtered out at the server level so that they never reach the end user. MailScanner is an example of gateway software that can disarm IFrames as well as web bugs. Momentarily disabling a computer's Internet connection before reading new e-mails and deleting those messages suspected of containing web bugs may also eliminate the threat.
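As an illustration of what a filter has to look for, the following Python sketch lists the remote resources that an HTML message body would cause a client to fetch. It is not how MailScanner or any particular mail client works; the tag and attribute lists are assumptions chosen for the example and are not exhaustive (for instance, URLs inside style sheets are ignored).

```python
from html.parser import HTMLParser

# Tags and attributes that can trigger a remote fetch (illustrative, not exhaustive).
WATCHED_TAGS = {"img", "iframe", "script", "embed", "object", "link", "input"}
URL_ATTRS = {"src", "href", "data", "background"}
REMOTE_PREFIXES = ("http://", "https://", "//")


class RemoteReferenceFinder(HTMLParser):
    """Collect (tag, url) pairs for remote resources in an HTML body."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag not in WATCHED_TAGS:
            return
        for name, value in attrs:
            if name in URL_ATTRS and value and value.lower().startswith(REMOTE_PREFIXES):
                self.found.append((tag, value))


def remote_references(html_body):
    """Return every remote reference that displaying html_body could request."""
    finder = RemoteReferenceFinder()
    finder.feed(html_body)
    finder.close()
    return finder.found
```

A gateway or client could refuse to fetch, or strip, every reference this returns; ordinary hyperlinks are left alone because they are only requested when the user clicks them.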

A hosts file or a filtering web proxy can be used to specify that some servers are never to be contacted for any reason. This file must be continually updated to reflect the fact that new tracking servers are periodically brought online, and old ones repurposed to serve legitimate content.
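A hosts-file blocklist conventionally maps unwanted hostnames to an address that leads nowhere, such as 0.0.0.0 or 127.0.0.1. The Python sketch below reads such a file's text and reports which hostnames it blocks; the choice of sink addresses, the function name, and the sample hostnames are assumptions made for the example.

```python
# Addresses commonly used to send blocked names nowhere (an assumption of this sketch).
SINK_ADDRESSES = {"0.0.0.0", "127.0.0.1", "::1"}
LOCAL_NAMES = {"localhost", "localhost.localdomain"}


def blocked_hosts(hosts_text):
    """Return the hostnames that the given hosts-file text maps to a sink address."""
    blocked = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        if address in SINK_ADDRESSES:
            blocked.update(n.lower() for n in names if n.lower() not in LOCAL_NAMES)
    return blocked
```

Because the list is static, a name such as the hypothetical `tracker.example` stays blocked only until the file is next edited, which is why such lists need regular maintenance.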

Because web bugs depend on the e-mail software fetching remote content, they have never been able to count read rates for e-mail campaigns accurately. As the measures described above become more widespread, they may become less effective still.

Disposition-Notification-To email headers may be seen as another form of web bug. See RFC 4021.

See also

  • Facebook Beacon
  • E-mail fraud
  • Internet privacy
  • Web visitor tracking
  • Web analytics


The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 