Forum spam
Forum spam is the creation of messages that are advertisements, abusive, or otherwise unwanted on Internet forums. It is generally done by automated spambots.

Types of spam

Forum spambots surf the web, looking for guestbooks, wikis, blogs, forums and any other web forms to submit spam links to. These spambots often use OCR technology to bypass any CAPTCHAs present. Some spam messages are targeted towards readers and can involve techniques of target marketing or even phishing, making it hard to tell real posts from the bot-generated ones. Not all of the spam posts are meant for the readers; some spam messages are simply hyperlinks intended to boost search engine ranking.

Most forum spam consists of links to external sites, with the dual goals of increasing search engine visibility in highly competitive areas such as weight loss, pharmaceuticals, gambling, pornography, real estate or loans, and generating more traffic for these commercial websites. Some of these links contain code to track the spambot's identity if a sale goes through, when the spammer behind the spambot works on commission.

Spam posts may contain anything from a single link to dozens of links. Text content is minimal, usually innocuous and unrelated to the forum's topic, and is sometimes posted in a very old thread that the spammer revives solely to spam links. Some text is included so that the post is not caught by automated spam filters that reject posts consisting solely of external links. Full banner advertisements have also been reported.

Alternatively, the spam links are posted in the user's signature, in which case the spambot will never post. The link sits quietly in the signature field, where it is more likely to be harvested by search engine spiders than discovered by forum administrators and moderators.
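The kind of filter mentioned above, one that rejects posts consisting solely of external links, might look roughly like the following Python sketch. The function name `is_link_only` and the URL pattern are illustrative assumptions and are not taken from any particular forum package.

```python
import re

# Assumption: a link is anything starting with http:// or https://
# and running up to the next whitespace character.
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def is_link_only(post: str) -> bool:
    """Return True if the post contains at least one link and no other words."""
    if not URL_RE.search(post):
        return False
    # Strip every link and check whether any word characters remain.
    remainder = URL_RE.sub("", post)
    return re.search(r"\w", remainder) is None
```

A single phrase of filler text is enough to get past a check like this, which is why spam posts usually carry a little innocuous text alongside their links.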

Since November 2006, a very destructive forum and wiki spam attack has been propagated by inserting redirect domains into comments with an automated posting script such as XRumer. These domains redirect a user to pornographic websites. If a user clicks on the image or attempts to close the website, a supposed ActiveX codec is downloaded, which is in fact a Zlob Trojan. The spambot can often bypass many of the safeguards administrators use to reduce the amount of spam posted.

Effects of spam

Spam prevention and deletion measurably increase the workload of forum administrators and moderators. The time and resources spent keeping a forum spam-free contribute significantly to the labor cost and skill required to run a public forum. Marginally profitable or smaller forums may be permanently closed by their administrators as a result.

Spam prevention

  • Flood control: This forces users to wait for a short interval between making posts to the forum, thus preventing spambots from flooding the forum with repeated spam messages.
  • Registration control:
    • Some forums employ CAPTCHA (visual confirmation) routines on their registration pages to prevent spambots from carrying out automated registrations. Simple CAPTCHA systems which display alphanumeric characters have proven vulnerable to optical character recognition software, but those that scramble the characters appear to be far more effective.
    • An alternative is textual confirmation, where the user answers one or more random questions to prove that they are not a spambot.
    • Some forums send an e-mail to newly registered users, containing either the password used to log in or an activation code or link.
    • Some forums require registration approval, where an administrator has to approve each new account.
  • Authoritative voice: Using an external filtering service, such as Akismet, to get a verdict on whether the data is spam or not.
  • Posting limits: Limit posting to registered users and/or require that the user pass a CAPTCHA test before posting.
  • Registration restrictions: Applying careful restrictions can seriously reduce bogus and spambot registrations. One approach is to deny registration from certain domain extensions that are a major source of spambots, such as .ru, .br, .biz, or from free web-based e-mail addresses such as "gawab.com". Another, more labor-intensive approach is manual examination of new registrants, which looks at several indicators. First, spambots often delay email confirmation by several hours, while humans will confirm promptly. Second, spambots tend to create user names that are unique and unlikely to already be used in the forum, preferring "John84731" or "JohnbassKeepsie" to the much more common "John." Third, a search engine query for the spambot's login name often turns up hundreds, if not thousands, of profiles using it, sometimes with the diagnostic spam post or a "banned" label.
  • Changing technical details of the forum software to confuse bots, for example changing "agreed=true" to "mode=agreed" in the registration page of phpBB.
  • Block posts or registrations that contain certain blacklisted words.
  • Be wary of IPs used by untrusted posters (anonymous posts or newly registered users). A useful technique for proactive detection of well-known spammer proxies is to query a search engine for the IP address; a known proxy will show up on pages that specialize in listing proxies.
  • Some forums also have their own "spam subforums" to direct spam off their main site.
  • Some forums have the signature option disabled.
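
As an illustration of the first measure in the list above, flood control can be sketched in a few lines of Python. This is a minimal in-memory version under stated assumptions: the class name `FloodControl`, the 30-second default interval and the per-user dictionary are all hypothetical, and a real forum would normally keep this state in its database or cache.

```python
import time
from typing import Callable, Dict


class FloodControl:
    """Reject a post if the same user posted less than `min_interval` seconds ago."""

    def __init__(self, min_interval: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval = min_interval
        self.clock = clock  # injectable so the check can be tested without sleeping
        self._last_post: Dict[str, float] = {}

    def allow_post(self, user_id: str) -> bool:
        now = self.clock()
        last = self._last_post.get(user_id)
        if last is not None and now - last < self.min_interval:
            # Too soon after the previous accepted post: reject.
            return False
        # Record the time of this accepted post.
        self._last_post[user_id] = now
        return True
```

Only accepted posts reset the timer in this sketch, so a bot that keeps retrying is let through again once the interval has passed since its last successful post; whether rejected attempts should also extend the wait is a design choice left open here.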

Page widening

Page widening is the intentional or accidental act of posting a long string of unbroken characters or a wide image to a forum, increasing the web page's width excessively, to the point where other users cannot read the text without scrolling the screen left and right. Page widening is undertaken by internet trolls who wish to render a page harder to read, and is one of the Slashdot trolling phenomena.

Page widening can be triggered by a wide image, a very long string of characters without breaks, a long line with the specification that the browser should not break it (for instance, use of the HTML tags <pre> or <nobr>), a table with many columns, in particular if columns contain a long word (the minimum width of a column is the width of the longest word in it) or a table where the HTML specifies a large width.

Although some forums detect and prevent such page widening, often by inserting spaces into excessively long text strings, some forum software fails to take into account that the reader may be using a lower screen resolution (such as on a PDA or mobile phone), a smaller window size or a larger font.
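
The space-insertion countermeasure described above can be sketched as follows. This is a simplified Python example that assumes the post is plain text rather than markup; the function name `break_long_words` and the 40-character limit are illustrative choices, and, as noted above, no fixed limit suits every screen size.

```python
import re


def break_long_words(text: str, max_run: int = 40) -> str:
    """Insert a space after every `max_run` consecutive non-space characters."""
    # Match exactly max_run non-space characters that are followed by
    # at least one more non-space character, and append a space.
    pattern = re.compile(r"\S{%d}(?=\S)" % max_run)
    return pattern.sub(lambda m: m.group(0) + " ", text)
```

Applied to real posts, a filter like this would also split long URLs and any markup in the text, so a forum would normally run it only on the visible text of a message.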
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 