Reverse engineering a referer spam campaign

It looks like someone launched a new referrer spam campaign today; there’s a huge uptick in traffic here. The incoming requests come from all over the internet, presumably from a botnet of hijacked PCs, but all of the links appear to point to a class C network at 85.255.114 somewhere in Ukraine.

It’s interesting to think a little about link spam campaigns and what opportunity the operators hope to exploit. Two major types of link spam on blogs are comment spam and referrer spam. My perception is that comment spam is more common. Most blogs now wrap outgoing links in reader comments with “rel=nofollow” to keep comment links from increasing the Google ranking of the linked items, but the links are still there for people to click on.

Referrer spam is more indirect. It is created by making an HTTP request with the Referer header (the spelling used in the HTTP specification) set to the URL being promoted. Most of the time, this is visible only in the web server log.

Here is a typical HTTP log entry:

87.219.8.210 [04/Feb/2006:15:20:35 -0800]
    GET /weblog/archives/2005/09/15/google-blog-search-referrers-working-now HTTP/1.1
    403 - "http://every-search.com"
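
To pull the referrer out of entries like this one, a regular expression is usually enough. The following is a minimal sketch that assumes the whitespace-separated layout shown above (not necessarily the standard Apache combined format); the field names and the parse_entry helper are my own labels for illustration.

```python
import re

# Pattern for the layout of the sample entry above (an assumption, not a standard):
# ip [timestamp] METHOD path protocol status size "referrer"
LOG_ENTRY = re.compile(
    r'(?P<ip>\S+)\s+'
    r'\[(?P<timestamp>[^\]]+)\]\s+'
    r'(?P<method>[A-Z]+)\s+(?P<path>\S+)\s+(?P<protocol>\S+)\s+'
    r'(?P<status>\d{3})\s+(?P<size>\S+)\s+'
    r'"(?P<referrer>[^"]*)"'
)


def parse_entry(entry):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = LOG_ENTRY.search(entry)
    return match.groupdict() if match else None


sample = (
    '87.219.8.210 [04/Feb/2006:15:20:35 -0800] '
    'GET /weblog/archives/2005/09/15/google-blog-search-referrers-working-now HTTP/1.1 '
    '403 - "http://every-search.com"'
)
fields = parse_entry(sample)
print(fields["status"], fields["referrer"])
```

Because the fields are separated with \s+, the same pattern should also match an entry that is split across lines or tab-delimited, as long as the field order is the same.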

Some blogs and other web sites post an automatically generated list of “recent referrers” on their home page or on a sidebar. In normal use, this would show a list of the sites that had linked to the site being viewed. Recent referrer lists are less common now, because of the rise of referrer spam.

Referrer spam will also show up in web site statistics and traffic summaries. These are usually private, but are sometimes left open to the public and to search engines.

One presumed objective of a link spam campaign is to increase the target site’s search engine ranking. In general this requires building a collection of valid inbound links, preferably without the “nofollow” attribute. Referrer spam may be more effective for generating inbound links, since recent referrer lists and web site reports typically don’t wrap their links with nofollow.
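
For a site that does publish a referrer list, one way to remove the incentive is to mark those links the same way comment links are marked. Here is a rough sketch of that idea; the referrer_link helper is hypothetical and not taken from any particular blog package or stats tool.

```python
from html import escape


def referrer_link(url):
    """Render one referrer URL as an HTML link carrying rel="nofollow"."""
    # Escape the URL so a crafted Referer value cannot inject markup.
    safe = escape(url, quote=True)
    return '<a href="{0}" rel="nofollow">{0}</a>'.format(safe)


print(referrer_link("http://every-search.com"))
```

Escaping matters here because the Referer value is entirely under the sender's control.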

The landing pages for the links in this campaign are interesting in that they don’t contain advertising at all. This suggests that this campaign is trying to build a sort of PageRank farm to promote something else.

The actual pages are all built on the same blog template, and contain a combination of gibberish and sidebar links to subdomains based on “valuable” keywords. Using the blog format automatically provides a lot of site interlinking, and they also have “recent” and “top referer” lists, which are all from other spam sites in the network.

It looks like the content text should be easy to identify as spam based on frequency analysis. Perhaps having a very large cloud of spam sites linking to each other along with a dispersed set of incoming referrer spam links makes the sites look more plausible to a search engine? These sites don’t appear to have any, but I have come across other spam sites and comment spam posts that have links to non-spam sites such as .gov and .edu sites, perhaps trying to look more credible to a search engine ranking algorithm. All the sites being on the same subnet makes them easier to spot, though.

Given that there aren’t that many public web site stat pages and recent referrer lists around, I’m surprised that referrer spamming is worth the effort. If the spam network can achieve good ranking in Google and the other search engines, the operators can probably boost the ranking for a selected target site by pruning back some of their initial links and adding some links pointing at the sites that they want to promote. Affiliate links to porn, gambling, or online pharmacy sites must pay reasonably well for this to work out for the spammers.

More reading: A list of references on PageRank and link spam detection.

If you’re having referrer spam problems on your site, you may find my notes on blocking referer spam useful.

Here’s some sample text from “search-buy.com”:

I search-buy over least and and next train. Ne so at cruelty the search-buy in after anaesthesia difficulty general urinating. T pastry a ben for search-buy boy. An refuses trip search-buy romances seemed azusa pacific university ca. Stoc of my is and search-buy direct having sex teen titans. Kid philadelphiaa would and york search-buy. G search-buy wore shed i dads. obstacles future search-buy right had satire nineteenth. The that i ups this on search-buy least finds audio express richmond. have this window been wonderful me search-buy so. Surel in actually search-buy our boy deep franklin notions. An search-buy it of my has of. To at head boy that a search-buy. O james search-buy everywhere of but. Alread originate search-buy good about since.
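
As a rough illustration of the frequency analysis idea, the sketch below finds the most frequent token in a piece of text and its share of all tokens. In ordinary English prose the top token is typically a function word such as “the”; in the sample above it is the promoted keyword. The top_token_share helper and the tokenizing rule are my own assumptions, and a real spam classifier would need much more than this.

```python
import re
from collections import Counter


def top_token_share(text):
    """Return (token, count, share) for the most frequent token in text."""
    # Lowercase, then keep runs of letters, digits and hyphens as tokens.
    tokens = re.findall(r"[a-z0-9-]+", text.lower())
    if not tokens:
        return None, 0, 0.0
    token, count = Counter(tokens).most_common(1)[0]
    return token, count, count / len(tokens)


# First three sentences of the sample text above.
excerpt = (
    "I search-buy over least and and next train. "
    "Ne so at cruelty the search-buy in after anaesthesia difficulty general urinating. "
    "T pastry a ben for search-buy boy."
)
token, count, share = top_token_share(excerpt)
print(token, count, round(share, 3))
```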

Here are a few spam sites from this campaign and their IP addresses:

bikini-now.com          A       85.255.114.212
babestrips.com          A       85.255.114.229
search-biz.biz          A       85.255.114.245
bustytart.com           A       85.255.114.250
cjtalk.net              A       85.255.114.227
search-galaxy.org       A       85.255.114.252
moresearch.org          A       85.255.114.237
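
Grouping the addresses by /24 makes the clustering obvious. This is a small sketch using Python's standard ipaddress module; the SITES table is copied from the list above, and group_by_subnet is a name I made up for the example.

```python
import ipaddress
from collections import defaultdict

# Hosts and addresses copied from the list above.
SITES = {
    "bikini-now.com": "85.255.114.212",
    "babestrips.com": "85.255.114.229",
    "search-biz.biz": "85.255.114.245",
    "bustytart.com": "85.255.114.250",
    "cjtalk.net": "85.255.114.227",
    "search-galaxy.org": "85.255.114.252",
    "moresearch.org": "85.255.114.237",
}


def group_by_subnet(sites, prefix_len=24):
    """Map each enclosing network (as a string) to the hosts inside it."""
    groups = defaultdict(list)
    for host, addr in sites.items():
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(network)].append(host)
    return dict(groups)


groups = group_by_subnet(SITES)
for network, hosts in groups.items():
    print(network, len(hosts))
```

All seven hosts land in a single group, 85.255.114.0/24.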

Here is the WHOIS output for that netblock:

% Information related to '85.255.112.0 - 85.255.127.255'
	
inetnum:        85.255.112.0 - 85.255.127.255
netname:        inhoster
descr:          Inhoster hosting company
descr:          OOO Inhoster, Poltavskij Shliax 24, Kharkiv, 61000, Ukraine
remarks:        -----------------------------------
remarks:        Abuse notifications to: abuse@inhoster.com
remarks:        Network problems to: noc@inhoster.com
remarks:        Peering requests to: peering@inhoster.com
remarks:        -----------------------------------
country:        UA
org:            ORG-EST1-RIPE
admin-c:        AK4026-RIPE
tech-c:         AK4026-RIPE
tech-c:         FWHS1-RIPE
status:         ASSIGNED PI
mnt-by:         RIPE-NCC-HM-PI-MNT
mnt-lower:      RIPE-NCC-HM-PI-MNT
mnt-by:         RECIT-MNT
mnt-routes:     RECIT-MNT
mnt-domains:    RECIT-MNT
mnt-by:         DAV-MNT
mnt-routes:     DAV-MNT
mnt-domains:    DAV-MNT
source:         RIPE # Filtered
	
organisation:   ORG-EST1-RIPE
org-name:       INHOSTER
org-type:       NON-REGISTRY
remarks:        *************************************
remarks:        * Abuse contacts: abuse@inhoster.com *
remarks:        *************************************
address:        OOO Inhoster
address:        Poltavskij Shliax 24, Xarkov,
address:        61000, Ukraine
phone:          +38 066 4633621
e-mail:         support@inhoster.com
admin-c:        AK4026-RIPE
tech-c:         AK4026-RIPE
mnt-ref:        DAV-MNT
mnt-by:         DAV-MNT
source:         RIPE # Filtered
	
person:         Andrei Kislizin
address:        OOO Inhoster,
address:        ul.Antonova 5, Kiev,
address:        03186, Ukraine
phone:          +38 044 2404332
nic-hdl:        AK4026-RIPE
source:         RIPE # Filtered
	
person:       Fast Web Hosting Support
address:      01110, Ukraine, Kiev, 20A, Solomenskaya street. room 201.
address:      UA
phone:        +357 99 117759
e-mail:       support@fwebhost.com
nic-hdl:      FWHS1-RIPE
source:       RIPE # Filtered
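
The inetnum range in the WHOIS record can be turned into CIDR notation and used to test whether a given address belongs to this netblock. This is a minimal sketch with the standard ipaddress module; in_netblock is a hypothetical helper name, and the two test addresses are taken from earlier in this post.

```python
import ipaddress

# Range taken from the inetnum line of the WHOIS output above.
first = ipaddress.ip_address("85.255.112.0")
last = ipaddress.ip_address("85.255.127.255")

# Collapse the range into the smallest list of CIDR networks that covers it.
blocks = list(ipaddress.summarize_address_range(first, last))
print(blocks)


def in_netblock(addr, networks=blocks):
    """Return True if addr falls inside any of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)


print(in_netblock("85.255.114.212"))  # one of the spam sites listed above
print(in_netblock("87.219.8.210"))    # the requesting client from the log entry
```

The range collapses to a single /20, so the spam sites sit in one /24 of a larger hosting allocation, while the requesting client is outside it.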

    © 2004-2008 Ho John Lee