If you track some of the ultracompetitive search spaces, you might have noticed some unusual results over the last year. Perhaps you’ve seen historical sites show up in financial spaces, or a beat box site show up for car insurance. Oftentimes we’ll point to these errors as evidence that Google has lost a step, or that our client’s site should be there instead of that garbage. Both might be true, but there’s a darker menace at play.
Rather than just throw up our hands, RankHammer decided to work with webmasters from coast to coast to try and figure out exactly what’s going on. We’ve seen a truly nasty ring of compromised WordPress and Joomla sites interlinking in a way that is tricking Google into ranking seemingly unrelated sites for highly competitive terms. Given the illegal nature of this activity, one can only imagine how poorly customer information would be treated by the eventual “beneficiary” of the hack.
Some people will note that hacking sites to embed links is nothing new. However, I’ll argue that this level of sophistication is very new. Victim sites become both link spreader and value harvester in a way I haven’t seen.
Warning – Technical SEO lies ahead.
How the hack works
The hack works by examining the incoming user agent and referrer, and varying the content accordingly. For example, a default user agent coming from Facebook would see “normal” content. One case where the content changes is when the user agent is a search robot. In this case, a massive number of “money” terms are inserted, as are links both internal and external. Every one of the external links I followed went to either a current or past victim site. In essence, a link wheel of massive proportions is being created, with some ironic trust passed in from the otherwise clean victims.
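One way to check a site for this kind of cloaking is to fetch the same page twice, once with a browser user agent and once with a crawler user agent, and compare the links in the two responses. The sketch below covers only the comparison step. The function names are my own, and it assumes the hack keys on the user agent string alone; a version that also checks the crawler's IP address would not be caught this way.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def extract_links(html):
    """Return the set of link targets found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def bot_only_links(normal_html, bot_html):
    """Links present in the crawler version of a page but not the normal one."""
    return extract_links(bot_html) - extract_links(normal_html)


def external_hosts(links, own_host):
    """Hostnames of links that point away from own_host (relative links are skipped)."""
    hosts = {urlparse(link).hostname for link in links}
    return {host for host in hosts if host and host != own_host}
```

To use it, download the page body with each user agent (for example with `urllib.request.Request(url, headers={"User-Agent": ...})`) and pass both bodies to `bot_only_links`. Any external hosts that appear only in the crawler version are worth a closer look.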
The hack is harvested when it sees a default user agent arriving from a search engine on a target term. It will then enter a series of redirects before landing on a lead generator page. Here is an example of the extent of the redirection. The sites in the middle are blacked out for one reason: I’m not certain who is a victim, who is tracking and who is the culprit. I don’t want to point fingers; I want to fix the problem.
As an aside for SEO geekery, the redirects are 302s, not 301s, when the hack is in monetization mode. While it shouldn’t matter, since they’re being shown only to “normal” users and not search bots, I personally wonder if it’s part of what makes this tricky for the engines to detect.
If you are actually logged into Google, or searching from a secured location, your search term isn’t passed. Again, the site will look normal.
There is not just one intermediary site involved. I’ve seen sites registered from China to Russia, private and public. I know there are a lot of people who just want traffic and don’t ask questions. It’s quite possible that the final site in the chain has no idea where the traffic is sourced.
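To see the redirect chain and its status codes for yourself, you can follow it one hop at a time instead of letting a browser do it silently. This is a rough sketch, not a polished tool: `trace_redirects` and `http_fetch` are names I made up, and the user agent and referrer values you pass in are assumptions you would need to tune to whatever triggers the hack on a given site.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def trace_redirects(url, fetch, max_hops=10):
    """Follow redirects one hop at a time, recording each status code.

    fetch(url) must return (status_code, location_header_or_None).
    """
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return chain


def http_fetch(url, user_agent, referer):
    """Issue a single GET without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        headers = {"User-Agent": user_agent, "Referer": referer}
        conn.request("GET", path, headers=headers)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

A call would look something like `trace_redirects(url, lambda u: http_fetch(u, my_user_agent, my_referer))`, where the referrer is a search results URL containing one of the target terms. Each entry in the returned list is a URL and the status code it answered with, so a run of 302s is easy to spot.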
For one, I can say that I’ve been offered SEO traffic on a PPC basis before. This might not be the only way to make that kind of offer to a client, but it’s certainly one of them.
Cleaning up the Mess
Almost all of the versions of the hack that I’ve seen follow a similar form. The .htaccess file will have been compromised, and will need to be replaced. Before you do so, look around in the file. It will likely call a few unusual names and one seemingly innocuous one.
The unusual names take the form of things like 57113.php or .lib_o3vhce.php (hidden). However, the one that needs to be treated with kid gloves is the one that seems the most innocent. A file named images.php will have been inserted. This is actually the file that contains the problem, in the form of an encoded script. This script pulls across the payload of who to send where and how much junk to embed. If you clean everything else and leave that file, the infection will return.
It’s also probable that the config or main index files will have been infected. Check and replace those too, looking for any unusual files that they call. One person reported finding a .backdoor.list file or something similar. At least their hackers were kind enough to leave a hint on how to clean.
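If you have shell or FTP access, a quick scan of the web root can surface files that match these patterns. The sketch below is based only on the names mentioned above. The content check is my own assumption: I don't know exactly how the script is encoded, so it simply looks for PHP calls that encoded scripts commonly use, and it will flag some legitimate files too. Treat the results as a list of things to inspect by hand, not a verdict.

```python
import os
import re

NUMERIC_PHP = re.compile(r"^\d+\.php$")   # e.g. 57113.php
HIDDEN_PHP = re.compile(r"^\..+\.php$")   # e.g. .lib_o3vhce.php
# Assumption: encoded payloads tend to lean on calls like these.
ENCODED_HINT = re.compile(rb"eval\s*\(|base64_decode\s*\(|gzinflate\s*\(")


def suspicious_files(root):
    """Walk root and return (path, reasons) for files worth a manual look."""
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            reasons = []
            if NUMERIC_PHP.match(name):
                reasons.append("numeric name")
            if HIDDEN_PHP.match(name):
                reasons.append("hidden php")
            if name == "images.php":
                reasons.append("images.php")
            if name.endswith(".php"):
                try:
                    with open(path, "rb") as handle:
                        if ENCODED_HINT.search(handle.read()):
                            reasons.append("encoded-script hint")
                except OSError:
                    pass  # unreadable file; skip the content check
            if reasons:
                findings.append((path, reasons))
    return findings
```

Running `suspicious_files("/path/to/webroot")` returns each flagged path along with the reasons it was flagged. Compare anything it finds against a clean copy of your CMS before deleting.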
I wish I had an easy answer here, but I don’t know the exact path of infection, just the ways to clean it up. This particular hack seemed to affect only out-of-date installations of both content management systems. The first step is obviously keeping everything up to date. Beyond that, I’m not the expert here. I will refer you to a few places where I’ve found useful tips:
Above all, I’d love to hear in the comments further ideas for prevention, more complete removal instructions, or other experiences with this type of hack.