Friday April 11, 2008
Fear, Uncertainty and Disinformation About The WordPress Exploits and Spam
I've seen a few ill-conceived suggestions that the measures we've taken at Technorati to suspend updates of blogs that appear vulnerable are coercive and should be countered. Let's just put this nonsense aside. When the XML-RPC exploits first caught my attention in February (two months ago), I was seeing five or ten, sometimes a few dozen, blogs per day popping up on our radar with severely unusual publishing characteristics. I talked to Niall and Matt about it, learned about the hole that 2.3.3 fixed, and posted about it on the Technorati blog, urging bloggers to "Patch or Upgrade Your Wordpress Installation, Now."
So here are the bare facts: Around the tail end of March, the problem really snowballed. Kevin Burton put up a series of posts that caught my attention last month, so we started comparing notes. This week in Technorati's crawl data, hundreds and sometimes thousands of vulnerable blogs are showing up hacked every day, regardless of rank, language or posting frequency. Why does this matter? All search systems that index links (Technorati, Google, Yahoo!, Ask, etc.) have to discount the value of pages that are publicly writable. Wikis, un-moderated/un-controlled comments and so forth are invariably spammed, and that degrades the value of those pages. To prevent blogs from being classified as splogs just because they were hacked, we implemented the change announced at the beginning of this week: Vulnerable WordPress Blogs Not Being Indexed.
Please read this carefully: In that post, we said we were going to stop processing the crawls if the blog appeared symptomatic. We never said we were "de-listing" or "banning" blogs, yet there are posts out there twisting the facts to the contrary. Let's address their points head-on:
- Fear: Being New Doesn't Make WordPress 2.5 More Secure
This is Dubya-esque illogical FUD. Nobody ever said "new release"=="secure". The thinking there is: Even if there are no known exploits of 2.5 while there are known exploits of the legacy releases, you should still fear the devil you don't know more than the one you do. Which is unabashed crap. In the case of WordPress, "old release"=="insecure" evaluates to true. Period. Hundreds of blogs or more are proving it every day.
- Uncertainty: WordPress 2.5 is "broken"?
Thousands of blogs are upgrading every day without a hitch. If the WordPress developers broke backwards compatibility for your particular plugins and themes, there are reportedly patches for the other major code-lines in WordPress:
|Code Line||Patched Release|
From what we can tell, the patched releases for the 2.0.x and 2.1.x code lines have had statistically insignificant adoption, which is why we're just suggesting that people upgrade. As far as API compatibility goes, this sounds like a problem that needs to be taken to the WordPress community for resolution. Bloggers should weigh the value they're getting from incompatible plugins against the impact of getting hacked.
- Disinformation: Technorati is "dropping" un-upgraded blogs
We're not "de-listing", "dropping", "disappearing" or anything of the sort. One commenter went so far as to post his own made-up statistics, claiming that we're dropping "85-90% of the blogs published on" WordPress. Totally not the case. The truth is that blogs that are symptomatic will not be updated; they will grow stale in our index until they cease appearing symptomatic. The number of crawls affected is significant but, percentage-wise, in the single digits. Following advice to remove generator tags, insert misleading ones, or apply other "counter-measures" is actually counter-productive. If the suspension evaluation is defeated and the crawl gets processed, an exploited blog will likely fall into our splog classification systems and be mis-flagged; in that case, it really will be disappeared. Why do we allow this to happen? Here's a fact that is known to few who don't work on search systems or who aren't spammers: legitimate blogs get disowned and taken over by spammers all of the time. This happens with lapsed domain registrations, deleted Blogger blogs (Blogger's URLs get recycled), and so forth. Spammers love to get established URLs 'cause they often have page rank and other goodies associated with them. However, once a blog starts publishing spam links, all of the major link processing systems will classify it as a splog, and the value of the URL diffuses and degrades until it eventually drops out of searches.
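To make the suspension logic above concrete, here is a rough Python sketch. It is not Technorati's actual code: the function names, the regular expression, the use of 2.3.3 as the cutoff (patched 2.0.x and 2.1.x releases are ignored here), and the `looks_symptomatic` flag are all assumptions made for illustration. How a crawl is judged "symptomatic" is not described in this post, so it is simply passed in.

```python
import re
from typing import Optional, Tuple

# Assumption: 2.3.3 is used as the first fixed release because the post
# says it fixed the hole. Patched legacy releases are not modeled.
FIRST_FIXED = (2, 3, 3)

# Hypothetical pattern for the WordPress generator meta tag.
GENERATOR_RE = re.compile(
    r'<meta\s+name=["\']generator["\']\s+content=["\']WordPress\s+(\d+(?:\.\d+)*)["\']',
    re.IGNORECASE,
)


def wordpress_version(html: str) -> Optional[Tuple[int, ...]]:
    """Return the advertised WordPress version as a tuple, or None if absent."""
    match = GENERATOR_RE.search(html)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def crawl_decision(html: str, looks_symptomatic: bool) -> str:
    """Return "suspend" for a vulnerable, symptomatic blog, else "process".

    A blog with a removed or falsified generator tag is not recognized as
    vulnerable, so its crawl is processed and reaches whatever spam
    classification comes next.
    """
    version = wordpress_version(html)
    vulnerable = version is not None and version < FIRST_FIXED
    if vulnerable and looks_symptomatic:
        return "suspend"
    return "process"
```

In this sketch, stripping the generator tag from a hacked blog only changes the answer from "suspend" to "process", which is exactly the path that leads to being flagged as a splog.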
I usually restrain myself from responding to trolls, but the impacts we're seeing on the blogosphere are too important to let the fallacies and fear mongering go unchallenged. Don't pay attention to those who are trying to profiteer, making hay about Technorati being "bullies" or trying to "tell people how to blog." That's just outright nonsense. Technorati is not doing anything coercive at all; it's protecting the community by quarantining the infected. Technorati is simply suspending updates on the hundreds of blogs that are popping up as being vulnerable and appearing symptomatic of being hacked. Technorati is a small company seeking to be of service to a very large community. Amidst that community, a lot of bad actors (not the Keanu Reeves kind) are expending considerable effort to hijack the fundamental currency of the real-time web: time and attention. We would be remiss if we didn't expend our efforts to thwart them.
( Apr 11 2008, 10:33:17 AM PDT )