The weblog comment spam problem has implications beyond crowded inboxes for users. Even with tools such as the incredible MT-Blacklist (which has blocked or moderated tens of thousands of comment spams on birdhouse-hosted blogs in the past few months), each incoming comment still requires a CGI process and a database query. When the spambots launch their massive onslaughts, shared hosting environments reel from the resource requirements. The problem has reached a critical threshold, and the muckety mucks at Six Apart are coming out of the woodwork to address it head-on:
Jay Allen (author of MT-Blacklist and Product Manager at Six Apart) and Anil Dash (big cheese at Six Apart) have both posted “official” positions on MT comment spam in the past few days.
So it looks like patches will be released in the next few days to address the biggest issues for web hosts. I like the fact that they’re approaching this not just as an MT problem but as an issue that affects all online discussion forums. The key to satisfying frustrated web hosts will be in creating a solution that can somehow block comment spam blitzkriegs without having to make a CGI and/or database call for every incoming request. It’s a hard problem to solve.
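To make the "no CGI, no database" idea a little more concrete, here is a minimal sketch of one possible approach: a per-IP throttle that lives entirely in memory and would sit in front of the comment script. This is not what Six Apart or MT-Blacklist does; the class name `CommentThrottle`, the limits, and the sample IP addresses are all assumptions made up for illustration.

```python
import time
from collections import defaultdict, deque


class CommentThrottle:
    """Hypothetical in-memory throttle: at most max_hits comment
    attempts per IP address within a sliding window of seconds."""

    def __init__(self, max_hits=5, window_seconds=60.0):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        # Maps each IP to the timestamps of its recently accepted attempts.
        self._hits = defaultdict(deque)

    def allow(self, ip, now=None):
        """Return True if this attempt may proceed to the comment script."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        cutoff = now - self.window_seconds
        # Forget attempts that have aged out of the window.
        while hits and hits[0] <= cutoff:
            hits.popleft()
        if len(hits) >= self.max_hits:
            # Over the limit: reject without touching CGI or the database.
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    throttle = CommentThrottle(max_hits=3, window_seconds=10.0)
    for second in range(5):
        print(second, throttle.allow("203.0.113.7", now=float(second)))
```

Only accepted attempts are counted here, so a bot that keeps hammering is let back in once its earlier hits age out of the window. A distributed spam run coming from thousands of different addresses would sail right past a per-IP check like this one, which is a big part of why the problem is hard.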
Update: Very good read on the many aspects and dimensions of comment spam load issues over at photodude. Throwing more hardware at the problem doesn’t make it go away (drooling over the server described there). Long comment section, also worth reading. One comment on the question of whether dynamic or statically generated sites fare better under this kind of load:
Also, last month, my husband and I shut down WordPress on the colo server we share with 3 other people, because … hits from comment spammers were making everything so slow. So we installed prerendering, which, if I’m reading this correctly, takes away the advantage of WP being dynamic(?) [right – this would make a dynamic site behave like a static site; you can’t win. -SFH].
This problem is becoming one pain in the arse. After some spammer decided to start WWIII and bring the server my website lives on to the ground a couple of weeks ago, I was forced to
a) leave comments turned off ..or
b) come up with some way to apply enough duct tape to the problem to make it seem like it is fixed.
I have managed to almost completely eliminate the spam messages that get through just by renaming the mt-comments.cgi file and adjusting the mt.cfg file to point to the new comment script.
The question that remains in my head is .. why can’t 6a just add some sort of human verification to the comments.. you know.. those little pictures with garbled up numbers and letters or words that you have to copy into a field in order to post. I think that would be a sound method.
Hey Zach — nobody is alone in this one, that’s for sure.
Renaming the comment and trackback scripts seems to work for a few days or weeks, but trust me – the spammers will update their databases with the new URLs soon enough.
There is a “Captcha” plugin for MT like the one you describe, though I’ve heard it’s a bit buggy…
Prior to my ultimate choice in going with WordPress, I was really impressed with James Seng’s Bayesian MT plug-in that he devised in response to all the people complaining about the accessibility-hostile Captcha plug-in he also wrote. I was so convinced by the argument made for it, I almost went with the then-just-announced fork he was doing of Drupal in the hopes that he would incorporate the Bayesian feature into his fork (OK, that last part sounds just plain weird). To this very day, I’m still looking for a learning plug-in for WP.
Although, in going back to copy & paste the URL above, I found the more-recent post Seng wrote about the problems with the Bayesian plug-in. *Sigh*
It’s amazing how many approaches there are to these problems, and how they all have problems. No silver bullet, except to fold up shop and walk away (abstinence training anyone?)
There will come a day when comment spam can be dealt with as effectively as email spam.