In news almost too post-modern to digest, Scott Richter, the Internet’s third-busiest spammer, has decided that his spammy brand has become recognizable enough to be marketable in its own right. To capitalize, he’s launching a line of SpamKing clothing: initially shirts, hats, and panties bearing phrases such as “Just opt out” and “Click it.” You can take a wild flying guess how he’ll be promoting the line. Adding another layer to the strangeness, Richter is not generally referred to as “The Spam King.” That honor usually goes either to Bill Waggoner or Alan Ralsky, or, going farther back, Sanford Wallace. So he’s apparently usurping the title from his fellow spammers.
Installed ClamAV virus definition scanner — an open source virus detection module to be used in conjunction with mail transfer agents. cgpav provides the glue to use clamd in conjunction with CommuniGate Pro. freshclam updates the virus definition tables hourly.
Attention! You sent an infected message with the VIRUS: Eicar-Test-Signature It was rejected for delivery.
With the addition of Razor, very little spam is getting through my gateway — Razor made an incredible difference (as I expected it would, since it’s human/collaborative). The remaining gravel in the shoe is all of the autoresponder fallout from MyDoom.
Hair On Your Own Back
I like the spams where they use banal or left-field subject lines to fool you into reading the spew. I’ve been saving some of them up. In the past week:
“Are you a junky?” = Viagra
“Do you want a bagel?” = Get plump, sexy lips in under 30 days
“Monotheism” = A harder, longer man-thing
“The pre-storm darkness” = Free cable TV
Dyslexic tendencies, sometimes with humorous consequences. Just scanning the spam box and saw one that I was sure read “Get hair on your own back!” Upon closer look, turned out to be: “Get your own hair back.”
Spam Auto-Kill Count
Birdhouse Hosting uses SpamAssassin in conjunction with CGPSA to tag all inbound email message headers and delete msgs that meet a given threshold for spammy-ness before they’re ever downloaded by customers. CommuniGate logs are set to roll over every 7 days. Wrote a simple script that queries the CGP logs for discard events and outputs the result count to an include which you’ll now find in the “Tech Crap” section to the right (and in this post). This number will grow as A) the proportion of spam in the wild continues to grow and B) the number of birdhouse customers using the auto-kill feature grows.
Spams auto-killed by CGPSA in the past 7 days: (refreshed hourly).
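For the curious, a counting script along these lines can be very short. This is a minimal sketch, not the script actually running on birdhouse: the `discard` marker, the `*.log` file naming, and the function names are all assumptions, so the pattern would need adjusting to whatever CommuniGate and CGPSA really write to the logs.

```python
import re
from pathlib import Path

# Assumption: discarded messages show up as log lines containing "discard".
# The real CommuniGate/CGPSA wording may differ; change the pattern to match.
DISCARD_PATTERN = re.compile(r"discard", re.IGNORECASE)


def count_discards(log_dir, pattern=DISCARD_PATTERN, glob="*.log"):
    """Count lines matching `pattern` across every matching file in `log_dir`."""
    total = 0
    for log_file in sorted(Path(log_dir).glob(glob)):
        # errors="replace" keeps odd bytes in mail logs from aborting the count
        with log_file.open(errors="replace") as handle:
            total += sum(1 for line in handle if pattern.search(line))
    return total


def write_include(count, include_path):
    """Write the bare count to a file that a page template can include."""
    Path(include_path).write_text(f"{count}\n")
```

Run hourly from cron, something like `write_include(count_discards("/path/to/logs"), "/path/to/count.inc")` would keep the number fresh. Because the logs roll over every 7 days, the count naturally covers a 7-day window without any date arithmetic.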
I’ve complained about comment spam before, but the problem has really swollen out of all proportion over the past two weeks. Because the phenomenon is relatively new, Movable Type has no simple mechanism for handling it, other than to ban IPs (or entire triplets). Deleting comments and rebuilding posts is cumbersome.
This weekend, one of the J-School’s blogs, bIPlog, got hit hard, and a student spent hours deleting Lolita comments. In the nick of time, Jay Allen released MT-Blacklist, which totally supersedes his previous MTMacro solution. Comes with a database of 450 known evil URLs and ability to post your updated blacklist to a known location for automated sharing. Also modifies the comment emails that MT generates to include an additional “de-spam” link – clicking it lets you delete the comment, rebuild the page, and add the spammer to your blacklist all at once. If you’re running multiple blogs from one installation, you can turn MT-Blacklist on or off for any arbitrary subset of them.
Installing MT-Blacklist on birdhouse and on the J-School today felt triumphant — as if the whole episode had been a battle between good and evil, and evil was winning… until the Megatron DDT Squirtation Assembly arrived to vaporize all the cock-a-roaches.
I would like to have Jay Allen’s baby. He is a god.
On the normal spam front, I like this idea: Filters That Fight Back. Short version: create client-side spam filters that purposely follow/spider every link in a spam. Spammer sends out a million emails an hour, they get a million hits an hour in return. “The branch snaps back in their face.” Punish them with the traffic they’re looking for. Crush them with it. Very Tae Kwon Do.
Killing Comment Spam Dead
Started receiving comment spam on this weblog around May. Began as a curiosity, but eventually grew into an annoyance (interestingly, comment spammers were inordinately targeting that very post, over and over again, as if out of spite). But in the past few weeks, it’s become a major hassle. Unlike email spam, dealing with comment spam in MT requires visiting the IP Ban section of the back-end and entering the commenter’s first two triplets (to account for dynamic IP assignment, at the risk of banning some innocents), searching for the entry, deleting the bum comments, and rebuilding the entry.
But recently Michael Bazely (who is, coincidentally, Oaklandish!) pointed out Jay Allen’s ingenious confabulation of a few freely available MT plugins and a couple of tweaks to the default comment template, all of which conspire to provide an MT blacklist for comment spammers.
The beauty part is that Jay’s system doesn’t ban arbitrary objects like IP numbers — it goes for the jugular by banning what comment spammers really want to appear — their URLs. We’ll see how it goes, but initial tests show it working perfectly.
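To make the URL-banning idea concrete, here is a rough sketch of the check. It is not Jay’s code and not how the MT plugins are wired together; the blacklist entries, the regex, and the function name are invented for illustration.

```python
import re

# Hypothetical blacklist entries (regex fragments). A real list would be
# loaded from a maintained file rather than hard-coded.
BLACKLIST = [r"cheap-pills\.example", r"casino-[a-z]+\.example\.com"]

# Loose URL matcher: good enough for a sketch, not a full URL parser.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def blocked_urls(comment_text, author_url="", blacklist=BLACKLIST):
    """Return every URL in the comment (or author field) that hits the blacklist."""
    candidates = URL_RE.findall(comment_text)
    if author_url:
        candidates.append(author_url)
    patterns = [re.compile(entry, re.IGNORECASE) for entry in blacklist]
    return [url for url in candidates if any(p.search(url) for p in patterns)]
```

A comment would be rejected whenever `blocked_urls(...)` comes back non-empty. The commenter’s IP never enters into it, so a spammer hopping between addresses gains nothing as long as the URL being pushed stays the same.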
Old hat, but thought I’d throw a monkeywrench into the spammer’s game with a local dose of wpoison. Back in ’97, a spammer told Wired that this stuff didn’t work – that his Extractor bot could add 4,000 – 5,000 bounces an hour to a rejects list. But this script is infinitely recursive — unless the spambots are sufficiently clever, they should get caught in it indefinitely. And if enough people ran similar pages, surely it would make some difference. Yeah, I know, wishful thinking. Ah well, it’s free and can’t hurt.
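The trick is simple enough to sketch. What follows is not wpoison’s actual code, just an illustration of the idea under my own assumptions: every page holds some bogus addresses plus links to more generated pages, so a harvester that follows links never reaches the end. The `/poison/` path and the function names are made up, and the addresses use the reserved `.example` domain so nobody real gets hit.

```python
import random
import string


def fake_word(rng, low=4, high=9):
    """Return a random lowercase word between `low` and `high` letters long."""
    length = rng.randint(low, high)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def poison_page(rng, addresses=5, links=3):
    """Build one HTML page of bogus mailto links plus links to further pages."""
    lines = ["<html><body>"]
    for _ in range(addresses):
        addr = f"{fake_word(rng)}@{fake_word(rng)}.example"
        lines.append(f'<a href="mailto:{addr}">{addr}</a><br>')
    for _ in range(links):
        # Every link leads to another generated page, which is what makes
        # the trap recursive for a bot that follows links blindly.
        lines.append(f'<a href="/poison/{fake_word(rng)}">{fake_word(rng)}</a><br>')
    lines.append("</body></html>")
    return "\n".join(lines)
```

Serving `poison_page(random.Random())` for any URL under `/poison/` would be enough to close the loop.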
The new era of weblog comment spam is upon us.
Google determines rank in search results depending on # of incoming links from other sites. Posting a comment w/URL on someone else’s site causes Google to “like” the commenter’s site more. So essentially someone is hijacking my comments system (and probably lots of other blogs’ comments systems) to abuse Google’s algorithms.
Back when Alta Vista was the King of Search, META keyword stuffing was the primary mechanism of search rank abuse. Google had seemed to put an end to that, but where there’s a will… Clever. Though stupid that they would drop both fake comments on a single post out of nearly a thousand, three months apart. Also, it’s hard to imagine these not looking suspicious to any blog owner.
Update, 10/15/03: As it turns out, this has become the single most spammed-upon entry on all of birdhouse. If you are reading this without having had to wade through tons of spam, it’s because I’ve deleted tons of them manually, and (later) because excellent tools such as MT-Blacklist have made dealing with rising blog spam much more manageable.
SpamAssassin / Vipul’s Razor
Realtime Blacklists were working very well (I had seen no false positives in weeks, and 90% of spam was rejected at the gate), but a customer complained that mail they actually wanted was being rejected as spam (this is what happens when some of your customers are marketing types). No false positives allowed. Disabled RBLs a week ago, then set up SpamAssassin via CGPSA (SpamAssassin as a CommuniGate Pro module). Tonight added Vipul’s Razor to the mix, which works by keeping track of what humans around the world consider to be spam. So the SA/Razor combination is essentially a machine detection plus human detection method. Will need to let it run and tweak the tolerances a bit, but if all goes well, this should both stem the spam spigot in my own inbox and give customers the ability to do the same.
Another difference between this methodology and the RBL technique is that I am no longer globally rejecting spam at the server level no matter how high its score — now that we have proper tagging, customers can configure the server to delete their own spam at the server level, or let it pass through tagged and delete it at the client level. Elegant.
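Here is roughly what the per-customer decision looks like once messages are tagged. This is a sketch, not CGPSA’s or CommuniGate’s rule engine: it assumes the tag is an `X-Spam-Status` header carrying `score=` (older SpamAssassin releases wrote `hits=`), and the `delete_at` threshold and function names are my own.

```python
import re
from email import message_from_string

# Assumed header shape: "X-Spam-Status: Yes, score=7.2 required=5.0 ...".
# Older SpamAssassin versions used "hits=" instead of "score=", so accept both.
SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message):
    """Return the tagged score as a float, or None if the message is untagged."""
    msg = message_from_string(raw_message)
    match = SCORE_RE.search(msg.get("X-Spam-Status", ""))
    return float(match.group(1)) if match else None


def action_for(raw_message, delete_at=None):
    """Return "delete" or "deliver" for one customer's threshold.

    delete_at=None means the customer has not opted in to server-side
    deletion, so everything is delivered (still tagged).
    """
    score = spam_score(raw_message)
    if score is None or delete_at is None:
        return "deliver"
    return "delete" if score >= delete_at else "deliver"
```

The point of the design is that the server only scores; each customer picks their own `delete_at`, or none at all, so one person’s spam can still be another person’s wanted mail.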
After a week of using sbl.spamhaus.org and bl.spamcop.net as realtime blackholes via CommuniGate, prevented around 50 msgs/day from getting through to user accounts without a single false positive. Had been worried about false positives, but feel good about this. Still letting 10-20% of spam through though, so still want to set up SpamAssassin when time allows.
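For anyone wondering what a realtime blackhole lookup actually involves: the sending IP’s octets are reversed, the list’s zone is appended, and the resulting name is looked up in DNS. If it resolves, the address is listed. CommuniGate does this internally; the sketch below only illustrates the mechanism, and the function names are mine.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets plus the list zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # raises on a bad address
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """Return True if the blacklist zone has an entry for `ip`.

    Any resolution failure is treated as "not listed", which also covers
    DNS outages; a production filter would want to tell those apart.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

So `dnsbl_query_name("192.0.2.99", "sbl.spamhaus.org")` gives `99.2.0.192.sbl.spamhaus.org`, and checking a sender against both lists is just `any(is_listed(ip, z) for z in ("sbl.spamhaus.org", "bl.spamcop.net"))`.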