MailScanner

Recently installed an update/add-on to cPanel for Birdhouse Hosting – a package called MailScanner which integrates the usual complement of open source spam and virus controls (SpamAssassin, ClamAV, Razor, DCC) into a combined package, provides more spam config controls for individual hosting accounts, and provides the admin with a bunch of reporting tools. I can now see at a glance (graphically) how many messages are passing through the server each day, what percentage of them have been flagged as spam or viruses, or drill down and get similar reports for individual domains or users. At left: A snapshot of mail and spam traffic on Birdhouse over the past week:

Highlights:
10,000 total messages processed on 10/16
77.8% of mail was flagged as spam today
(read that last one another way: less than 23% of the mail we’re spending money to process and handle is legitimate)

If you’re wistful for the good old days when you could use a “catch-all” address to receive mail bound for anything@yourdomain.com, note: 5,016 out of 6,449 messages received today were addressed to unknown email accounts on domains we handle. Which is why most hosts (including Birdhouse) strongly recommend against using catch-all addresses any more. Spammers 0wnz0r the ozone.

Music: Tom Glazer & Dottie Evans :: Constellation Jig

Extreme Telemarketing

Never thought I’d feel sympathy for a telemarketer, but get an earful of this. My heart goes out to the poor guy. Kind of. Despite the caller’s general craziness, she does raise a point with him that I’ve tried before in conversations with telemarketers: The practice violates the categorical imperative, from which all moral action derives (according to Kant, and I agree):

Act only according to that maxim by which you can at the same time will that it would become a universal law.

In other words, don’t do anything that you don’t think all other people should also be allowed to do in the same situation. In context, one should engage in telemarketing only if one believes that all marketers should be allowed to call people in their homes. Consider the massive amount of advertising around us at all times, and imagine that every advertiser pushed their product by calling people at home. Universalizing the practice of telemarketing to all practitioners would make the telephone utterly useless, since it would never stop ringing, much as e-mail spam has diminished the viability of e-mail (which is only rescued through the application of great piles of technology).

When faced with the categorical imperative (though of course the caller does not call it that :), the telemarketer starts to lie to cover his position, saying that marketers do call his home phone all day every day, and that he doesn’t mind a bit.

Unfortunately, the caller’s philosophically sound position is completely blown out of the water by absolutely insane levels of hysteria.

Music: Sylford Walker :: Deuteronomy

Spam Plants

Romanian-born computer artist Alex Dragulescu turns crap into gold: he’s developed a computational analysis system to transform ordinary spam into renderings of organic-looking plants (though some look more like sea anemones to me). via c|net:


For the Spam Plants, he parsed the data within junk e-mail–including subject lines, headers and footers–to detect relationships between that data. For example, the program draws on the numeric address of an e-mail sender and matches those numbers to a color chart, from 0 to 255. It needs three numbers to define a color, such as teal, so the program breaks down the IP address to three numbers so it can determine the color of the plant. The time a message is sent also plays a role. If it’s sent in the early morning, the plant is smaller, or the time might stunt the plant’s ability to grow.
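
The mapping described in that passage is easy to picture in code. What follows is only a rough sketch of the idea, not Dragulescu's actual program: it assumes the first three octets of an IPv4 address (each of which runs from 0 to 255) are used directly as red, green and blue, and the "early morning" cutoff and shrink factor are made-up values for illustration. The function names are hypothetical too.

```python
import ipaddress
from typing import Tuple


def ip_to_rgb(address: str) -> Tuple[int, int, int]:
    """Use the first three octets of an IPv4 address as an (R, G, B) triple.

    Assumption: the article only says the address is broken down to three
    numbers; taking the first three octets is a guess at how.
    """
    octets = ipaddress.IPv4Address(address).packed  # four bytes, each 0-255
    return octets[0], octets[1], octets[2]


def rgb_to_hex(rgb: Tuple[int, int, int]) -> str:
    """Format an (R, G, B) triple as a web-style hex color."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def size_factor(hour_sent: int) -> float:
    """Shrink plants for mail sent in the early morning.

    Assumption: "early morning" means before 06:00 and halves the size;
    both numbers are invented for this sketch.
    """
    return 0.5 if 0 <= hour_sent < 6 else 1.0


if __name__ == "__main__":
    color = ip_to_rgb("0.128.128.7")
    print(rgb_to_hex(color), size_factor(4))
```

A sender at 0.128.128.7 would come out teal, which happens to be the color the article uses as its example.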

Dragulescu has also done similar projects with architecture, weblog text, transit, etc.

Music: Lou Reed and John Cale :: Nobody But You

reducer: bad ips –> firewall

At the end of my rope with server loads caused by weblog and email spammers. SpamAssassin and Akismet etc. may keep spam away from users, but all that stuff still needs to be processed (and we’re talking about a huge percentage of all traffic).

Recently switched from the APF firewall to ConfigServer’s excellent CSF, which is integrated into WebHost Manager (the admin back-end for cPanel systems), and got thinking — the most heavily trafficked blogs here are already using spam rating systems that track IPs. The right script could harvest and rank those IPs and load them into the firewall in near real-time. Spent the past few evenings building a shell script to do just that.

reducer: Harvests bad IP addresses from multiple sources and adds them to the CSF firewall for cPanel systems. This version works with WordPress and Movable Type weblogs, and optionally the exim ACL deny system. Future versions will scan other sources for bad IPs as well.
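
For anyone curious about the general shape of the harvest-and-rank idea, here is a minimal sketch in Python. It is not the reducer script itself (that's a shell script, and its internals aren't shown here): the function names, the `min_hits` threshold, and the idea of feeding in plain lists of IP strings are all assumptions made for illustration. The only real-world piece is `csf -d`, the CSF command for adding an address to the deny list with an optional comment. The sketch just builds the command strings; it doesn't run them.

```python
import ipaddress
from collections import Counter
from typing import Iterable, List, Tuple


def rank_bad_ips(sources: Iterable[Iterable[str]], min_hits: int = 3) -> List[Tuple[str, int]]:
    """Count each valid IP across all sources and keep the repeat offenders.

    `sources` is any collection of IP-string lists (hypothetical stand-ins
    for whatever each blog's spam tracker records). Unparseable entries
    are skipped. The result is sorted most-seen first.
    """
    counts: Counter = Counter()
    for source in sources:
        for raw in source:
            try:
                ip = str(ipaddress.ip_address(raw.strip()))
            except ValueError:
                continue  # not an IP address; ignore it
            counts[ip] += 1
    return [(ip, hits) for ip, hits in counts.most_common() if hits >= min_hits]


def csf_deny_commands(ranked: List[Tuple[str, int]], comment: str = "reducer") -> List[str]:
    """Build one `csf -d <ip> <comment>` command line per ranked address."""
    return ["csf -d {} {}".format(ip, comment) for ip, _hits in ranked]


if __name__ == "__main__":
    wordpress_ips = ["203.0.113.9", "203.0.113.9", "not-an-ip"]
    movabletype_ips = ["203.0.113.9", "198.51.100.4"]
    for line in csf_deny_commands(rank_bad_ips([wordpress_ips, movabletype_ips])):
        print(line)
```

Requiring several hits before blocking is one way to keep a single mis-rated commenter from being firewalled off the whole server; where to set that bar is a judgment call.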

Update, April 2008: Birdhouse Hosting has been running reducer system-wide for almost two years now, with great success. At this point, we wouldn’t even consider running a hosting business without it.

Download reducer here.

Blogosphere Suffers Spam Explosion

c|net on the increasingly difficult problem of fighting spam on weblogs:

Boing Boing would allow its readers to leave comments and engage in a discussion on the wildly popular blog, if it weren’t for spam.

The piece focuses more on problems bloggers themselves face:

“It is a major hassle,” Frauenfelder said. “It is just getting worse and worse. My fantasies of violent revenge against spammers become more lurid every week.”

than on problems caused for their web hosts, and is a superficial overview in many respects, but it’s good to see some mainstream attention to the problem, which consumes more of my time than I had ever imagined it would.

At this point, I’ve tried every approach under the sun for the Birdhouse bloggers: standard blacklists (a moving target), moderation and authentication (chilling effect on conversation), mod_security blacklists (hard to keep updated, resource intensive), javascript (ultimately hackable), referrer tracking (shuts out commenters behind certain firewalls)…

But I’ve never had it as easy as I have since switching to WordPress and setting up the distributed Akismet system, which has blocked more than 1,000 spams from this blog in the past two weeks without a single false positive, and while requiring very minimal system resources. Sounds like a lot, but some of my users average around one spam/trackback submission attempt per minute, 24×7. You do the math.

Music: The Flaming Lips :: What Is The Light?

Who Gets No Spam?

Lebkowsky posts about his mostly-rosy transition from Outlook to Thunderbird, but wonders why the spam controls aren’t more robust. “… and though the junk mail filters are clearly catching a large percentage of the umpty hundreds of spams that fall into my mail bucket every day, there’s a bunch more that the filters miss.”

What I don’t get is why people are still dealing with daily buckets of spam on the client side at all. It’s been years since most mail hosts began offering excellent server-side spam handling (Birdhouse included). I’ve found the combination of SpamAssassin + ClamAV + RulesDuJour to be tremendously effective. And don’t forget to disable your “catch-all” address, probably the most powerful single spam magnet you can have. After months of not landing a single false positive, I finally stopped using a server-side “Junk box” for monitoring at all – now I just set my spam threshold to 2.5 and let the systems delete spam before it ever hits my server-side mailbox.

Result: About 90% of the mail bound for my addresses is discarded without ever being seen by a human or handled by a mail client. What finally slips through the net is a grand total of about 3-5 spams a day.

On the TWiT podcast, John Dvorak gets teased regularly — by industry experts, no less — for his claim “I get no spam.” What’s so outlandish about that? If you’re still getting spam in your mail client, you probably just need to turn on the controls your mail host probably already has set up for you. And if your mail host doesn’t offer server-side spam controls, find one that does.

Music: Half Man Half Biscuit :: Bottleneck at Capel Curig

SpamLookup

Just installed Brad Choate’s SpamLookup for John Battelle’s MT installation, ditching MT-Blacklist for the time being. Looks simple on the surface, but dig into the options and you start to realize this is the next generation comment/trackback-fighting tool. Actually, it’s a whole toolbelt, including realtime distributed blacklists (which probably accomplish 95% of the dirty work alone), moderation levels and exceptions for various types of commenters, bannable wordlists, and a built-in “Passphrase” feature you can use as a human detector. This last being similar in concept to a captcha, but text-based rather than graphical. The commenter is required to answer a dirt-simple question such as “What is John’s name?,” which a bot would be hard-pressed to do. If I wasn’t having such great success with MT-Keystrokes on birdhouse, I’d install it here as well…

Music: Roland Kirk :: Bag’s Groove

NonJunk

Studying email headers of a spam turdlet that slipped through the net, found this in the headers, trying to pass as header lines added by SpamAssassin:

X-IMAPbase: 1113505409 1 NonJunk
Status: O
X-Status:
X-Keywords: NonJunk

The cat-n-mouse game is never-ending.
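
One way to catch this particular trick is to flag any inbound message that already carries these headers when it arrives. The following is a small sketch of that check using Python's standard `email` module. The header names come straight from the sample above; the function name is made up, and the sketch assumes it runs before any local software has had a chance to add such headers itself.

```python
from email import message_from_string
from typing import List

# Header names taken from the spam sample above. Assumption: none of them
# should be present on mail arriving fresh from the outside world.
SUSPECT_HEADERS = ("X-IMAPbase", "X-Keywords", "X-Status", "Status")


def forged_client_headers(raw_message: str) -> List[str]:
    """Return the suspect header names present in a raw message.

    Header lookup in the email module is case-insensitive, and a header
    with an empty value (like "X-Status:") still counts as present.
    """
    msg = message_from_string(raw_message)
    return [name for name in SUSPECT_HEADERS if msg[name] is not None]


if __name__ == "__main__":
    sample = (
        "From: someone@example.com\n"
        "X-IMAPbase: 1113505409 1 NonJunk\n"
        "Status: O\n"
        "X-Status:\n"
        "X-Keywords: NonJunk\n"
        "\n"
        "Buy now!\n"
    )
    print(forged_client_headers(sample))
```

A non-empty result is only a hint, not proof of spam, so it would make more sense as one input to a score than as a reason to reject outright.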

Music: blur :: country house

Tagging Non-English Spam

Have recently noticed a huge uptick in the amount of non-English (especially Chinese) spam, which slips through the SpamAssassin nets much more readily than English spam (at least it does in most Western SA setups; not sure how different things are for, say, Chinese hosts).

Turns out you can tell SpamAssassin to give higher ratings to messages written in languages other than those you’ve explicitly sanctioned. Higher ratings mean more likelihood of messages getting tossed to /dev/null or saved in a junk box. In your local.cf or user_prefs, just add:

ok_languages en de la th sv

(e.g.) to accept messages in English, German, Latin, Thai, and Swedish. Full list of language codes here. Works a treat.

Music: The Seeds :: 900 Million People Daily

Field Notes on Comment Registration

In order to respond to Birdhouse customers who want an answer to the question: “Why are you enforcing comment registration on Movable Type weblogs? Have you really exhausted all other options?,” I’ve put together this Brief History of Our Battle With Comment Spammers to summarize what we’ve done in the past, why it hasn’t worked, and why we think comment registration is our only remaining recourse.