Me Versus Spam: Mortal Combat
March 04, 2007
One thing that I continually have to fight is spam. Whether it comes in through my email or through a contact form, it seems like I’m having to fight it more often these days. So I’m finally getting around to writing the article I should have written months ago: the steps I’m taking to fight spam from my contact forms.
First Things First:
I’m not necessarily going to write about how to block spam in your email; I figure there are plenty of articles that focus on that topic. Where I can be the most helpful is by providing the PHP that I use to stop the majority of the spam I get. I’m also going to include the piece of PHP that I recently added to keep people from using my contact form as a way to mass spam the world. I’m sure there’s a better way to do what I’m going to write about; this is just my way of doing it. I’m not a PHP master and I can’t guarantee my code is going to work for your setup. So, with that being said, let’s continue.
Contact Form Trigger Mechanism.
The contact form I use has two pieces: the page where you enter the data and the page that processes the data. The first page, in my case, is a simple form that gathers the visitor’s name, email address, subject and message. Once they have entered their information, the second page processes the content and drops it into an email to be sent to my mailbox. Simple as that should be, it’s always more complicated in the real world. The complexity comes when people submit a variety of nonsensical data intended to make me click a link and buy something that I don’t want. The way I’ve started fighting this is by adding a little data checking. Obviously a person isn’t going to need a URL in their name, email, or subject. The only appropriate place for a URL is possibly within the content of the message. So I typically write a few lines of PHP like the ones below, where $name holds the value of the name input box from the submission form (but you probably figured that):
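// eregi() was removed in PHP 7; stripos() is an equivalent case-insensitive substring check.
if (stripos($name, "http://") !== false) {
    die("Sorry, we cannot process this form as completed, please try again.");
}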
This basically says: check the variable $name, and if it contains the string http:// anywhere, stop running. You can use this on any variable that should not have a URL inside it. I have found that usually one or two of these checks in random places will stop the majority of spam coming in through the form.
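If you have several single-line fields, the same check can be wrapped in a small function. This is only a rough sketch, not the code from my form: the function name contains_url and the sample values are made up for illustration, and I'm assuming PHP 7 or later, where stripos() stands in for eregi().

```php
<?php
// Hypothetical helper (not part of the original script): returns true when
// any of the given single-line fields contains a URL scheme.
function contains_url(array $fields): bool
{
    foreach ($fields as $value) {
        $value = (string) $value;
        // stripos() is case-insensitive, so "HTTP://" is caught as well.
        if (stripos($value, 'http://') !== false
            || stripos($value, 'https://') !== false) {
            return true;
        }
    }
    return false;
}

// Made-up sample values standing in for the submitted name, email and subject.
$name    = 'Jane Doe';
$email   = 'jane@example.com';
$subject = 'Cheap pills at http://spam.example';

// True here, because the subject carries a link.
$rejected = contains_url([$name, $email, $subject]);
```

In a real processing page you would stop the script when $rejected is true, the same way the single-field check does. The message body is deliberately left out of the list, since that is the one place a URL might be legitimate.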
Spam And Contact Form Abuse.
For a while I wasn’t getting much spam, but I was getting messages telling me that mail I had sent could not be delivered to the recipient. That worried me because I wasn’t sending any mail from that address at the time. So I emailed my web host and they responded with “don’t worry, it happens to everyone. Spammers are just forging your email address, they’re not actually exploiting your scripts.” Then I started getting emails in my inbox with my stamp on them, the one that I had snuck into the bottom of the email form as a mark of legitimacy. So I went to war with the people exploiting my PHP script. They were using a common technique, data injection. Basically, they type a particular set of code into a form field that makes the script send whatever content they enter to anyone they want, all at the expense of my sanity and bandwidth. It would appear to the many recipients that the email they were receiving was coming from my website, and that just couldn’t continue to happen. So I added another data check. This time it verified that they weren’t able to inject the BCC or CC fields into the form and still have the email processing piece send it. That code looked like this:
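// eregi() was removed in PHP 7; stripos() is an equivalent case-insensitive substring check.
if (stripos($name, "bcc") !== false) {
    die("Sorry, we cannot process this form as completed, please try again.");
}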
This example is similar to the previous one. It checks the variable called $name for the string “bcc”, which is part of the “bcc:” header needed to send the message to the masses. If it is found, the script stops running. Since most of the spam comes by way of a bot that relies only on the mechanism that sends the data (i.e. your submission button), it doesn’t know that the data has not been sent and goes on to exploit someone else’s contact form. Essentially it turns my email form into a passive-aggressive messenger. For once I like something that is passive-aggressive.
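A slightly broader version of that check is sketched below. This goes beyond what my form does, so treat it as an assumption-laden example: the function name looks_like_header_injection is made up, and it relies on the fact that an injected header has to start on a new line, so it rejects line breaks as well as the bcc: and cc: header names.

```php
<?php
// Hypothetical helper (not part of the original script): flags values that
// look like an attempt to inject extra email headers.
function looks_like_header_injection(string $value): bool
{
    // A raw line break, or its URL-encoded form, would let the sender
    // start a new header line inside a single-line field.
    if (preg_match('/[\r\n]|%0a|%0d/i', $value) === 1) {
        return true;
    }
    // Header names that have no business in a name, email or subject field.
    return preg_match('/\b(?:bcc|cc)\s*:/i', $value) === 1;
}

// Made-up sample input: a name field carrying an injected Bcc header.
$name = "Jane Doe\r\nBcc: victim@example.com";

// True here, because of both the line break and the "Bcc:" header name.
$rejected = looks_like_header_injection($name);
```

Run it over every field that ends up in the mail headers (name, email, subject) and stop the script on the first hit.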
One Last Technique
The other thing I do when I have to write my email address on a website or in any plain, web-readable format is to “mung” the data. Munging is basically a way of converting your plain-text ASCII characters into their decimal character codes, written as HTML entities (for example, @ becomes &#64;). The bots out on the web right now aren’t necessarily looking for that type of data, so they’ll often pass over your email address because it appears to be a garbled run of numbers. Your web browser, on the other hand, is fully capable of decoding that data into a useful format, so your users still see your email in plain text. If they click on your email address it will still open their default mail program and enter your real address into the send box as if it weren’t “munged.” I will say that I can’t guarantee that’s going to work for very long. Bots aren’t looking for it now because they don’t have to; the majority of the web isn’t using that encoding scheme. If more people adopt the technique in the future it won’t continue to work. The bots will evolve, if they haven’t already.
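Here is a minimal sketch of munging done in PHP rather than by hand. The function name mung and the address me@example.com are placeholders of my own choosing, and the loop assumes a plain ASCII address, one byte per character.

```php
<?php
// Hypothetical helper: converts each character of an ASCII string into a
// decimal HTML entity, e.g. "@" becomes "&#64;".
function mung(string $text): string
{
    $out = '';
    for ($i = 0; $i < strlen($text); $i++) {
        // ord() returns the decimal code of the byte at position $i.
        $out .= '&#' . ord($text[$i]) . ';';
    }
    return $out;
}

// Placeholder address, not a real mailbox.
$address = 'me@example.com';

// The browser decodes the entities, so the link still shows and works normally.
$link = '<a href="mailto:' . mung($address) . '">' . mung($address) . '</a>';
```

Echo $link wherever the address should appear in the page. The page source only contains the numeric codes, while visitors see and click the normal address.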
Conclusion
I’m sure in a year or two I’ll have to change the focus of how I handle the issue of spam. For now this is at least cutting down some of the junk I’m getting. I hope you found this information useful.