Spam and community driven sites

Date: Mon Jan 07 2008
Some people out there want to use community driven sites for purposes other than what the community would desire. To put it bluntly, spammers will use every trick, by hook or by crook, to get their message out there. They seem not to care that they're ruining life for the rest of us. Just as they're willing to bury the email networks in an onslaught of sludge masquerading as email, so are they willing to bury websites in sludge masquerading as website content or comments on website postings.

One strain of spammer has software which can automatically post to comment forms on websites. Unless your blog or forum software works to prevent nefarious comments, this software will let spammers bury your site in spammy comments. Once your site becomes known for hosting spammy comments, the search engines and users will shun you.

CAPTCHA and other methods to reducing SPAM covers techniques you can use on a Drupal driven site to block SPAM. Most of the methods I cover there are automated means of keeping SPAM software from doing its job.

Fighting Wiki SPAM contains some interesting thoughts about SPAM on community sites. It's focused on wikis, but some of the points generalize well. The author doesn't think too highly of automated methods to block SPAM from a community site, seemingly because the automated methods can also block legitimate content. He has a point: CAPTCHAs and other methods that ask a human to prove they're human can also turn people off.

And in some cases CAPTCHAs are just plain hard to use, or flaky. However, not every site that allows comments has a significant community driving it. I believe that if a significant community were driving a web site, then that community could do its own policing. On the other hand, the same software is often used by an individual or a small group of people, and the ability of robotic SPAM software to inject sludge into websites can easily overwhelm the capacity of a small group.

For example, I used to participate in trackbacks. They're a nice feature: a way for a blogger to notify another site that a blog posting was made about a posting on that site. However, I found my site being hammered with SPAM trackback notifications. By 'hammered' I mean that I would find thousands of these trackback postings at a time, and sending a trackback to my site cost the spammer almost nothing. I would dutifully delete page after page of this sludge, only to see more show up the next week. I installed Drupal modules which promised to detect likely spammers using word or phrase matching, blacklist matching, and so on, and the results were very unsatisfactory. The blocker which did word or phrase matching was easy to defeat, because the spammers would regularly switch to a new set of words. In the end I turned off the trackback feature, and came to the conclusion that the spammers have high speed computers and high speed Internet connections available, and their software can rapidly overwhelm my ability to manually weed out SPAM. As a result I like certain automated methods to block potential SPAM, though of course not all of them.
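
To illustrate why the word or phrase matching approach was so unsatisfactory, here is a minimal sketch in Python. It is not the code of any Drupal module I used; the phrase list and the function names are hypothetical, made up purely for illustration. It shows how a simple phrase blocklist catches one comment and then misses the same message once the spammer changes a couple of characters.

```python
import re

# Hypothetical blocklist; these phrases are illustrative only.
BLOCKED_PHRASES = {"cheap pills", "casino bonus"}


def normalize(text):
    """Lowercase the text and collapse every run of non-letters to one space."""
    return re.sub(r"[^a-z]+", " ", text.lower()).strip()


def looks_like_spam(comment, blocked=BLOCKED_PHRASES):
    """Return True if any blocked phrase appears in the normalized comment."""
    cleaned = normalize(comment)
    return any(phrase in cleaned for phrase in blocked)


if __name__ == "__main__":
    # The original wording is caught by the blocklist.
    print(looks_like_spam("Get CHEAP pills today!"))
    # Swapping in digits for letters is enough to slip past it.
    print(looks_like_spam("Get ch3ap p1lls today!"))
```

Every time the spammer changes the wording, someone has to notice and add a new phrase to the list, which is the same manual treadmill as deleting the sludge by hand.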