Comment System Dilemma

June 15, 2004 | 31 Comments

Since I’m on a “real world” kick I thought I’d talk for a quick minute or two about the root of most of my Web development and design angst at the moment — comments.

Not comments in the literal sense, although those can be frustrating as well. No, I’m talking about my (Movable Type’s) comment system: the actual mechanism through which you all post comments to this site.

It’s all jacked up.

The way I see it there are two large, very common problems that most comment systems have, and one medium sized problem that seems to be less common but causes me all kinds of grief. First up, spam.

Spam

Yep, that bastard of a problem: spam. It’s been debated to death, many great measures have been taken to eliminate it and, to be honest, it’s a much better situation than before.

But spam is still here and it’s still a pain to deal with. I’ve noticed recently that spammers are actually taking the time to enter (almost) relevant comments. I still delete those, but how are we supposed to combat that?

I still spend an hour or so a week working with my blacklist and deleting spam. I’ve chalked this up to a fact of life if I want to keep comments on this site. My guess is that I’ll never fully get around this.

Oh, and URL obfuscation is not a valid solution in my mind. I don’t think it dissuades spammers much and frankly, no matter how I’ve seen it done, I find it very annoying and a pretty big usability problem. I like to know who is behind the comment I’m reading and I’m so used to looking at the status bar I want to keep looking there.

It seems that no matter how clever the solution, spam still gets through somehow. Spammers are simply smarter than I am. It just kills me that people are taking advantage of my hard work.

Validation

This isn’t a problem for everyone with comments allowed on their site, but it’s a problem for quite a few.

How do you easily keep your site valid while allowing comments? Well, it can be tricky, but it can be done. In the last few days, with the help of Jacques (who you may remember had a little argument with me over at Dave’s site — we kissed, made up and now he’s helping me out) I’ve been able to eliminate quite a few of my problems, but I’ve still got a way to go.

It’s a tricky problem, and not as easy as some might make it out to be. It takes a bit of effort and some wrangling, but I do think that this is a problem that can be eliminated if the appropriate steps are taken.

Having said that, it would be nice to see the people who write the code for comment systems address this. Expecting me to tackle this stuff is a stretch; how is the average content owner going to be expected to take care of it?

They’re not gonna do it.

Slow CGI scripts

This is a problem that many don’t seem to have, but I’ve had it forever. I’ve spent hours trying to solve this, my host has spent hours trying to solve this and we’ve gotten next to nowhere.

The outcome of this particular problem, aside from the lack of feedback from the comment form, which is a huge usability error in my opinion, is that I get multiple comments from folks that I then have to go in and delete.

I try my best to prepare my users for the lag, and many people who comment on a regular basis are wise to the problem, but it seems as if there is no permanent solution here aside from maybe switching hosting providers.

I’m not interested in exploring that at the moment.

A Systematic Flaw?

I’ve spent I don’t know how many hours setting up, trying to bulletproof and tweaking my comment system. On average I probably spend 2 or so hours every week editing and deleting comments for a variety of reasons.

In my estimation this isn’t anywhere near ideal, and it seemingly won’t be for the time being.

I break it down to goals. My comment system fails my own maintenance goals and it fails many of my user goals. Comments are central to many of my other goals, but all in all I think they’re the biggest problem with my site.

Here is a classic example of opposing goals. You have my need to provide comments going up against their shoddy implementation and my own goals of easy maintenance. It’s a big problem.

Big enough to disallow comments? No way. Big enough to try a new system? Maybe, if I can find one that eliminates my problems, is easy enough to implement and is able to satisfy the rest of my goals as well as my current system does.

Right now I don’t see anything like that. Here’s to a future of user-centered and (nearly) bullet-proof comments.

Filed under: Web Development

Comments

1. JC said:

I like the way the new version of WordPress lets you handle comments for spam. One of the options is essentially “let all comments with fewer than $number links through; notify and require approval for any with $number and higher prior to posting” – since spammers almost always add a large number of links, that catches the majority of them. And then you can also set a word list… so if a spammer advertises poker, or if Scrivs tells you in a comment he played poker yesterday, you’ll have to choose to approve or reject the comment. Now if it only had a way to determine if someone’s using l33t5p34|

Posted on June 15, 2004 01:08 PM | #
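The rule JC describes (hold anything with too many links, or anything containing a watched word) can be sketched roughly as follows. This is not WordPress's actual code; the names `MAX_LINKS`, `WATCH_WORDS` and `needs_approval` are made up for illustration, and the threshold of 2 is an arbitrary assumption.

```python
import re

# Hypothetical settings, named here for illustration only.
MAX_LINKS = 2            # comments with this many links or more are held
WATCH_WORDS = {"poker"}  # any of these words also triggers a hold

LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)
WORD_PATTERN = re.compile(r"[a-z0-9']+")


def needs_approval(comment_text: str) -> bool:
    """Return True if the comment should wait for the site owner's approval."""
    # Rule 1: too many links.
    link_count = len(LINK_PATTERN.findall(comment_text))
    if link_count >= MAX_LINKS:
        return True
    # Rule 2: any watched word appears as a whole word.
    words = set(WORD_PATTERN.findall(comment_text.lower()))
    return bool(words & WATCH_WORDS)
```

As JC points out, the word list catches honest commenters too, so a hit here only means "ask a human", not "delete".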

2. Tim Parkin said:

I often wonder where the line between spam and real content sits. I’d be happy if someone took the time to leave a contribution (i.e. an intelligent addition to the stream of conversation) even if I *knew* they were only doing it to get PageRank. Very recently I’ve had to help clear up lots of wiki sandboxes of spam by various people (among whom I was shocked to see a high-profile Yorkshire web company; they’d managed to spam over 60 wikis in the space of a day with links to over 20 of their clients). The nice thing is that a recent search showed that over 90% of the spam had already been removed less than 6 days later. As for questionable contributions, I’ll admit to having written content for online magazines with the primary intention of getting a link to my website. The question probably needs asking: “should we just delete crap content regardless of whether it’s spam or not?” or alternatively “should we delete good content just because one of the intentions of the poster was to increase inbound links?”

Posted on June 15, 2004 01:19 PM | #

3. SM said:

To me, the overwhelming problem with commenting systems is the slow-as-mud scripts. In fact, it’s the reason I so rarely post comments on anyone’s site: you never know how long it’s going to put a stranglehold on your browser. ‘Course there’s also that strange MT behavior that usually fails to confirm that your comment posted. In the end, it’s usually more of a hassle than it’s worth.

Posted on June 15, 2004 01:22 PM | #

4. Laurens said:

As for spam comments (if you get any automated ones at least), what about adding a select box with the label ‘What does the image to your left contain’ and options like ‘a car’, ‘a doll’, ‘an airplane’ (ok this sounds a bit childish ;p). Obviously the correct answer isn’t selected by default. If the right one isn’t selected, then the comment will not get posted.

One could even assume no-one to be stupid enough to choose one of *those* options wrong (*grin*) and impose a 5-minute IP ban or something to hold off any further tries. Though I think the latter should only be done if it appears to be really necessary.

Just an idea.

~Grauw

Posted on June 15, 2004 01:24 PM | #

5. Keith said:

JC – WP is high on my list to look into. As soon as I’ve got time. It does sound very, very good.

Tim – I’ve talked about this before. It’s a very tricky one that I end up chalking up to a judgement call. I don’t mind if someone wants to link to their site, as long as they add to the discussion and I feel they aren’t trying to take advantage of me.

Usually it’s pretty easy to spot the spammer, but sometimes when I’m not sure, I just let the link stand. A spammer will usually get greedy and give themselves away – then I add to my blacklist and delete the whole lot!

Posted on June 15, 2004 01:26 PM | #

6. DarkBlue said:

You mean you’ve not looked at my “Ultimate Comments Handler” Keith? Please check it out, this system needs testing before I publish code…

Posted on June 15, 2004 01:37 PM | #

7. Keith said:

DarkBlue – You know I did look at it, but assumed I’d not be able to get it to work with MT. Don’t you roll your own system?

Keep in mind, with something custom like that, as well as with many plug-ins, I need something pretty easy to get up and running. Between my ineptitude when it comes to PHP and Perl, etc. and my host’s limitations, this stuff can be really tricky.

Posted on June 15, 2004 01:49 PM | #

8. JB said:

CAPTCHA systems that try to discern users from computers tend to discriminate against vision-impaired users. Otherwise I like them.

How about mandatory email verification? A comment is submitted, but not public until an email gets replied to.

You can get creative with that: the usability problem is on the email client-end then. You can require the user add a word to the email subject in order to pass verification - not merely reply.

Mwahaha, or better, send the verification email with a spam subject, and see how the spammers like their own medicine. I’ll bet most users here would add a whitelist entry for the site, and get a chuckle out of the joke. :P

Of course, I speak in “possibilities”. MT’s system might not do this yet, but that’s just easy code to be written.

Posted on June 15, 2004 02:14 PM | #
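JB's hold-until-verified idea might look something like the sketch below. It is not MT code. The in-memory `pending` and `published` stores and the function names are assumptions for illustration, and actually sending the email (with or without a spammy subject line) is left out.

```python
import secrets

# In-memory stores standing in for a real database (an assumption for this sketch).
pending = {}    # token -> comment awaiting verification
published = []  # comments visible on the site


def submit_comment(author_email: str, body: str) -> str:
    """Hold the comment and return the token that would be emailed to the author."""
    token = secrets.token_urlsafe(16)
    pending[token] = {"email": author_email, "body": body}
    return token


def verify_comment(token: str) -> bool:
    """Publish the held comment if the token matches; each token works only once."""
    comment = pending.pop(token, None)
    if comment is None:
        return False
    published.append(comment)
    return True
```

The token is random and single-use, so guessing one or replaying an old verification link does nothing.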

9. DarkBlue said:

Keith, Once I’m happy that it’s all working properly, I will publish the code. I do “roll my own” - but I don’t think that is a barrier. Once I publish the code, integrating it into MT shouldn’t be a problem for any half-decent Perl coder (perhaps Mr. Gruber could produce the MT plug-in?). Furthermore, the mechanism becomes visible, so PHP coders should be able to create derivatives too!

I do appreciate just how difficult these things are for non-programmers though, especially considering the multitude of hosting environments. That’s why software that includes modular, “plug-in” APIs (like MT, WordPress, etc.) is such good value for its users.

If I weren’t so damned busy, I’d get my hands on MT and have a go at the plug-in myself.

Posted on June 15, 2004 02:26 PM | #

10. DarkBlue said:

CAPTCHA systems that try to discern users from computers tend to discriminate against vision-impaired users. Otherwise I like them.

That’s true JB. I realised this and implemented an alternative mode for vision-impaired users…

How about mandatory email verification?

This is how my “alternative mode” works. Users can register, which involves an email verification, then log-in and never see the Captcha. This is all explained in my article, “Defending Against Comment Spam”.

Posted on June 15, 2004 02:34 PM | #

11. Ethan said:

Actually, WordPress’ text filters were one of the big selling points for me when I went CMS-shoppin’. The WP team seems to have done a fine job of bulletproofing comments against invalid markup; granted, I haven’t been over-generous with turning comments on in my (infrequent) posts, but I haven’t seen an invalid comment, or an XML-parsing error, yet.

Good luck, sir. ;)

Posted on June 15, 2004 02:38 PM | #

12. Anil said:

There are MT plugins for limiting comments by number of links, or for making comments valid (email me if you want pointers), but it seems like the larger issue is identity for commenters. If you step back and take a look at the arms race in email, or the nascent problems with wiki spam, you can see how the pattern progresses, and the only tactics that succeed in the long term are authentication and identity, either technically imposed or demanded by a community. That’s the larger problem that needs to be solved.

Posted on June 15, 2004 03:32 PM | #

13. DarkBlue said:

the only tactics that succeed in the long term are authentication and identity

I am 100% in agreement with you, Anil, and have built my systems with enforced registration as a switchable option.

However, in my (albeit limited) experience, people don’t register and comments die off when registration is enforced.

Why? Well I’d guess it’s because “comments”, by their nature, are spontaneous. As soon as we start to impose barriers, this spontaneity is inhibited and your audience doesn’t post.

I briefly used enforced registration on my website and, in one year, only 3 registrations were made (admittedly, I have a very small audience). As soon as I opened the comments handler, the discussions began to flow.

It’s strange this - since I have operated web bulletin-boards in the past and never had any problem with imposed registration. Can anyone explain why users would register for a forum, but not register for a comments handler?

Posted on June 15, 2004 04:04 PM | #

14. Adrian Rinehart-Balfe said:

When I used MT I had to go and deal with comment spam on a daily basis. It was always hidden away in the older posts and ended up being one of the things that sent me looking for a replacement CMS. I eventually chose Textpattern even though the whole thing works very differently from what I was used to. Forcing people to preview their comment before submitting seems (at least for now) to stop the spam. Having the comments only open for a set time without having to resort to a plugin seems to help also.

My guestbook used to suffer from the same thing until I changed the names of all the files and folders. How long that will work for is another question.

I’m not suggesting you make a big leap like I did, but you should keep it in mind.

99% of my comments and guestbook entries used to be spam, now it is none and, though I love it this way, I do feel kinda lonely!

Posted on June 15, 2004 05:46 PM | #

15. Brandon Walsh said:

I think the best way to prevent spam would be a single commenter registry. Commenters could register once and have the registration be valid on several (all?) blogs. Perhaps this would be doable with a plugin?

Posted on June 15, 2004 07:25 PM | #

16. DarkBlue said:

I think the best way to prevent spam would be a single commenter registry.

That won’t prevent the spammer - it will assist him. Instead of suffering the inconvenience of having to register at all the sites he wants to spam, he registers once with the central resource then is free to post whatever he wants, wherever he chooses!

Posted on June 15, 2004 07:42 PM | #

17. justin said:

Allowing anonymous comments while still maintaining the integrity of the comments posted is most certainly an oxymoron. With search engines linking all sorts of random folks to our websites, plus the spammers that get a hook on your URL and it just blows up (with spam) from there, you are practically forced to come up with some sort of user-interactive process to dissuade the pseudo-posters and keep the good comments.

Of all the various methods of forcing interactive behavior I’ve seen, the best (imo) I’ve seen involves sending a verification email to the email address submitted in the comment. Clicking a link in the email instantly adds the comment to public view. I’ve only seen it used on this site:

http://www.vintagebus.com/

But as of late, it looks like he changed his system. BTW, the site above is one of those “IE only” sites.

Anyway, your choices are pretty minimal, we’re talking:

1) Require visitors to register

2) Send verification email on per-post basis (or once per email, once authenticated… but this could allow abuse as well).

3) Some type of “what does the image say…” technique. This one’s a little passé, and sometimes the image is hard to read.

4) No authentication, filters based on IP or something… unreliable and requires constant maintenance.

I guess it’s up to the individual: the bigger your site is, the more advanced your technique will need to be to deal with unwanted comments.

I’ve noticed the slow response time with comment submissions too, and all I can say is that’s CGI for ya. It’s very outdated and not the most efficient server-side programming paradigm on the block by any means.

I know ASP.NET is proprietary, but oh it is so, so good. Once the .NET page is compiled, the response time is extremely fast, even when we’re talking about multiple database operations and on-the-fly image generation. I know there are a lot of hardcore PHPers out there, and I feel like I’m missing a skillset by not knowing PHP as well as my ASP.NET (I’m purposely not mentioning classic ASP because I consider it lesser than PHP), but I’ve resisted because I don’t want to take a step back and learn a programming language which I feel is not as full-featured as current technologies.

Yeah I know I’m very biased, but it’s just my two cents on the matter. I’ve been using ASP/ASP.NET since ‘99 and it’s pretty hard for me to see the advantages of using anything else. I got into Perl/CGI for a little while, but it was clunky and I had to spend more time on compiling and checking permissions than I was used to. PHP helps fill that void in the open-source realm, but it lacks the well-structured, object-oriented approach (which *can* give way to faster response times) that I yearn for.

There needs to be an open-source version of an ASP.NET equivalent to the PHP/CGI based MT. Who is working on that is what I want to know.

Posted on June 16, 2004 12:34 AM | #

18. Benedikt Müller said:

There needs to be an open-source version of an ASP.NET equivalent to the PHP/CGI based MT. Who is working on that is what I want to know.

You might take a look at this: www.go-mono.com

Considering the validation issue: I think writing XHTML-conformant comments is something you can’t expect from your everyday user/visitor. For someone who is not into the HTML thing it’s just too complicated, which will eventually lead to fewer comments/feedback on your site. On my site I’ve tried using some kind of proprietary markup (e.g. [url] and [/url] for links, …) but even that seems to be too much for most of my readers. So I think one good way to overcome this would be to keep it simple: just allow blockquote and convert strings starting with http:// to valid XHTML links before they’re saved in your DB.

Posted on June 16, 2004 03:40 AM | #
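Benedikt's "keep it simple" approach could be sketched along these lines: escape everything a commenter types, switch the blockquote tag back on, and wrap bare URLs in links. The function name and URL pattern are assumptions for illustration (the pattern also accepts https://), and this sketch does not check that blockquote tags are balanced, which a real validity fix would still need to do.

```python
import html
import re

# One capturing group, so re.split() returns text and URLs alternately.
URL_PATTERN = re.compile(r'(https?://[^\s<>"]+)')


def render_comment(raw: str) -> str:
    """Escape all markup except <blockquote>, and turn bare URLs into links."""
    pieces = []
    for index, part in enumerate(URL_PATTERN.split(raw)):
        if index % 2 == 1:
            # Odd positions are the URLs captured by the pattern.
            safe_url = html.escape(part, quote=True)
            pieces.append(f'<a href="{safe_url}">{safe_url}</a>')
        else:
            safe_text = html.escape(part, quote=True)
            # Re-enable the one tag commenters are allowed to use.
            safe_text = safe_text.replace("&lt;blockquote&gt;", "<blockquote>")
            safe_text = safe_text.replace("&lt;/blockquote&gt;", "</blockquote>")
            pieces.append(safe_text)
    return "".join(pieces)
```

Because ampersands inside URLs come out as `&amp;`, the generated links stay valid even when a commenter pastes a long query string.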

19. Benedikt Müller said:

Sorry justin - just noticed you were looking for an open-source version of _Movable Type_ using ASP.NET and not just open-source ASP.NET.

Posted on June 16, 2004 03:45 AM | #

20. mashby said:

RE: SPAM

I’m surprised this hasn’t been mentioned yet, but I have found that MT 3 handles comments MUCH better than 2.x and MT-Blacklist did. I upgraded the day it was released and I haven’t regretted it for a minute.

With MT3 all of your comments can be viewed in one interface and for the most part you can quickly tell which comments are SPAM and which ones aren’t. Granted, my weblog doesn’t get the traffic, or comments that yours does, but having all your comments in one place makes it easy to manage.

The way I have it set up is if you have a TypeKey, your comments appear immediately after posting. If you’re unauthenticated, just like your comments are here, then I have them held in suspense until I can approve them. I could turn the suspense off, but I haven’t felt compelled to do so at the present time.

If suspending comments until approval seems too much, you can always turn it off. Since the comments UI is so easy to use, you can just as easily go in after the fact and delete comment SPAM as needed.

The bottom line in my book is that SixApart did a pretty good job of trying to tackle the comment SPAM issue and I’m very happy with the results. With the new pricing structure they released today, I think it’s worth taking a look.

Posted on June 16, 2004 06:01 AM | #

21. justin said:

Thanks Benedikt, I’ve got plenty of code working to create a MT-esque system, but it’s really a matter of putting it all together in a good working, portable version.

I’ve always used the approach of converting anything that starts with HTTP:// or WWW into a link automatically. This presents a content overflow issue for long URI’s, but that’s the downside. While allowing the <a> tag provides more flexibility, it also poses a risk that (imo) is too great on any site which gets a lot of traffic.

I need to look into MT once and for all.

Posted on June 16, 2004 09:31 AM | #

22. Scott Johnson said:

Have you considered running your “slow cgi scripts” under mod_perl? That might speed things up a bit.

Posted on June 16, 2004 12:50 PM | #

23. Dunstan said:

Having a forced preview on a comment system, and having moderation set on posts older than 7 days old results in virtually no spam at all, at least for me.

I just checked the last 317 emails I’ve received alerting me to new comments on my blog, and only one of them was spam.

And because it’s on an old post, it never appeared on the site.

Is there now MT-plugin you can get, Keith, to emulate this kind of behaviour?

Posted on June 16, 2004 11:30 PM | #
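The two rules Dunstan combines (forced preview, plus moderation on posts more than 7 days old) boil down to a small decision like the one sketched here. The `previewed` flag is hypothetical: it stands for whatever check the site uses to confirm the submission really came through the preview page.

```python
from datetime import datetime, timedelta

MODERATION_AGE = timedelta(days=7)  # from the comment above; tune to taste


def comment_status(post_published: datetime, now: datetime, previewed: bool) -> str:
    """Decide what happens to a submission: 'rejected', 'held' or 'published'."""
    if not previewed:
        # Forced preview: a direct POST that skipped the preview step is refused.
        return "rejected"
    if now - post_published > MODERATION_AGE:
        # Comments on old posts wait for approval and never appear unreviewed.
        return "held"
    return "published"
```

For example, a previewed comment on a two-day-old post is published straight away, while the same comment on a two-week-old post is held.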

24. Chris Hester said:

I’ve always used the approach of converting anything that starts with HTTP:// or WWW into a link automatically. This presents a content overflow issue for long URI’s, but that’s the downside.

If you are processing the message then at some point your URI will be in a string, before having the ‘a’ tags placed around it, and the string pasted back into the message. If that’s the case then it will be dead easy to check the length of the string and if it’s over a certain length, trim it and add three dots to the end. (The physical link will stay the same of course as length isn’t a problem there.)

Posted on June 17, 2004 03:51 AM | #
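Chris's fix for overflowing URIs is small enough to sketch in a few lines: shorten the visible text, leave the href alone. The 40-character cut-off and the name `link_for` are arbitrary assumptions.

```python
import html

MAX_DISPLAY_LENGTH = 40  # hypothetical cut-off for the visible link text


def link_for(url: str) -> str:
    """Build an anchor whose visible text is shortened but whose href is intact."""
    label = url
    if len(label) > MAX_DISPLAY_LENGTH:
        # Trim before escaping so an entity such as &amp; is never cut in half.
        label = label[:MAX_DISPLAY_LENGTH] + "..."
    return f'<a href="{html.escape(url, quote=True)}">{html.escape(label)}</a>'
```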

25. Jacques Distler said:

Is there now MT-plugin…?

You can always remove the “POST” button from the comment form, but that has no effect on spambots.

There is, however, a robust plugin/hack to force previews in MT.

Posted on June 17, 2004 06:38 AM | #

26. KO said:

Ref #12: WP does make comment management easier than MT. What I don’t like is that all the necessary spam fighting tools are plugins for MT, not built in. For example, limiting the number of links is built in to WordPress - I don’t want to add plugins for such basic functionality.

Posted on June 17, 2004 06:53 AM | #

27. Keith said:

Thanks for all the advice and comments everyone. I’ve got some ideas to help take care of some of these issues, at least to a certain extent – so thank you again!

Posted on June 17, 2004 09:38 AM | #

28. mashby said:

KO posted: I don’t want to add plugins for such basic functionality.

Try MT 3.0 Dev. It’s built in and no plugins are necessary.

Posted on June 17, 2004 10:01 AM | #

29. Matthijs Aandewiel said:

JB - on the email verification subject.. I believe that would be a bad way of doing things. Personally, I would never take the time to reply to an email simply to leave my comment.

Oh, and JB.. thanks for the visually impaired part of the comment. If I were to build a system like that, I probably would have overlooked the visually impaired part.. thanks.

Posted on June 21, 2004 06:40 AM | #

30. zoza said:

maybe tag rel=noref ??

Posted on June 4, 2005 12:23 PM | #

31. free dating services said:

SmartyPants is the good program for stopping comment spamming. Must be used by bloggers to safeguard their sites.

Posted on January 20, 2006 10:19 AM | #

Comments are now closed
