How Effective Is the Honeypot Technique Against Spam

How effective is the honeypot technique against spam?

It works relatively well. However, if a bot author targets your page specifically, they will notice the honeypot (or may even have a routine set up to check for one) and will most likely modify their bot accordingly.

My preference is to use reCAPTCHA, but a honeypot will stop some bots.

Better Honeypot Implementation (Form Anti-Spam)

Concept

By adding an invisible field to your forms that only spambots can see, you can trick them into revealing that they are spambots and not actual end users.

HTML

<input type="checkbox" name="contact_me_by_fax_only" value="1" style="display:none !important" tabindex="-1" autocomplete="off">

Here we have a simple checkbox that:

  • Is hidden with CSS.
  • Has an obscure but obviously fake name.
  • Has a default value equivalent to 0 (an unchecked checkbox is not submitted at all).
  • Can't be filled by auto-complete.
  • Can't be navigated to via the Tab key. (See tabindex)

Server-Side

On the server side we want to check whether the field was submitted with a value other than 0, and if so, handle it appropriately. This includes logging the attempt and all the submitted fields.

In PHP it might look something like this:

$honeypot = FALSE;
// !empty() is already false for a missing field, '' and '0', so no extra cast is needed.
if (!empty($_REQUEST['contact_me_by_fax_only'])) {
    $honeypot = TRUE;
    log_spambot($_REQUEST); // your own logging function, not a PHP built-in
    // treat as spambot
} else {
    // process as normal
}

Fallback

This is where the log comes in. If one of your users somehow ends up being marked as spam, your log will help you recover any lost information. It will also allow you to study any bots running on your site, should they be modified in the future to circumvent your honeypot.
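A minimal sketch of what the log_spambot() helper called in the PHP example might look like. The function is not built in; the default log path, the one-JSON-object-per-line format, and the choice of recorded fields are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: log_spambot() is not a PHP built-in.
// Appends one JSON line per flagged submission so it can be reviewed or recovered later.
function log_spambot(array $fields, string $logFile = '/tmp/spambot.log'): bool
{
    $entry = [
        'time'   => date('c'),                              // ISO 8601 timestamp
        'ip'     => $_SERVER['REMOTE_ADDR'] ?? 'unknown',   // not set when run from the CLI
        'agent'  => $_SERVER['HTTP_USER_AGENT'] ?? 'unknown',
        'fields' => $fields,                                // every submitted field, for recovery
    ];

    // Spam often contains invalid UTF-8; substitute bad bytes instead of failing to encode.
    $line = json_encode($entry, JSON_INVALID_UTF8_SUBSTITUTE) . PHP_EOL;

    // FILE_APPEND adds to the end of the file; LOCK_EX avoids interleaved writes.
    return file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX) !== false;
}
```

If the form contains passwords or other sensitive fields, you would probably want to strip them from $fields before logging, and keep the log file outside the web root.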

Reporting

Many services, such as CloudFlare, allow you to report known spambot IPs via an API or by uploading a list. Please help make the internet a safer place by reporting all the spambots and spam IPs you find.

Advanced

If you really need to crack down on a more advanced spambot, there are some additional things you can do:

  • Hide the honeypot field purely with JS instead of plain CSS.
  • Use realistic form input names that you don't actually use (such as "phone" or "website").
  • Include form validation in the honeypot algorithm (most end users will only get 1 or 2 fields wrong; spambots will typically get most of the fields wrong).
  • Use a service like CloudFlare that automatically blocks known spam IPs.
  • Have form timeouts, and prevent instant posting (forms submitted less than 3 seconds after the page loads are typically spam).
  • Prevent any IP from posting more than once a second.
  • For more ideas look here: How to create a "Nuclear" honeypot to catch form spammers
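The "include form validation in the honeypot algorithm" idea from the list above could be sketched roughly as follows. The function name looks_like_bot(), the 0.5 threshold, and the three example validators are assumptions chosen for illustration, not tuned values.

```php
<?php
// Hypothetical heuristic: flag a submission when most of its fields fail validation.
// $validators maps field name => callable returning true when the value is acceptable.
function looks_like_bot(array $input, array $validators, float $maxErrorRatio = 0.5): bool
{
    if (count($validators) === 0) {
        return false; // nothing to judge by
    }

    $errors = 0;
    foreach ($validators as $field => $isValid) {
        // Treat missing or non-string values (e.g. arrays) as empty strings.
        $value = isset($input[$field]) && is_string($input[$field]) ? $input[$field] : '';
        if (!$isValid($value)) {
            $errors++;
        }
    }

    // A human typically gets one or two fields wrong; a bot gets most of them wrong.
    return ($errors / count($validators)) > $maxErrorRatio;
}

// Example validators for an assumed three-field form (arrow functions need PHP 7.4+).
$validators = [
    'name'  => fn(string $v): bool => $v !== '' && strlen($v) <= 100,
    'email' => fn(string $v): bool => filter_var($v, FILTER_VALIDATE_EMAIL) !== false,
    'age'   => fn(string $v): bool => ctype_digit($v) && (int) $v < 130,
];
```

A submission that fails this check is still worth logging rather than silently discarding, since a confused human can also get several fields wrong.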

How to create a Nuclear honeypot to catch form spammers

Build a really smart honeypot

That may seem obvious, but here are a few tricks (details later):

  1. Think like a spam bot.
  2. Assume that they are able to know what is on screen or behind other elements.
  3. Have multiple traps.
    • Time trap
    • Honeypot

1. Think like a spam bot:

Start by going through your page like a spam bot. You can even write your own, which can waste time but is quite fun :).
Most spam bots will crawl through the markup looking for a <form> element. Then they will look at your inputs and fill them in appropriately, which is the catch: how do they know what to fill in? They will probably look at the id, class, placeholder, and label, which brings us to our first method.

Method #1:

Mislabel the inputs in your form code. Basically, your username input should have the id Form_Email. Boom! The spam bot fills out the form incorrectly. Also hide and mislabel your inputs' labels; use divs instead.*
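On the server, the misleading names have to be mapped back to their real meaning. The sketch below assumes the inputs are posted under the same misleading names as their ids, that a second field called Form_Name really holds the email, and that usernames may not look like email addresses; all of these names and rules are hypothetical.

```php
<?php
// Hypothetical mapping from the misleading names in the markup to their real meaning.
const FIELD_MAP = [
    'Form_Email' => 'username', // looks like an email field, actually holds the username
    'Form_Name'  => 'email',    // looks like a name field, actually holds the email
];

// Translate the posted (misleading) names into the real field names.
function decode_fields(array $post): array
{
    $decoded = [];
    foreach (FIELD_MAP as $postedName => $realName) {
        $value = $post[$postedName] ?? '';
        $decoded[$realName] = is_string($value) ? trim($value) : '';
    }
    return $decoded;
}

// True when the values match the labels in the markup rather than the real meaning:
// an email address in the username slot and a non-email in the email slot.
function filled_by_label(array $decoded): bool
{
    $usernameIsEmail = filter_var($decoded['username'], FILTER_VALIDATE_EMAIL) !== false;
    $emailIsInvalid  = filter_var($decoded['email'], FILTER_VALIDATE_EMAIL) === false;
    return $usernameIsEmail && $emailIsInvalid;
}
```

Requiring both conditions keeps a human who merely mistypes their email from being flagged; only a submission that follows the fake labels in both fields trips the check.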

Method #2 starts here

You've probably noticed that simple bots just ignore hidden stuff, whether it is hidden by its location, by what is in front of it, or even by the good old display: none;, visibility: hidden;, opacity: 0;, or type='hidden'. This gives us a powerful weapon. I discovered this by accident while testing a time trap. I used a basic form filler to fill in the form. On my site (I'm not talking about GiantCowFilms.com), the register form is in a dialog that opens when a user clicks a register button. By default it is hidden. This gave me an idea for:

Method #2

Default: form is hidden. Basically, your form is hidden on page load but is uncovered by some mouse-based action (I don't think bots have mice). If you want your form to be visible on page load, add an identical decoy form above the real one in the markup. If the bot fills in and submits it, block its IP for a few minutes.** For real users, simply switch the two forms around when the mouse hovers over the decoy form.
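The server-side half of the decoy form could look something like the sketch below. It assumes each form posts a hidden form_variant field ("decoy" or "real"), uses a 5-minute block, and keeps the block list in a plain array passed by reference; a real site would need persistent storage such as a database or cache. The client-side swap on hover is not shown.

```php
<?php
// Assumed block duration: "a few minutes".
const BLOCK_SECONDS = 300;

// $blocked maps IP => Unix time at which the block expires.
// It stands in for persistent storage and is modified in place.
function handle_submission(array $post, string $ip, array &$blocked, int $now): string
{
    // Still inside an active block window.
    if (isset($blocked[$ip]) && $blocked[$ip] > $now) {
        return 'blocked';
    }
    unset($blocked[$ip]); // block expired, or the IP was never blocked

    // Hypothetical marker: the decoy form posts form_variant=decoy.
    if (($post['form_variant'] ?? '') === 'decoy') {
        $blocked[$ip] = $now + BLOCK_SECONDS;
        return 'blocked';
    }

    return 'accepted';
}
```

Passing the current time in as $now (normally time()) keeps the function easy to test. Because people often share IPs, a short block that expires on its own is safer than a permanent ban.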

2. Assume that they know what your page looks like

Assuming that hiding a honeypot with CSS is foolproof is a grave mistake. There are a lot of super smart screen readers, like JAWS, that could be repurposed for spamming. That is why you need multiple lines of defense.

3. Have multiple traps

  • Time Traps:
    Going back to thinking like a bot, would you want to wait on a site instead of attacking others?
    Method #3: Create a time trap.
    The best way is to print a time in a hidden input when the page loads. When the form is submitted, it tells you how long the user took. Fill in the form as fast as you can; that should be the minimum amount of time allowed for filling in your form. Note: encrypt your timestamp so bots cannot change it.

    If you want to get really fancy, measure the WPM of the bot's typing. This is done on Stack Exchange (try copying and pasting, then submitting a question/answer). Also, if the rate of typing is very consistent, that is a red flag.


  • Honeypots (Method #4):
    Use all of the above at once for best results. Make sure to trick dumb bots as well as smart bots (don't assume the bot is always trying hard).
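The time trap from Method #3 could be sketched as follows. Note that this version signs the timestamp with an HMAC instead of encrypting it: the bot can read the time but cannot change it without invalidating the signature. The secret, the 3-second minimum, and all function names are assumptions for illustration.

```php
<?php
// Assumption: SECRET is a long random string kept only on the server.
const SECRET = 'replace-with-a-long-random-secret';
const MIN_SECONDS = 3; // assumed minimum time a human needs to fill in the form

// Value to print into a hidden input when the page is rendered.
function make_time_token(int $now): string
{
    return $now . ':' . hash_hmac('sha256', (string) $now, SECRET);
}

// Returns the seconds elapsed since the token was issued, or null if it was tampered with.
function seconds_taken(string $token, int $now): ?int
{
    $parts = explode(':', $token, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[0])) {
        return null; // malformed token
    }
    [$issued, $mac] = $parts;

    // hash_equals() compares in constant time to avoid leaking the signature.
    if (!hash_equals(hash_hmac('sha256', $issued, SECRET), $mac)) {
        return null; // timestamp or signature was modified
    }
    return $now - (int) $issued;
}

// True when the form should be treated as submitted too quickly (or with a forged token).
function too_fast(string $token, int $now, int $minSeconds = MIN_SECONDS): bool
{
    $elapsed = seconds_taken($token, $now);
    return $elapsed === null || $elapsed < $minSeconds;
}
```

On page load you would output make_time_token(time()) in a hidden field, and on submit pass the posted value and time() to too_fast(). A bot could still replay an old valid token, so you may also want to reject tokens older than some maximum age.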

Now, in order to spam us, bots will have to have cursors, render the page, wait, and type at a variable, realistic speed. If they make a bot like that, then I guess it'll be CAPTCHA time :(.

*People using screen readers will trigger or be confused by these defenses, and depending on your country you could get into trouble for discriminating against blind and partially sighted people. Therefore, when a user triggers the bot test, take them to a non-loaded form with a disability-friendly CAPTCHA like reCAPTCHA.

**People often share IPs, so you can chase away valid users.

P.S. Keep using simple honeypots like the one you already have. Some bots are just too dumb to get tricked by what we have here.

Should a honeypot captcha be more complicated than 'display: none;'?

Unless spam is a serious problem on your blog, I'd just go for doing display: none.

You could also try the classic "What is 2 + 2" / "What color is the sky?" style questions.


