Greetings, I have a simple PHP-based counter which counts a unique visitor once per day, per page. The MySQL database records each visitor's IP address so that a visitor is only counted once per day. The problem is that I think this counter is also counting search engine bots and is recording a lot of false hits. Does anyone know how to prevent this? I was thinking of matching the IP against a list of search engine bot IPs to exclude those visitors, but I'm guessing there are thousands of IPs to track down. Please help. Thanks
Add this to your robots.txt file: Disallow: /path/to/counter.php It's not guaranteed to block every bot, since robots.txt is only advisory, but respectable search engines like Google, Yahoo, and Bing will honor it.
I don't think this will work since the counter is a script that I built into my pages. It gets loaded along with the rest of the page in PHP. Let me know what you think about this. Thanks Kind regards
You could go off the HTTP_USER_AGENT and compare it to a list of bot user agents. It'll never be 100% accurate but will prevent counting the major bots.
I wouldn't use bot IP addresses, mainly because they do in fact change, so your list would keep going out of date. The maintenance would outweigh the benefit. Since you are just trying to avoid counting known bots as hits, I would simply get a list of known bot user agent strings and write a compare function that checks HTTP_USER_AGENT to detect whether the visitor is a bot. Like I said, it will never be 100% accurate on a global scale, but for your site it may be 99.99% accurate.
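A minimal sketch of such a compare function, in case it helps. The signature list is only an illustrative handful of substrings, not a complete bot list, and the names BOT_SIGNATURES, is_known_bot() and record_hit() are made up for this example. Treating an empty user agent as a bot is also an assumption you may want to change.

```php
<?php
// Illustrative substrings only; extend this with whatever bots show up in your logs.
const BOT_SIGNATURES = ['googlebot', 'bingbot', 'slurp', 'duckduckbot', 'baiduspider', 'yandexbot'];

// Returns true when the user agent contains one of the known bot substrings.
function is_known_bot(?string $userAgent): bool
{
    // Assumption: a missing or blank user agent is treated as a bot.
    if ($userAgent === null || trim($userAgent) === '') {
        return true;
    }
    foreach (BOT_SIGNATURES as $signature) {
        // stripos() is case-insensitive, so "Googlebot" and "googlebot" both match.
        if (stripos($userAgent, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// Usage inside the page, just before the existing counter code runs.
if (!is_known_bot($_SERVER['HTTP_USER_AGENT'] ?? null)) {
    // record_hit() is a hypothetical stand-in for your current counting logic.
    // record_hit($_SERVER['REMOTE_ADDR'], $_SERVER['REQUEST_URI']);
}
```

Because the check is a plain substring match, it only catches bots that identify themselves honestly. Anything that sends a browser-like user agent will still be counted.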
Thanks for the tip on trying to match user agents vs IP addresses. I'll see how well this will work. Kind regards
Alternatively, you could load your counter.php file via JavaScript. Nowadays most people have JS enabled, while many bots don't execute JS.
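A rough sketch of how the JavaScript route could look from the PHP side. This assumes the counter is moved into its own counter.php that accepts a page query parameter. Both that parameter and the helper name counter_beacon_tag() are hypothetical. The page prints a small script tag, and only clients that actually run JavaScript end up requesting the counter.

```php
<?php
// Hypothetical helper: builds a <script> tag that requests the counter from the browser.
function counter_beacon_tag(string $counterUrl, string $page): string
{
    // Assumption: counter.php reads the page being counted from ?page=...
    $src = $counterUrl . '?page=' . rawurlencode($page);

    // json_encode() produces a quoted JavaScript string literal; the HEX flags
    // keep characters like < and > from breaking out of the script tag.
    $literal = json_encode($src, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

    // Creating an Image and setting its src triggers a GET request to the counter.
    return '<script>(new Image()).src = ' . $literal . ';</script>';
}

// Usage in the page template; the path and page value here are placeholders.
echo counter_beacon_tag('/path/to/counter.php', $_SERVER['REQUEST_URI'] ?? '/');
```

Some crawlers do execute JavaScript, so this is best combined with a user agent check inside counter.php. Since the counter is now a separate URL, the robots.txt Disallow suggested earlier also becomes applicable.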