Can you use robots.txt to disallow crawling of outbound links? I know Disallow: / would be acceptable, but could I do something like:

User-agent: *
Disallow: www.somewebsite.com

to prevent spidering of outbound links? Or can this be done with mod_rewrite?
I'm not sure if it's possible using robots.txt, but this site might help answer your question: http://www.robotstxt.org/
Hi solarpanelsdirect,

You cannot instruct crawlers not to follow outbound links in robots.txt or in .htaccess. However, you can do it by adding the rel="nofollow" attribute to the outbound links.
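For example, a link marked up like this tells compliant crawlers not to follow it (www.somewebsite.com is just the placeholder from your question):

<a href="http://www.somewebsite.com/" rel="nofollow">Some website</a>

Note that nofollow is a value of the rel attribute rather than a tag of its own, and it is only a hint to crawlers, not a hard block.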
Yes, you can also control robots with the HTML nofollow attribute. However, that technique will not do anything about bandwidth loss, and only well-behaved robots honour these protocol standards; others may ignore them. Blocking unwanted robots by whatever means you can will help save your website's bandwidth.
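If bandwidth is the real concern, one option is to block known bad bots at the server level. A minimal .htaccess sketch, assuming Apache with mod_rewrite enabled and a hypothetical user-agent string "BadBot" (substitute the agents you actually want to block):

RewriteEngine On
# Match the offending user-agent, case-insensitively
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
# Return 403 Forbidden instead of serving the page
RewriteRule .* - [F,L]

This only stops bots that send an identifiable user-agent string; bots that fake their user-agent will still get through.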