I've just started playing with a different SEO programme, IBP. Its page report says the following: 'Your web pages uses the meta robots tag to allow search engines to index your web page. Actually you can remove this tag as search engines will still index your web page if this tag is missing.' Is this correct?
A robots.txt file is not strictly necessary, though it is recommended. If there is no robots.txt file, search engines will index every page they can find on your website.
I strongly recommend using a robots.txt file (even a completely blank one), because some web hosts do not correctly return HTTP 404 when robots.txt is not found, and in that case Google does not start indexing your site (really, I am sure; this information comes from Google staff).
This isn't strictly a robots.txt issue. You're talking about a meta tag that goes within the <head></head> part of your pages. What IBP say is absolutely correct. You don't need this tag for search engines to index your page.
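For anyone who wants to check their own pages, here is a rough Python sketch of the point above: a page with no meta robots tag is treated as indexable, and so is one that says index, follow. The names RobotsMetaFinder and is_indexable are made up for illustration, and treating noindex and none as the only blocking values is a simplifying assumption, not a description of how any particular search engine behaves.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of a <meta name="robots"> tag, if one exists."""

    def __init__(self):
        super().__init__()
        self.directives = None  # None means no meta robots tag was seen

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives = [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def is_indexable(html):
    """Assumption: only 'noindex' or 'none' stops a page from being indexed."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    if finder.directives is None:
        return True  # a missing tag is the same as the default: index the page
    return not ({"noindex", "none"} & set(finder.directives))


with_tag = '<head><meta name="robots" content="index, follow"></head>'
without_tag = "<head><title>My page</title></head>"
print(is_indexable(with_tag), is_indexable(without_tag))
```

Both pages come out the same, which is why the tag can be dropped when all it says is index, follow.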
What are you talking about, man? robots.txt is a joke. It's like saying to the bots "I'd prefer you stay out, but if you want to come in, there's really nothing I can do to stop you". It's utterly pointless, and as far as I understand it really doesn't do much of anything.
The robots.txt protocol is not a joke. It's a useful tool to prevent certain bots from spidering certain pages/sections of sites while allowing others through. It's also a lot easier to modify a robots.txt file than it is to update dozens, hundreds or even thousands of Web pages to make a single change. Also, unlike the generic META robots tag, the robots.txt file can block SPECIFIC search engine spiders from crawling and indexing your site (didn't I say that already?). And as has already been mentioned, not having a robots.txt file can clutter your server logs with needless 404 error returns (just as not having a favicon.ico file will generate 404 errors, since the user agent, which in most cases is a traditional Web browser, requests it along with the rest of the Web page).
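To make the "block specific spiders" part concrete, here is a small sketch using Python's standard urllib.robotparser module. The bot name BadBot, the /private/ path and the example.com URLs are all hypothetical. It only shows how a rule-following parser reads the file; whether a real bot obeys it is up to the bot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut one named bot out completely,
# and keep every other bot out of /private/ only.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
print(parser.can_fetch("BadBot", "https://example.com/index.html"))         # False
print(parser.can_fetch("Googlebot", "https://example.com/index.html"))      # True
print(parser.can_fetch("Googlebot", "https://example.com/private/a.html"))  # False
```

One file at the site root covers every page, which is the maintenance advantage over editing a tag on each page.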
I am new to this, so I need some help. I do not need to block any part of my site from being crawled by search engines, but I do need to handle the 404 issue you mentioned: "not having a robots.txt file can clutter your server logs with needless 404 error returns". So, could you please post a simple robots.txt here that I can just upload to my website? Regards
Upload these two lines in a file named robots.txt:
    User-agent: *
    Disallow:
This allows all bots in, everywhere.
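If you want to convince yourself before uploading, the two-line file can be checked with Python's standard urllib.robotparser. This is just a sketch: the helper name allowed_everywhere, the bot names and the sample paths are invented for the test, and example.com stands in for your own domain.

```python
from urllib.robotparser import RobotFileParser

# The two-line file from the post above.
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def allowed_everywhere(robots_txt, agents, paths):
    """Return True if every agent may fetch every path under these rules."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return all(
        rules.can_fetch(agent, "https://example.com" + path)
        for agent in agents
        for path in paths
    )


print(allowed_everywhere(
    ALLOW_ALL,
    agents=["Googlebot", "SomeOtherBot"],
    paths=["/", "/images/logo.png", "/deep/page.html"],
))
```

An empty Disallow: value matches nothing, so no path is blocked for any bot.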
And be sure to put it in the root folder of your site (usually public_html, but it could be www, web or something else, depending on how your host is set up).
If I recall correctly, you can replace the empty Disallow: with Allow: /. However, letting bots in everywhere isn't a very good idea. You're going to want to block SOME things, specifically the paths to your stylesheets and scripts.
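Here is a quick sketch of both ideas from this post, again using Python's urllib.robotparser: first that Allow: / and an empty Disallow: are read the same way by that parser, then a file that blocks stylesheet and script folders. The /css/ and /scripts/ paths are assumptions for illustration, so substitute whatever folders your site actually uses. Other parsers may treat Allow differently, so take this as one parser's reading rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt, agent, url):
    """Parse the given robots.txt text and ask whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


EMPTY_DISALLOW = "User-agent: *\nDisallow:\n"
ALLOW_ROOT = "User-agent: *\nAllow: /\n"

# Hypothetical folders; adjust to your own site layout.
BLOCK_ASSETS = "User-agent: *\nDisallow: /css/\nDisallow: /scripts/\n"

page = "https://example.com/index.html"
stylesheet = "https://example.com/css/site.css"

# Both "let everyone in" variants give the same answer.
print(can_fetch(EMPTY_DISALLOW, "AnyBot", page), can_fetch(ALLOW_ROOT, "AnyBot", page))

# With the asset folders blocked, pages stay open but the stylesheet does not.
print(can_fetch(BLOCK_ASSETS, "AnyBot", page), can_fetch(BLOCK_ASSETS, "AnyBot", stylesheet))
```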