Robots.txt?


tweetylover8402
Oct 25th 2005, 2:11 pm
What's Robots.txt used for?

In cPanel, it's the #1 'not found' page. What is it for, and why does it keep getting requested?

mcfox
Oct 25th 2005, 2:14 pm
It tells the search engine spiders what they can and can't look at on the site.

Just create a blank text file in Notepad, name it robots.txt, and upload it. The 'not found' errors will disappear.

seo-ireland
Oct 25th 2005, 2:19 pm
Also check out this tutorial (http://www.searchengineworld.com/robots/robots_tutorial.htm) to learn how to use the robots.txt file.

Good luck.

Cricket
Oct 25th 2005, 2:22 pm
When a robot crawls your site it looks for the robots.txt file. If it doesn't find one it assumes automatically that it may crawl and index the entire site. Not having a robots.txt file can also create unnecessary 404 errors in your server logs, making it more difficult to track "real" 404 errors.

Assuming you want your entire site indexed and only want to stop the unnecessary 404 errors from occurring, you have a couple of options:

Upload a blank robots.txt file to the root directory of your domain.
Upload a simple robots.txt file to the root directory of your domain.

I have an article on my site that covers the BASICS of how to create a robots.txt file (http://www.gnc-web-creations.com/creating_robotstxt_file.htm) that may help you get started.




Cricket :)
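
The "simple robots.txt file" option can be sketched in Python as below. This is only an illustration, not taken from the linked article: the helper names `build_robots_txt` and `write_robots_txt` and the `public_html` folder are assumptions, and the default output is the common "allow everything" form (`User-agent: *` with an empty `Disallow:`).

```python
from pathlib import Path


def build_robots_txt(disallowed=()):
    """Build robots.txt text addressed to all spiders.

    With no paths given, the result allows everything (empty Disallow).
    """
    lines = ["User-agent: *"]
    if disallowed:
        lines += [f"Disallow: {path}" for path in disallowed]
    else:
        lines.append("Disallow:")
    return "\n".join(lines) + "\n"


def write_robots_txt(site_root, disallowed=()):
    """Write robots.txt into site_root (hypothetical local copy of the site root)."""
    target = Path(site_root) / "robots.txt"
    target.write_text(build_robots_txt(disallowed), encoding="ascii")
    return target


# Show the "crawl everything" file; to save it, call e.g.
# write_robots_txt("public_html") where "public_html" is an assumed folder name.
print(build_robots_txt(), end="")
```

The file still has to be uploaded to the root directory of the domain, as described above.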

tweetylover8402
Oct 25th 2005, 2:23 pm
TY for your kind responses. :)

But why would you ever want to stop a spider from hitting the page?

Cricket
Oct 25th 2005, 2:25 pm
Also check out this tutorial (http://www.searchengineworld.com/robots/robots_tutorial.htm) to learn how to use the robots.txt file.

Good luck.

Ooops! Sorry! I didn't realize you had already answered this. Our posts must have crossed paths :o


Cricket

seo-ireland
Oct 25th 2005, 2:47 pm
No probs Cricket.

But why would you ever want to stop a spider from hitting the page?

There are lots of reasons you may want to keep spiders away. The biggest are to keep them out of sensitive data or pages you do not want in the search engine's database, and to give the spider more direction so that it only reads the pages you want listed.

minstrel
Oct 26th 2005, 8:03 pm
Yes. To prevent it spidering your images, scripts, stats, mail, etc.

By the way, a basic "spider everything" robots.txt file would look like this:

User-agent: *
Disallow:

Save as plain text (ASCII/ANSI) and upload it to the ROOT of your site.

This translates to "all spiders, please crawl everything (disallow nothing)".
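
One way to sanity-check a file like this before uploading it is Python's standard-library `urllib.robotparser`. A minimal sketch, assuming a made-up spider name ("AnyBot") and example.com URLs; `is_allowed` is a hypothetical helper, not part of any library:

```python
from urllib.robotparser import RobotFileParser

# The "spider everything" file shown above.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "AnyBot" and the example.com URLs are placeholders for illustration.
print(is_allowed(ALLOW_ALL, "AnyBot", "http://www.example.com/"))
print(is_allowed(ALLOW_ALL, "AnyBot", "http://www.example.com/images/logo.gif"))
```

Both checks should come back True, since an empty Disallow blocks nothing.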

Bibofa
Oct 30th 2005, 12:52 am
User-agent: *
Disallow:

I'm using this.

sufi
Nov 3rd 2005, 4:23 am
Hi guys, what's the command to stop spiders and robots from crawling image files? It eats up bandwidth.

Thanks

minstrel
Nov 3rd 2005, 5:41 am
Under:

User-agent: *
add this line:

Disallow: /images/
substituting the name of your images folder for "images".

If your images aren't in a separate folder, add lines like this instead:

Disallow: /image1.gif
Disallow: /image2.gif
Disallow: /image3.jpg
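
To see which paths rules like these would block, here is a rough check using Python's standard-library `urllib.robotparser`. The spider name "AnyBot", the sample paths, and the `blocked_paths` helper are assumptions made up for the example:

```python
from urllib.robotparser import RobotFileParser

# A folder rule plus one single-file rule, in the style shown above.
RULES = """\
User-agent: *
Disallow: /images/
Disallow: /image1.gif
"""


def blocked_paths(robots_txt, user_agent, paths):
    """Return the paths that robots_txt tells user_agent not to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Hypothetical paths: two covered by the rules, two not.
paths = ["/index.html", "/images/photo.jpg", "/image1.gif", "/image2.gif"]
print(blocked_paths(RULES, "AnyBot", paths))
# prints ['/images/photo.jpg', '/image1.gif']
```

Note that /image2.gif stays crawlable here because only /image1.gif was listed; each file outside the images folder needs its own Disallow line.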

PYJAMA
Nov 3rd 2005, 6:15 am
minstrel,
You truly are a minstrel! Well said!

-PYJAMA

deepnuke
Nov 10th 2005, 11:48 pm
Does anyone have a good robots.txt for increasing ranking on search engines?

minstrel
Nov 11th 2005, 7:03 am
No. The robots.txt file is a "limiter" for spiders -- it tells them what parts of your site you do not want them to crawl/index.

There is nothing you can put in a robots.txt file to increase search engine ranking.

jazzylee77
Nov 13th 2005, 6:41 am
Is there a robots text file that will make me run faster? Jump Higher?