Robots.txt Files

Discussion in 'Search Engine Optimization' started by savantcreative, Feb 27, 2008.

  1. #1
    Should all websites have a robots.txt file? What is the best way to learn about this?
    Thanks-
     
    savantcreative, Feb 27, 2008 IP
  2. ~DaRk-EyE~

    ~DaRk-EyE~ Banned

    #2
    robots.txt files tell search engine spiders to ignore specified files or directories when they crawl a site.
    One good example is when the content of certain directories might be misleading or irrelevant to how the site as a whole is categorized. In that case, you indicate in the robots.txt file that this part of your site must be excluded from crawling.
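
    To see how such an exclusion behaves, here is a minimal sketch using Python's standard urllib.robotparser module. The directory names, the "ExampleBot" user agent, and the www.example.com URLs are all made up for illustration; swap in your own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every crawler out of /private/ and /tmp/.
RULES = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def build_parser(rules_text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


parser = build_parser(RULES)

# "ExampleBot" and the URLs are placeholders, not real crawlers or pages.
blocked = parser.can_fetch("ExampleBot", "http://www.example.com/private/notes.html")
allowed = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
print(blocked, allowed)
```

    A well-behaved crawler runs essentially this check before requesting a page, so anything under the disallowed directories is skipped while the rest of the site stays crawlable.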
     
    ~DaRk-EyE~, Feb 27, 2008 IP
  3. KC TAN

    KC TAN Well-Known Member

    #3
    There is no need for a robots.txt file if you want Google to index all your content :)
     
    KC TAN, Feb 27, 2008 IP
  4. dj_gie

    dj_gie Peon

    #4
    They should, really. You can also use the file to point to your sitemap. Very handy to have.
    Go to http://www.robotstxt.org/ for more information on robots.txt
     
    dj_gie, Feb 27, 2008 IP
  5. astup1didiot

    astup1didiot Notable Member

    #5
    It's very hard to imagine any site that wants "all" of its web pages indexed in Google, but some people just don't know better. You're better off at least using the robots.txt file to point to your sitemap, since some search engines, like Ask.com, only use this method to locate it. Below is a simple robots.txt entry you can use.

    
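    # Apply the rules below to every crawler
    User-agent: *
    # An empty Disallow blocks nothing
    Disallow:
    # Replace the placeholder with the absolute URL of your sitemap
    Sitemap: <full www path to sitemap>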
    
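
    If you want to check that an entry like the one above is read the way you expect, a rough sketch with Python's standard urllib.robotparser follows. The sitemap URL and the "ExampleBot" name are placeholders I made up, and site_maps() needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Same shape as the entry above; the sitemap URL is a made-up placeholder.
RULES = """\
User-agent: *
Disallow:
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()

# The empty Disallow line blocks nothing, so any path should be allowed.
open_to_all = parser.can_fetch("ExampleBot", "http://www.example.com/any/page.html")
print(sitemaps, open_to_all)
```

    The Sitemap line is independent of the User-agent groups, so it can sit anywhere in the file.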
     
    astup1didiot, Feb 27, 2008 IP
  6. Dan Schulz

    Dan Schulz Peon

    #6
    Not only that, but without a robots.txt file, every crawler request for it fails, and your server's error logs get congested and polluted pretty darn quick (PDQ). So if you don't want to wade through noise like "no robots.txt file" or "no favicon.ico file", put both files in place and be done with it, even if you don't use them.
     
    Dan Schulz, Feb 27, 2008 IP
  7. budhanes

    budhanes Peon

    #7
    Over the years I have kept a pretty simple robots.txt. As time progresses, I find myself adding to it on occasion to disallow rogue robots.
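
    For anyone wondering what that looks like, here is a small sketch, again with Python's standard urllib.robotparser. "BadBot" and "GoodBot" are hypothetical names, not real crawlers, and the URL is a placeholder. Keep in mind that robots.txt is only advisory: a robot that ignores the file will not be stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: shut out one named robot, leave everyone else alone.
RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Placeholder URL; "BadBot" and "GoodBot" are invented user agents.
page = "http://www.example.com/index.html"
bad_allowed = parser.can_fetch("BadBot", page)
good_allowed = parser.can_fetch("GoodBot", page)
print(bad_allowed, good_allowed)
```

    Each robot you want to shut out gets its own User-agent group with "Disallow: /", and the catch-all group at the end keeps the site open to everything else.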
     
    budhanes, Feb 27, 2008 IP