What is robots.txt?

Discussion in 'Search Engine Optimization' started by Soniaferdous, May 15, 2010.

  1. #1
    What does this txt file contain? Why is it necessary for SEO? Can anyone help me get the actual answer?
     
    Soniaferdous, May 15, 2010 IP
  2. bordello

    bordello Notable Member

    Messages:
    3,204
    Likes Received:
    141
    Best Answers:
    0
    Trophy Points:
    290
    #2
    robots.txt is a file which tells robots (search engine crawlers) which pages they should visit and which they should not. We put this file in the root of our site, and it can easily be accessed by visiting www.yourdomain.com/robots.txt
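    A minimal example of what such a file might look like (the directory names here are hypothetical — adjust them to your own site):

    ```
    # Applies to every crawler that honors the Robots Exclusion Standard
    User-agent: *
    # Keep crawlers out of these (example) directories
    Disallow: /admin/
    Disallow: /tmp/
    ```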
     
    bordello, May 15, 2010 IP
  3. sherone

    sherone Well-Known Member

    Messages:
    1,539
    Likes Received:
    16
    Best Answers:
    0
    Trophy Points:
    130
    #3
    A robots.txt file restricts access to your site by search engine robots that crawl the web. You need a robots.txt file only if your site includes content that you don't want search engines to index. If you want search engines to index everything in your site, you don't need a robots.txt file (not even an empty one). Googlebot won't crawl or index the content of pages blocked by robots.txt.
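    If you want to check how a compliant crawler would interpret your rules, Python's standard library includes a parser for the format — a quick sketch using made-up paths:

    ```python
    from urllib import robotparser

    # Hypothetical robots.txt contents for illustration
    rules = """\
    User-agent: *
    Disallow: /private/
    """

    rp = robotparser.RobotFileParser()
    rp.parse(rules.splitlines())

    # Blocked path -> False; everything else -> True
    print(rp.can_fetch("*", "http://www.example.com/private/page.html"))  # False
    print(rp.can_fetch("*", "http://www.example.com/index.html"))         # True
    ```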
     
    sherone, May 15, 2010 IP
  4. Soniaferdous

    Soniaferdous Peon

    Messages:
    9
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #4
    Thank you very much... Now it's clear to me.
     
    Soniaferdous, May 15, 2010 IP
  5. slimjim2010

    slimjim2010 Peon

    Messages:
    245
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #5
    It's when a robot sends you a text message.
     
    slimjim2010, May 15, 2010 IP
  6. websitedevelopment

    websitedevelopment Peon

    Messages:
    231
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Robots.txt is a text file where we can tell search engine crawlers whether we want them to crawl the entire website, and if we don't want some page crawled, we can make it non-crawlable by search engines.

    In short, we can use robots.txt as a command file for search engines.
     
    websitedevelopment, May 15, 2010 IP
  7. Serious Workers

    Serious Workers Well-Known Member

    Messages:
    2,785
    Likes Received:
    65
    Best Answers:
    2
    Trophy Points:
    195
    #7
    Search engines will crawl your site fully by default; when you don't want them to crawl a particular page, you can disallow it there.
     
    Serious Workers, May 15, 2010 IP
  8. hostsaker

    hostsaker Well-Known Member

    Messages:
    153
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    133
    #8
    The answers are very clear, thanks =)
     
    hostsaker, May 15, 2010 IP
  9. TheGoogleGurus

    TheGoogleGurus Guest

    Messages:
    52
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #9
    TheGoogleGurus, May 15, 2010 IP
  10. ethan1066

    ethan1066 Peon

    Messages:
    67
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #10
    Thanks for explaining robots.txt in such a good manner... I always had a wrong perception about robots.txt, which is much clearer after reading this thread...
     
    ethan1066, May 16, 2010 IP
  11. neerajseo

    neerajseo Peon

    Messages:
    170
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #11
    It's for blocking Google's spider from the pages and images on our website that we do not want indexed... and if you want more details about robots.txt, search for it on Google....
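    For instance, keeping crawlers away from an image directory or a single page could look like this (hypothetical paths):

    ```
    User-agent: *
    # Block a whole directory of images
    Disallow: /images/private/
    # Block one specific page
    Disallow: /drafts/unfinished-page.html
    ```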
     
    neerajseo, May 17, 2010 IP
  12. sanjeev.me232

    sanjeev.me232 Active Member

    Messages:
    90
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    51
    #12
    robots.txt is a text file read by search engine spiders such as Googlebot. A spider only crawls your site for a limited time, so within that period it may not be possible to crawl all the pages of the site. By using a robots.txt file we can disallow the pages that have no value. You can view the file at http://www.yoursite.com/robots.txt. You can also get more information from http://www.robotstxt.org/wc/robots.html
     
    sanjeev.me232, May 17, 2010 IP
  13. greenoro0

    greenoro0 Active Member

    Messages:
    435
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    70
    #13
    The Robot Exclusion Standard, also known as the Robots Exclusion Protocol or robots.txt protocol, is a convention to prevent cooperating web ...

    A text file present in the root directory of a site which is used to control which pages are indexed by a robot. Only robots which comply with the Robots Exclusion Standard will follow the instructions contained in this file.

    A text file located in the root directory to restrict search engine spiders' access to certain files or folders of a web site.

    A file used to exclude some or all robots from crawling some or all the files or directories on a website. ...

    Robots.txt is a file which spiders read to determine which parts of a website they may visit and may not visit.
     
    greenoro0, May 17, 2010 IP
  14. devashishseo

    devashishseo Active Member

    Messages:
    244
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    71
    #14
    While Google won't crawl or index the content of pages blocked by robots.txt, it may still index the URLs if it finds them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project, can appear in Google search results.
    In order to use a robots.txt file, you'll need to have access to the root of your domain. If you don't have access to the root of a domain, you can restrict access using the robots meta tag.
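    The robots meta tag goes in the head of each page you want to keep out of the index — a minimal sketch:

    ```
    <head>
      <!-- Tell compliant crawlers not to index this page or follow its links -->
      <meta name="robots" content="noindex, nofollow">
    </head>
    ```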
     
    devashishseo, May 19, 2010 IP
  15. Jeff Collision

    Jeff Collision Peon

    Messages:
    1,020
    Likes Received:
    4
    Best Answers:
    0
    Trophy Points:
    0
    #15
    Sometimes content from your website can be copied to blog submission pages; you can find that out by checking. You can then disallow the duplicate copy of your content using robots.txt. Likewise, if there's no need for search engine bots to visit your cached pages, you can disallow those pages using robots.txt as well.
     
    Jeff Collision, May 19, 2010 IP