Why Robots.txt?

Discussion in 'HTML & Website Design' started by sanchyclub, Jul 16, 2011.

  1. #1
    Hey, why is the robots.txt file important?

    What is the meaning of:

    User-agent: *
    Allow: /

    and

    User-agent: *
    Disallow: /

    I'm really confused! Your answers will be highly appreciated!
    Thanks,
     
    sanchyclub, Jul 16, 2011 IP
  2. Hurdanstore

    #2
    The robots.txt file is used by search engines like Google, Yahoo, Bing, etc. It tells their crawlers which parts of a site they may crawl, so you can keep empty pages or test sites from being crawled. The file does not expire or switch itself off after some time; its rules stay in effect until you change or remove it.
    This answer may help you.
     
    Hurdanstore, Jul 16, 2011 IP
  3. jjosephs

    #3
    It's important for SEO. I use it to block all spiders and crawlers while a project is online but still in the development stage. You can also use it to block content you don't want appearing in search results. For example, if I have a captcha folder for storing the relevant PHP scripts, or an includes folder for storing all the reusable components of a website, I'd use the following rules.

    User-agent: *
    Disallow: /captcha
    Disallow: /includes
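
To see how a well-behaved crawler might read those rules, here is a minimal sketch using Python's standard `urllib.robotparser`. The `example.com` URLs, the `ExampleBot` agent name, and the `is_allowed` helper are made up for illustration, and the `User-agent: *` line is assumed, since Disallow rules only apply inside a User-agent group.

```python
from urllib.robotparser import RobotFileParser

# Rules from the post, placed under a wildcard User-agent group (assumed).
RULES = """\
User-agent: *
Disallow: /captcha
Disallow: /includes
"""


def is_allowed(rules: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical paths on a hypothetical site.
    for path in ("/index.html", "/captcha/verify.php", "/includes/header.php"):
        url = "https://example.com" + path
        print(path, "allowed" if is_allowed(RULES, url) else "blocked")
```

Matching is by path prefix, so anything under /captcha or /includes is reported as blocked while the rest of the site stays crawlable.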
     
    jjosephs, Jul 16, 2011 IP
  4. sanchyclub

    #4
    Sorry! I'm still not clear...

    If I write:

    User-agent: *
    Allow: /

    does that mean all content is open to search engines?
    And suppose I don't want to block anything, then what is the robots.txt file for?
     
    sanchyclub, Jul 16, 2011 IP
  5. codebreaker

    #5
    If you don't want to block anything, you don't need that file.
     
    codebreaker, Jul 16, 2011 IP
  6. jjosephs

    #6
    Yes.

    User-agent: *
    Allow: /

    means that ALL search engines are allowed to crawl and index ALL of your site's content.


    User-agent: *
    Disallow: /

    means that NO search engines are allowed to crawl and index ANY of your site's content.


    If you want everything to be indexed, you can use the first set of rules or omit the robots.txt file altogether.

    Note: robots.txt is only a request. Well-behaved crawlers and spiders follow it, but nothing forces them to, so they can ignore the rules in the file.
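
The difference between the two rule sets can be checked with a short sketch based on Python's standard `urllib.robotparser`. The `example.com` URL, the `ExampleBot` agent name, and the `can_crawl` helper are assumptions for illustration; the empty rule list stands in for a robots.txt that contains no rules.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = ["User-agent: *", "Allow: /"]
DISALLOW_ALL = ["User-agent: *", "Disallow: /"]
NO_RULES = []  # stand-in for a robots.txt with nothing in it


def can_crawl(lines, url, agent="ExampleBot"):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)  # parse rules from memory, no HTTP request
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://example.com/any/page.html"  # hypothetical page
    for name, rules in (("Allow: /", ALLOW_ALL),
                        ("Disallow: /", DISALLOW_ALL),
                        ("no rules", NO_RULES)):
        print(name, "->", can_crawl(rules, url))
```

This only models a crawler that chooses to obey the file; as the note says, a crawler that ignores robots.txt is not stopped by it.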
     
    jjosephs, Jul 16, 2011 IP
  7. sanchyclub

    #7
    Got it, thanks. Thanks for all of your comments, guys.
     
    sanchyclub, Jul 17, 2011 IP
  8. Nijil

    #8
    Robots.txt tells the search engine which links it may follow and which of your pages it may crawl.
     
    Nijil, Jul 18, 2011 IP
  9. Prem Asok

    #9
    @jjosephs This is really informative. I guess robots.txt is a kind of control the web developer has for making search engines crawl only the pages the developer wants.

    Thanks
     
    Prem Asok, Jul 19, 2011 IP