What to Allow and Disallow on robots.txt

Discussion in 'robots.txt' started by astrohope, Jul 5, 2010.

  1. #1
    What should I Allow and Disallow in robots.txt on a PHP index site?
     
    astrohope, Jul 5, 2010
  2. Imozeb

    #2
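    You generally don't need Allow, because everything is allowed by default. It is only useful for carving an exception out of a Disallow rule.

    Disallow your moderator pages, cgi-bin, and tmp folder.

    I would need to take a look at your site's directory structure to tell you more.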
     
    Imozeb, Jul 5, 2010
  3. manish.chauhan

    #3
    It depends on you. If there is nothing on your website that you want to hide, you can use either of the following in your robots.txt:

    User-agent: *
    Disallow:

    OR

    User-agent: *
    Allow: /

    If you want to restrict some pages or folders, you can use the following:

    User-agent: *
    Disallow: /restricted-pages.php
    Disallow: /restricted-folder/
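
A quick way to check rules like the ones above is Python's standard `urllib.robotparser`. This is only a sketch: the agent name `ExampleBot`, the host `example.com`, and the sample paths are placeholders, not anything from this site.

```python
from urllib.robotparser import RobotFileParser

# The three rule sets quoted in the post above.
ALLOW_ALL_EMPTY_DISALLOW = """\
User-agent: *
Disallow:
"""

ALLOW_ALL_EXPLICIT = """\
User-agent: *
Allow: /
"""

RESTRICTED = """\
User-agent: *
Disallow: /restricted-pages.php
Disallow: /restricted-folder/
"""


def check(robots_txt, path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    # example.com is a placeholder host; only the path matters for matching.
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    for name, text in [("empty Disallow", ALLOW_ALL_EMPTY_DISALLOW),
                       ("Allow: /", ALLOW_ALL_EXPLICIT),
                       ("restricted", RESTRICTED)]:
        for path in ["/index.php", "/restricted-pages.php", "/restricted-folder/a.html"]:
            print(name, path, check(text, path))
```

Both allow-all variants should report every path as fetchable, while the third blocks only the listed page and folder.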
     
    manish.chauhan, Jul 8, 2010
  4. astrohope

    #4
    Thanks. I generated this robots.txt:
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /tmp/
    Disallow: /junk/
    Disallow: /css/
    Disallow: /date/
    Disallow: /email/
    Disallow: /img/
    Disallow: /omer/
    Disallow: /picture_library/
    Disallow: /public_html/

    Sitemap: http://www.astrohope.com/public_html/sitemap.xml

    and the status is 200 (Success).

    What do you think?
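
One way to sanity-check a generated file like this is to run a few URLs through Python's standard `urllib.robotparser`. The sketch below uses the rules exactly as posted; the agent name `ExampleBot` and the sample paths other than the sitemap are hypothetical, chosen only to show which kinds of URL the rules cover.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt exactly as posted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
Disallow: /css/
Disallow: /date/
Disallow: /email/
Disallow: /img/
Disallow: /omer/
Disallow: /picture_library/
Disallow: /public_html/

Sitemap: http://www.astrohope.com/public_html/sitemap.xml
"""

# Hypothetical paths, except the sitemap path, which comes from the file itself.
SAMPLE_PATHS = [
    "/",
    "/index.php",
    "/css/style.css",
    "/img/logo.png",
    "/public_html/sitemap.xml",
]


def blocked_paths(robots_txt, paths, agent="ExampleBot"):
    """Return the subset of `paths` that `agent` is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    base = "http://www.astrohope.com"
    return [p for p in paths if not parser.can_fetch(agent, base + p)]


if __name__ == "__main__":
    print(blocked_paths(ROBOTS_TXT, SAMPLE_PATHS))
    # ['/css/style.css', '/img/logo.png', '/public_html/sitemap.xml']
```

Note that under these rules the sitemap URL itself falls inside a disallowed directory, and anything under /css/ and /img/ is blocked too. Whether that is intended is a judgment call for the site owner.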
     
    astrohope, Jul 9, 2010
  5. manish.chauhan

    #5
    This is cool :)
     
    manish.chauhan, Jul 11, 2010
  6. smartamit04

    #6
    Hi,
    It's up to you what you don't want Google to crawl. Simply disallow all the files or folders that you want to hide from search crawlers.
     
    smartamit04, Aug 5, 2010
  7. joseph pop

    #7
    Hello,

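    With only Allow: /index.php, the index page won't be crawled unless it is requested by its fully qualified path, so Allow: /$ is added to cover the bare root URL. Everything else is blocked: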

    User-agent: *
    Disallow: /
    Allow: /index.php
    Allow: /$
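
How these four lines resolve depends on the crawler's matching rules. The sketch below assumes the longest-match convention described in RFC 9309 and in Google's documentation: the most specific matching pattern wins, ties go to Allow, `*` matches any run of characters, and a trailing `$` anchors the end of the URL. The helper names `rule_to_regex` and `is_allowed` are made up for illustration, and a crawler that simply takes the first matching rule in file order would treat this file differently.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the URL. Without '$', the pattern is a plain prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Apply longest-match precedence to a list of (directive, pattern) pairs."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if rule_to_regex(pattern).match(path):
            allow = directive.lower() == "allow"
            # Longer pattern wins; on equal length, Allow beats Disallow.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len = len(pattern)
                best_allow = allow
    return best_allow


# The rules from the post above.
RULES = [("Disallow", "/"), ("Allow", "/index.php"), ("Allow", "/$")]

if __name__ == "__main__":
    for path in ["/", "/index.php", "/index.php?page=2", "/about.php"]:
        print(path, is_allowed(RULES, path))
```

Under that assumption, `/` and `/index.php` (with or without a query string) are crawlable and every other path is blocked.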
     
    joseph pop, Aug 5, 2010