Hi, a robots.txt question. thanks

Discussion in 'Site & Server Administration' started by danieloffice, Jul 15, 2007.

  1. #1
    Hi,

    Am I doing this correctly?
    I want to put a robots.txt on my site, www.arghcade.com.
    I set it up as follows:
    1. The sitemap location.
    2. User-agent: * (I don't actually know what it is for, but someone advised me to include it).
    3. I want to stop search engines from crawling my directory /funny.
    4. I want to stop search engines from crawling my add-on domain's directory, /allfunny.

    So I put the lines below in my robots.txt.

    Please comment on whether this is OK. Thanks.


    --- robots.txt --

    Sitemap: http://www.arghcade.com/sitemap.xml
    User-agent: *
    Disallow: /funny
    Disallow: /allfunny
     
    danieloffice, Jul 15, 2007 IP
  2. Janissary

    #2
    It looks OK.
     
    Janissary, Jul 15, 2007 IP
  3. danieloffice

    #3
    What does the following line mean in robots.txt? Thanks.


    User-agent: *
     
    danieloffice, Jul 15, 2007 IP
  4. inworx

    #4
    It means the rules that follow apply to all bots.
     
    inworx, Jul 16, 2007 IP
  5. danieloffice

    #5
    Do you mean that the site allows all search engines to crawl it? (That is, what do you mean by "all bots"?)

    thanks
     
    danieloffice, Jul 17, 2007 IP
  6. odysseus

    #6
    Does this also mean that the bots will crawl all pages other than those disallowed?
     
    odysseus, Aug 5, 2007 IP
  7. Nipon

    #7
    User-agent: * means the instructions that follow apply to all robots.

    If you write User-agent: Googlebot, then only Googlebot will follow the instructions.

    Disallow: /dir instructs the bot not to crawl 'dir'.

    And yes, the bots may crawl every page that is not disallowed.
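
    To see the difference between the two User-agent lines, here is a rough sketch using Python's standard-library urllib.robotparser. It is only one parser, and real crawlers may behave a little differently. The robot name "ExampleBot" and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Rules addressed to every robot.
all_bots = build_parser("User-agent: *\nDisallow: /dir\n")

# Rules addressed to Googlebot only; other robots get no rules from this file.
google_only = build_parser("User-agent: Googlebot\nDisallow: /dir\n")

# Hypothetical URLs, used only to exercise the rules.
url_blocked = "http://example.com/dir/page.html"
url_open = "http://example.com/other.html"

for label, parser in (("all_bots", all_bots), ("google_only", google_only)):
    for agent in ("Googlebot", "ExampleBot"):
        print(
            label,
            agent,
            "dir page allowed:", parser.can_fetch(agent, url_blocked),
            "other page allowed:", parser.can_fetch(agent, url_open),
        )
```

    With User-agent: * both robots are kept out of /dir. With User-agent: Googlebot only Googlebot is kept out, and ExampleBot is free to crawl everything.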
     
    Nipon, Aug 8, 2007 IP
  8. web_mehul

    #8
    User-agent: * is OK; it addresses the rules that follow to all robots.
    Disallow: /dir tells robots not to crawl that directory or any of its subdirectories.
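
    As a quick check of the robots.txt from the first post, something like the sketch below could be run. It uses Python's standard-library urllib.robotparser as a stand-in for a crawler, so treat the results as indicative rather than as what any particular search engine does. The robot name "ExampleBot" and the page paths are assumptions made up for the test; only the robots.txt text comes from this thread.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt exactly as posted at the top of the thread.
ROBOTS_TXT = """\
Sitemap: http://www.arghcade.com/sitemap.xml
User-agent: *
Disallow: /funny
Disallow: /allfunny
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str) -> bool:
    """Return True if a (hypothetical) robot may fetch the given path."""
    return parser.can_fetch("ExampleBot", "http://www.arghcade.com" + path)


# Hypothetical paths: the site's real pages are not known from the thread.
paths = [
    "/",
    "/index.html",
    "/funny/",
    "/funny/jokes/1.html",   # subdirectory of a disallowed directory
    "/allfunny/page.html",
    "/funnybone.html",       # not a directory, but shares the /funny prefix
]

for path in paths:
    print(path, "allowed" if allowed(path) else "blocked")
```

    Note that Disallow works as a path prefix, so /funny also blocks a page such as /funnybone.html. Writing Disallow: /funny/ with a trailing slash would limit the rule to the directory itself.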

    :)
     
    web_mehul, Aug 8, 2007 IP