How to crawl robots.txt file!

Discussion in 'Google' started by myindiahub, Apr 14, 2011.

  1. #1
    If we allow all crawlers in the robots.txt file and list the URLs of the pages we want crawled quickly, will the bots crawl them faster?
     
    myindiahub, Apr 14, 2011 IP
  2. deepikawalecha

    #2
    And how do I stop Google from indexing a page using robots.txt?
     
    deepikawalecha, Apr 14, 2011 IP
  3. longcall911

    #3
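    There is no 'Allow' directive in the original robots.txt standard, and although Google recognizes one as an extension, it does not make pages get crawled any faster.

    So listing a page in robots.txt is really only useful for disallowing it!

    To make sure pages get crawled, use the tool that is made for exactly that: a sitemap file.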
     
    longcall911, Apr 14, 2011 IP
  4. Nishail

    #4
    You can also put the URL of your XML sitemap in the robots.txt file.
     
    Nishail, Apr 14, 2011 IP
  5. websitetools

    #5
    Like Nishail says.

    You can see an example of using robots.txt with XML sitemaps. Using robots.txt for this is called "XML sitemap autodiscovery".
     
    websitetools, Apr 14, 2011 IP
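
For anyone wondering what the sitemap line looks like in practice, here is a minimal sketch. The robots.txt content and the example.com URLs are made up for illustration, and the check uses Python's standard `urllib.robotparser` module (its `site_maps()` method needs Python 3.8 or newer), not whatever parser Google runs internally.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow every crawler everywhere and
# advertise the sitemap location (example.com is a placeholder).
SITEMAP_ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_ROBOTS_TXT.splitlines())

# site_maps() returns the listed Sitemap URLs, or None if there are none.
discovered_sitemaps = sitemap_parser.site_maps()
print(discovered_sitemaps)

# An empty Disallow value means nothing is blocked.
homepage_allowed = sitemap_parser.can_fetch(
    "Googlebot", "https://www.example.com/index.html"
)
print(homepage_allowed)
```

The `Sitemap:` line sits outside any `User-agent` group, so it applies no matter which crawler reads the file.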
  6. bogs

    #6
    If you want your site to be crawled, you don't have to create a robots.txt at all. By default, bots are allowed to crawl all of your pages.
     
    bogs, Apr 14, 2011 IP
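
A quick way to convince yourself of the "allowed by default" behaviour is to feed an empty rule set to a robots.txt parser. This is only a sketch with Python's standard `urllib.robotparser`, used here as a stand-in for a well-behaved crawler; the URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

empty_parser = RobotFileParser()
# No rules at all, which is how this sketch simulates a site
# with an empty robots.txt.
empty_parser.parse([])

# With no rules, every path is allowed for every user agent.
default_allowed = empty_parser.can_fetch(
    "Googlebot", "https://www.example.com/any/page.html"
)
print(default_allowed)
```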
  7. Danny0109

    #7
    bogs is right.
    And to check your robots.txt you can use Google Webmaster Tools.
     
    Danny0109, Apr 14, 2011 IP
  8. hughthomas

    #8
    Good idea putting the sitemap in the robots.txt file. Just make sure that you have updated everything in Webmaster Tools first, e.g. added the robots.txt file, submitted the sitemap, and occasionally regenerated and resubmitted the sitemap. Basically, just let Google know what you're doing.
     
    hughthomas, Apr 15, 2011 IP
  9. C.Rebecca

    #9
    If you don't want something to get indexed,

    add this to your robots.txt, under a User-agent line:
    Disallow: <path of the file or directory that you don't want search engines to crawl>

     
    C.Rebecca, Apr 15, 2011 IP
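
To make the Disallow rule concrete, here is a small sketch. The `/private/` directory and the example.com URLs are invented for illustration, and the check again uses Python's standard `urllib.robotparser`. Keep in mind that Disallow stops crawling; a blocked URL can still end up in the index if other pages link to it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one directory for all crawlers.
BLOCKING_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

blocking_parser = RobotFileParser()
blocking_parser.parse(BLOCKING_ROBOTS_TXT.splitlines())

# Anything under /private/ is off limits; everything else stays crawlable.
private_allowed = blocking_parser.can_fetch(
    "Googlebot", "https://www.example.com/private/report.html"
)
public_allowed = blocking_parser.can_fetch(
    "Googlebot", "https://www.example.com/index.html"
)
print(private_allowed, public_allowed)
```

Rules are matched by path prefix, so `Disallow: /private/` covers every URL whose path starts with `/private/`.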
  10. sunnyverma1984

    #10
    To block Google from indexing any page, add this to your robots.txt:

     
    sunnyverma1984, Apr 15, 2011 IP
  11. myindiahub

    #11
    Thanks to all for your valuable suggestions. But I have noticed that my website is crawled regularly while its robots.txt is not, whereas other websites' robots.txt files are crawled from time to time.
     
    myindiahub, Apr 16, 2011 IP