Advantages of robots.txt

Discussion in 'robots.txt' started by HomeRun, Feb 24, 2008.

  1. #1
    What are the advantages of using robots.txt?

    I read a post by Shawn (AKA digitalpoint) saying that he completely removed it from this site.
    Why? ". The more content Google has, the better so I figure it's just one of those things with running a forum..."

    http://forums.digitalpoint.com/showthread.php?t=2150

    What do you guys think?
     
    HomeRun, Feb 24, 2008 IP
  2. cooldude7273

    cooldude7273 Active Member

    Messages:
    185
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    55
    #2
    The obvious advantage is you can keep search engines from indexing parts of your site. (You know... top secret stuff, personal business, stuff you don't want to see on Google)
     
    cooldude7273, Feb 24, 2008 IP
  3. sajidmm

    sajidmm Peon

    Messages:
    579
    Likes Received:
    10
    Best Answers:
    0
    Trophy Points:
    0
    #3
    You can stop search engines from indexing specific files or folders... or you can invite them to index pages faster...
     
    sajidmm, Mar 5, 2008 IP
  4. ericajoieake

    ericajoieake Guest

    Messages:
    556
    Likes Received:
    6
    Best Answers:
    0
    Trophy Points:
    0
    #4
    robots.txt is very important for business or company sites. You can use this file to keep search engines from indexing the confidential parts of your website.
     
    ericajoieake, Mar 7, 2008 IP
  5. Andey

    Andey Member

    Messages:
    17
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    36
    #5
    So they don't go where you don't want them to go.
     
    Andey, Mar 26, 2008 IP
  6. essex

    essex Guest

    Messages:
    49
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #6
    In order for Google's robots to index a "private part" of your site, there must first be a link on your site that points to the "private part". If I create a website and slip the file "private.php" into the root directory, Google isn't going to magically know it's there.

    This is just my personal opinion, not trying to flame.
     
    essex, Mar 31, 2008 IP
  7. phplife

    phplife Peon

    Messages:
    36
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #7
    Just to let you know that a robots.txt file only tells well-behaved spiders not to index (i.e. to exclude) parts of your website. The robots.txt file does NOT offer any form of security or protection. Many spiders out there completely ignore the robots.txt file and will index everything.
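
    As a rough illustration of what "well-behaved" means here: a polite crawler checks the rules itself before fetching anything, and nothing on the server enforces them. This is only a sketch using Python's standard urllib.robotparser; the rules, the bot name and the example.com URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def polite_fetch_allowed(user_agent, url):
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite bot asks first; a rude bot simply never runs this check.
    print(polite_fetch_allowed("ExampleBot", "http://example.com/private/report.html"))
    print(polite_fetch_allowed("ExampleBot", "http://example.com/index.html"))
```

    The check lives entirely in the crawler's own code, which is why the file offers no protection against a spider that skips it.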

    phplife
     
    phplife, Apr 23, 2008 IP
  8. manish.chauhan

    manish.chauhan Well-Known Member

    Messages:
    1,682
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    110
    #8
    Exactly. There are some spammy robots that do not respect robots.txt instructions. Even if you have blocked them in robots.txt, they will still crawl every page of your site.
    In that case you can find their IP addresses in your traffic log and then block them using .htaccess.
    It will also help you cut down your bandwidth usage, since these spammy bots can use as much as half of your website's bandwidth if you do not block them. :)
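
    A minimal sketch of that workflow, assuming an access log where each line starts with the client IP (as in the common and combined log formats) and a server running Apache 2.4, where `Require not ip` inside `<RequireAll>` is the documented way to deny an address. The function names, the threshold and the sample addresses are all hypothetical. On Apache 2.2 the older `Deny from` syntax would be needed instead.

```python
from collections import Counter


def heavy_hitters(log_lines, threshold):
    """Return the IPs that appear in at least `threshold` log lines, sorted."""
    counts = Counter()
    for line in log_lines:
        ip = line.split(" ", 1)[0]  # first field of the log line
        if ip:
            counts[ip] += 1
    return sorted(ip for ip, hits in counts.items() if hits >= threshold)


def htaccess_block(ips):
    """Build an Apache 2.4 .htaccess fragment denying the given IPs."""
    lines = ["<RequireAll>", "    Require all granted"]
    lines += [f"    Require not ip {ip}" for ip in ips]
    lines.append("</RequireAll>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up log lines using documentation-only IP ranges.
    sample = [
        '203.0.113.5 - - [23/Apr/2008:10:00:00 +0000] "GET /a HTTP/1.1" 200 512',
        '203.0.113.5 - - [23/Apr/2008:10:00:01 +0000] "GET /b HTTP/1.1" 200 512',
        '198.51.100.7 - - [23/Apr/2008:10:00:02 +0000] "GET / HTTP/1.1" 200 1024',
    ]
    print(htaccess_block(heavy_hitters(sample, threshold=2)))
```

    Counting requests alone will also flag busy legitimate crawlers, so it is worth checking the user agent and reverse DNS by hand before blocking anything.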
     
    manish.chauhan, Apr 23, 2008 IP
  9. Lovely

    Lovely Well-Known Member

    Messages:
    2,997
    Likes Received:
    18
    Best Answers:
    0
    Trophy Points:
    155
    #9
    Another advantage is that it can be used to block search engine spiders from indexing part or all of your website, saving valuable bandwidth.
     
    Lovely, Apr 28, 2009 IP
  10. mubheer

    mubheer Active Member

    Messages:
    340
    Likes Received:
    6
    Best Answers:
    0
    Trophy Points:
    58
    #10
    Don't you think you need to keep some parts of your website out of the reach of search engines? I use robots.txt to restrict certain parts of my website from being indexed or known to the public,

    for example the bin directory, database, etc.

     
    mubheer, Apr 28, 2009 IP
  11. linkdealer

    linkdealer Active Member

    Messages:
    138
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    90
    #11
    The primary reason for using it is not to save bandwidth but to keep our private files away from search engines.
     
    linkdealer, Jun 9, 2009 IP
  12. morg

    morg Peon

    Messages:
    90
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #12
    Just don't use this to exclude your downloads area: anyone can read robots.txt, so someone can find the path listed there and download your stuff.
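
    To show how little effort that takes, here is a small sketch that pulls every Disallow path out of a robots.txt file. The function name and the sample rules are invented for the example; it uses plain string handling and does no network access.

```python
def disallowed_paths(robots_txt):
    """List every non-empty Disallow path found in the robots.txt text."""
    paths = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    # Hypothetical file contents.
    sample = "User-agent: *\nDisallow: /downloads/\nDisallow: /admin/  # panel\n"
    print(disallowed_paths(sample))
```

    Anything listed in the file is effectively advertised, so paths that need protecting are better guarded by authentication than by a Disallow line.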
     
    morg, Jun 13, 2009 IP
  13. rvitgroup

    rvitgroup Peon

    Messages:
    45
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #13
    Hi manish, I found your post very helpful. How do I use .htaccess? Hope to hear from you soon.
     
    rvitgroup, Jul 30, 2011 IP
  14. manish.chauhan

    manish.chauhan Well-Known Member

    Messages:
    1,682
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    110
    #14
    Thanks rvitgroup, I am glad you liked my reply. Please drop me a message with your queries so that I can assist you further. Looking forward to hearing from you soon.
     
    manish.chauhan, Aug 1, 2011 IP
  15. wetbupa

    wetbupa Peon

    Messages:
    119
    Likes Received:
    2
    Best Answers:
    1
    Trophy Points:
    0
    #15

    The robots.txt file is a simple text file (no HTML) that is placed in your website's root directory in order to tell the search engines which pages to index and which to skip.
    Many webmasters use this file to help the search engines index the content of their websites.
    If webmasters can tell the search engine spiders to skip pages that they do not consider important enough to be crawled (e.g. printable versions of pages, .pdf files, etc.), then they have a better opportunity to have their most valuable pages featured in the search engine results pages. The robots.txt file is a simple way of making it easier for the spiders to return the most relevant search results.


    That being said, I have seen many occasions where the robots.txt has not been used in the best way possible. For instance, webmasters are prone to make mistakes when installing the robots.txt and the repercussions can be severe. There is a simple instruction that restricts all search engine spiders from crawling the entire site:
    User-agent: *
    Disallow: /
    Without the “forward slash” in the instructions, search engines are granted access to the entire site. So, the inclusion of this one character in the robots.txt can prevent a website from showing in the search engines. There could be many reasons why webmasters would do this intentionally (website is still relatively new and they may still want to tweak certain pages for keyword density etc.), but more often than not, it is a mistake and is usually only realized when the site hasn’t shown up in the search engine indexes for months.
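
    The effect of that one character can be checked with a short sketch using Python's standard urllib.robotparser. The helper name, the bot name and the example.com URL are assumptions made for the illustration.

```python
from urllib.robotparser import RobotFileParser

BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # with the forward slash
ALLOW_ALL = "User-agent: *\nDisallow:\n"    # without it


def allowed(robots_txt, url, agent="ExampleBot"):
    """Return True if the given rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "http://example.com/any/page.html"
    print(allowed(BLOCK_ALL, url))  # rules with "/" shut the whole site
    print(allowed(ALLOW_ALL, url))  # an empty Disallow permits everything
```

    Running a check like this against a new robots.txt before uploading it is a cheap way to catch the accidental site-wide block.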
    Errors aside, another benefit of having a robots.txt is that you can specify the location of the Google .xml or Yahoo sitemap with this simple instruction:
    sitemap: http://www.client.com/sitemap.xml (this assumes the xml sitemap is located at the root of the domain).
    This also increases spiderability for the search engines. Of course, even though this is a small aspect of the search engine optimization process, if utilized correctly, a robots.txt can be a significant benefit.
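
    For anyone curious how a crawler might pick up the sitemap line, here is a small sketch with Python's standard urllib.robotparser (its site_maps() method needs Python 3.8 or later). The rules text reuses the example URL from above, the helper name is made up, and nothing is fetched from the network.

```python
from urllib.robotparser import RobotFileParser

# Example rules reusing the sitemap URL from the post.
ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: http://www.client.com/sitemap.xml
"""


def find_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(find_sitemaps(ROBOTS_TXT))
```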
     
    wetbupa, Oct 17, 2012 IP
  16. silenthawk

    silenthawk Member

    Messages:
    38
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    41
    #16
    Nice info. It's not necessary, but it is advantageous!
     
    silenthawk, Oct 21, 2012 IP
  17. lordofblogger

    lordofblogger Peon

    Messages:
    32
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #17
    robots.txt gives you control to allow or deny crawling of specific content on your site. To learn the full syntax properly, please search Google for robots.txt.
     
    lordofblogger, Oct 24, 2012 IP
  18. TheCreator

    TheCreator Banned

    Messages:
    372
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    53
    #18
    Using robots.txt you can list your sitemap URL, and you can also allow or disallow URLs for Google's crawler, etc.
     
    TheCreator, Oct 26, 2012 IP
  19. selectaupairs

    selectaupairs Greenhorn

    Messages:
    59
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    8
    #19
    robots.txt is very useful when you don't want search engines to crawl your site.

    One big advantage is that with robots.txt your admin panel can be kept away from crawlers.
    So you can fight hackers.
     
    selectaupairs, Oct 30, 2012 IP
  20. Samu66el

    Samu66el Peon

    Messages:
    7
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #20
    The robots.txt file does NOT offer any form of security or protection.
     
    Samu66el, Oct 30, 2012 IP