What is Robots.txt?

Discussion in 'robots.txt' started by alka007, Feb 23, 2012.

  1. #1
    Sometimes companies don’t want spiders or search engines to index data that is present on their websites. There can be many reasons for this: the website may contain sensitive or personal data that the company doesn’t want disclosed, or you may want to exclude images or style sheets to save bandwidth. To accomplish this, site owners tell the search engines to avoid that content by using robots meta tags or a robots.txt file.

    Robots meta tags have their own limitations and may go unnoticed, so the robots.txt file is used most often. The format is simple: it is a list of user agents and the files and directories disallowed for them. Basically, the syntax is as follows:

    User-agent:
    Disallow:
    User agents are search engines and spiders, whereas Disallow lists the paths that those crawlers should not fetch. For example, the following asks every crawler (*) to stay out of /temp/:
    User-agent: *
    Disallow: /temp/
    A robots.txt file doesn’t provide real security: it is not a firewall or password protection. It merely asks crawlers not to access the listed content, and a crawler may or may not comply, so very sensitive information should not be kept on a public website. It is only a way to ask search engines not to crawl parts of the site. Another important point is the location of robots.txt: search engines don’t search the whole website for it, so it must be placed in the root directory of the site.
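
A minimal sketch of how a crawler might read the rules above, using Python's standard-library `urllib.robotparser`. The agent name `ExampleBot`, the `example.com` URLs, and the helper names `robots_url` and `is_allowed` are assumptions made for illustration; they do not come from the post.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# The two-line rule set quoted in the post.
RULES = """\
User-agent: *
Disallow: /temp/
"""


def robots_url(page_url):
    """Return the root-level robots.txt URL for the site serving page_url."""
    parts = urlsplit(page_url)
    # Crawlers look only at the root of the host, never in subdirectories.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def is_allowed(rules, agent, url):
    """Return True if the given rules let this agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse from text, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(robots_url("https://example.com/blog/post.html?page=2"))
    print(is_allowed(RULES, "ExampleBot", "https://example.com/temp/cache.html"))
    print(is_allowed(RULES, "ExampleBot", "https://example.com/index.html"))
```

Parsing the text directly keeps the sketch offline; a real crawler would first download the file from the URL that `robots_url` returns.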
     
    alka007, Feb 23, 2012 IP
  2. manchun.seo

    manchun.seo Greenhorn

    Messages:
    30
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    18
    #2
    You are wrong. Robots.txt does provide real-time safety; how to use it is the question.
     
    manchun.seo, Feb 23, 2012 IP
  3. azharSEO

    azharSEO Active Member

    Messages:
    1,460
    Likes Received:
    7
    Best Answers:
    1
    Trophy Points:
    63
    #3
    It is great when search engines frequently visit your site and index your content, but there are often cases when indexing parts of your online content is not what you want.
     
    azharSEO, Feb 23, 2012 IP
  4. udaypal

    udaypal Peon

    Messages:
    201
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #4
    Robots.txt is a text file we put on our site to tell search robots which pages we would like them not to visit, and it's the way we keep some parts of our site out of search engines.
     
    udaypal, Feb 24, 2012 IP
  5. kevincook

    kevincook Peon

    Messages:
    67
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Well, if you block a page you don't want indexed, you will get an error in webmaster...
     
    kevincook, Feb 24, 2012 IP
  6. peter_davis

    peter_davis Peon

    Messages:
    26
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Hi, there is a hidden, relentless force that permeates the web and its billions of web pages and files, unbeknownst to the majority of us sentient beings. I'm talking about search engine crawlers and robots here. Every day hundreds of them go out and scour the web, whether it's Google trying to index the entire web or a spam bot collecting any email address it can find for less than honorable intentions. As site owners, what little control we have over what robots are allowed to do when they visit our sites exists in a magical little file called "robots.txt". Robots.txt is a text file we put on our site to tell search robots which pages we would like them not to visit, and it's the way we keep some parts of our site out of search engines.
     
    peter_davis, Feb 24, 2012 IP
  7. solarlight

    solarlight Peon

    Messages:
    20
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #7
    Nice information about the robots.txt file. I am really happy to read all the comments in this post.
     
    solarlight, Feb 24, 2012 IP
  8. jeffsmith

    jeffsmith Member

    Messages:
    204
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    26
    #8
    In simple words, the robots.txt file tells search engine bots not to index the web pages that the site owner doesn't want indexed.
     
    jeffsmith, Feb 24, 2012 IP
  9. peter_davis

    peter_davis Peon

    Messages:
    26
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #9
    Hi, Robots.txt is a file through which you can guide search engines to crawl or not to crawl certain sections of your website.

    Google specifically follows the instructions given in this robots.txt file.
     
    peter_davis, Feb 27, 2012 IP
  10. murthyseo

    murthyseo Active Member

    Messages:
    159
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    81
    #10
    Robots.txt is a text file that restricts search engines from confidential files or directories.
     
    murthyseo, Feb 27, 2012 IP
  11. entrecon

    entrecon Peon

    Messages:
    58
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #11
    Some bots do ignore the robots.txt though, so it isn't foolproof. Don't use it as the only way you protect files.
     
    entrecon, Feb 27, 2012 IP
  12. peter_davis

    peter_davis Peon

    Messages:
    26
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #12
    Hi, Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.
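
The unlocked-door point can be sketched in a few lines of Python. Everything here is hypothetical: the in-memory `SITE` dictionary stands in for a web server, and `polite_crawl` and `impolite_crawl` are illustrative names, not part of any library. Only `urllib.robotparser` is a real standard-library module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory "site": path -> page body. No real server is involved.
SITE = {
    "/index.html": "Welcome",
    "/private/report.html": "Quarterly numbers",
}

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def polite_crawl(site, robots_txt, agent="ExampleBot"):
    """Return the paths a well-behaved crawler would choose to read."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # The crawler filters its own requests; the site does nothing to stop it.
    return sorted(path for path in site if parser.can_fetch(agent, path))


def impolite_crawl(site):
    """Return the paths read by a crawler that never consults robots.txt."""
    return sorted(site)


if __name__ == "__main__":
    print(polite_crawl(SITE, ROBOTS_TXT))
    print(impolite_crawl(SITE))
```

The only difference between the two crawlers is that one chooses to ask first. Anything that must stay private needs server-side protection such as authentication.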
     
    peter_davis, Mar 15, 2012 IP
  13. kumarkunal

    kumarkunal Member

    Messages:
    115
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    46
    #13
    Yes, Peter is right. These are text files that tell crawlers which pages you don't want them to index.
     
    kumarkunal, Mar 15, 2012 IP
  14. peter_davis

    peter_davis Peon

    Messages:
    26
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #14
    Hi, there is a hidden, relentless force that permeates the web and its billions of web pages and files, unbeknownst to the majority of us sentient beings. I'm talking about search engine crawlers and robots here. Every day hundreds of them go out and scour the web, whether it's Google trying to index the entire web or a spam bot collecting any email address it can find for less than honorable intentions. As site owners, what little control we have over what robots are allowed to do when they visit our sites exists in a magical little file called "robots.txt".
     
    peter_davis, Mar 30, 2012 IP
  15. arjunchauhan24

    arjunchauhan24 Peon

    Messages:
    36
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #15
    Robots.txt blocks search engine crawlers from crawling certain pages of the website :)
     
    arjunchauhan24, Mar 30, 2012 IP
  16. Vitor Hugo

    Vitor Hugo Peon

    Messages:
    1
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #16
    Robots.txt is a small file that tells Google whether it should index certain pages of your site.
     
    Vitor Hugo, Apr 2, 2012 IP
  17. war_machine

    war_machine Active Member

    Messages:
    1,323
    Likes Received:
    7
    Best Answers:
    1
    Trophy Points:
    53
    #17
    The robots.txt file is a text file that informs search engine crawlers which pages you'd like them NOT to index. For example, if you want to keep them from indexing everything under your private directory, you would include a Disallow: /private/ field. For even more information about robots.txt, check out this guide: A robots.txt File Guide That Won’t Put You to Sleep.
     
    war_machine, Jun 3, 2012 IP
  18. prabhjot.singh

    prabhjot.singh Member

    Messages:
    36
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    40
    #18
    Yes, good answer. If a user wants to disallow a file on his website, he can write the rule in Notepad (the file name should be robots.txt), for example Disallow: /captcha.php. For a folder of his website he can use Disallow: /classes/. Note: for a folder, the path must end with /.
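
A rough sketch of why the trailing slash on a folder rule matters, assuming simple prefix matching as done by Python's standard-library `urllib.robotparser`. The helper `blocked_paths`, the agent name `ExampleBot`, and the path `/classes-old/user.php` are made up for illustration; other crawlers may match rules differently.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(disallow_rules, paths, agent="ExampleBot"):
    """Return the subset of paths that the given Disallow rules block."""
    lines = ["User-agent: *"] + [f"Disallow: {rule}" for rule in disallow_rules]
    parser = RobotFileParser()
    parser.parse(lines)
    return [path for path in paths if not parser.can_fetch(agent, path)]


# Illustrative paths; only /captcha.php and /classes/ come from the post.
PATHS = [
    "/captcha.php",
    "/classes/user.php",
    "/classes-old/user.php",
    "/index.php",
]

if __name__ == "__main__":
    # With the trailing slash, only the /classes/ folder is matched.
    print(blocked_paths(["/captcha.php", "/classes/"], PATHS))
    # Without it, every path that starts with "/classes" is matched too.
    print(blocked_paths(["/captcha.php", "/classes"], PATHS))
```

Under this parser a rule is a plain path prefix, so leaving off the final / can block more than the intended folder.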
     
    prabhjot.singh, Jun 3, 2012 IP
  19. NareshReddy

    NareshReddy Member

    Messages:
    39
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    48
    #19
    You are completely wrong in the way you are thinking about robots.txt. A robots.txt file plays a major role in SEO. It allows you to restrict the access of search engine robots that crawl the web, and it can keep these robots away from specific directories and pages.
     
    NareshReddy, Jul 4, 2012 IP
  20. farooque

    farooque Greenhorn

    Messages:
    12
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    11
    #20
    Robots.txt is only for search engines: it tells them which pages to index and which not to.
     
    farooque, Jul 19, 2012 IP