What is Robots.txt?

alka007 Active Member

#1

Sometimes companies don't want spiders or search engines to index data that is present on their websites. There can be a thousand reasons for this: the website may contain sensitive or personal data that the company doesn't want disclosed, or you may want to exclude images or style sheets to save bandwidth. To accomplish this, these companies tell the search engines to avoid that content by using robots meta tags or a robots.txt file.

Robots meta tags have their own limitations and may go unnoticed, so the robots.txt file is what is mostly used. The coding is simple: it is a list, as long as you need, of user agents and disallowed files and directories. Basically, the syntax is as follows:

User-agent:
Disallow:
User agents are search engines and spiders, whereas Disallow lists the content that should not be crawled and so should stay out of public search results. Sometimes statements are also written as
User-agent: *
Disallow: /temp/
A robots.txt file doesn't provide real security: it offers no firewall or password protection and merely asks the visiting robot not to access the listed content. The robot may or may not comply, so very sensitive information should not be kept on a website in the first place; robots.txt is only a way to ask search engines not to crawl parts of the site. Another important thing is the location of the file: search engines don't search through the whole website for robots.txt, so it should be placed in the main (root) directory.
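
To make the syntax and the location point concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The bot name "ExampleBot", the example.com URLs and the helper robots_txt_url are placeholders made up for illustration, not anything from a real site.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Crawlers look only at the root of the host, never in subdirectories.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The example rules from the post above.
RULES = """\
User-agent: *
Disallow: /temp/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# "ExampleBot" and example.com are hypothetical placeholders.
blocked = parser.can_fetch("ExampleBot", "https://example.com/temp/cache.html")
allowed = parser.can_fetch("ExampleBot", "https://example.com/index.html")

print(robots_txt_url("https://example.com/blog/post.html"))
print(blocked, allowed)
```

A well-behaved crawler would run a check like can_fetch before requesting each page; nothing forces it to.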

alka007, Feb 23, 2012 IP

manchun.seo Greenhorn

#2

You are wrong. Robots.txt does provide real-time safety; how to use it is the question.

manchun.seo, Feb 23, 2012 IP

azharSEO Active Member

#3

It is great when search engines frequently visit your site and index your content, but often there are cases when indexing parts of your online content is not what you want.

azharSEO, Feb 23, 2012 IP

udaypal Peon

#4

Robots.txt is a text file we put on our site to tell search robots which pages we would like them not to visit, and it's the way we keep some secrets about our site.

udaypal, Feb 24, 2012 IP

kevincook Peon

#5

Well, if you block any page you don't want indexed, you will get an error in webmaster...

kevincook, Feb 24, 2012 IP

peter_davis Peon

#6

Hi, there is a hidden, relentless force that permeates the web and its billions of web pages and files, unbeknownst to the majority of us sentient beings. I'm talking about search engine crawlers and robots here. Every day hundreds of them go out and scour the web, whether it's Google trying to index the entire web or a spam bot collecting any email address it can find for less than honorable intentions. As site owners, what little control we have over what robots are allowed to do when they visit our sites exists in a magical little file called "robots.txt". Robots.txt is a text file we put on our site to tell search robots which pages we would like them not to visit, and it's the way we keep some secrets about our site.

peter_davis, Feb 24, 2012 IP

solarlight Peon

#7

Nice information about the robots.txt file. I am really happy to read all the comments in this post.

solarlight, Feb 24, 2012 IP

jeffsmith Member

#8

In simple words, the robots.txt file tells search engines not to index the web pages that the site owner doesn't want indexed by search engine bots.

jeffsmith, Feb 24, 2012 IP

peter_davis Peon

#9

Hi, Robots.txt is a file through which you can guide search engines to crawl or not to crawl certain sections of your website.

Google specifically follows the instructions given in this robots.txt file.

peter_davis, Feb 27, 2012 IP

murthyseo Active Member

#10

Robots.txt is a text file that restricts confidential files or directories from search engines.

murthyseo, Feb 27, 2012 IP

entrecon Peon

#11

Some bots do ignore the robots.txt though, so it isn't foolproof. Don't use it as the only way you protect files.

entrecon, Feb 27, 2012 IP

peter_davis Peon

#12

Hi, Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.
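
The "note on an unlocked door" point can be sketched in a few lines of Python. This is only an illustration: will_request, "PoliteBot", "RudeBot" and the example.com URL are hypothetical names, and the /private/ rule is borrowed from a later reply in this thread. The check happens entirely on the bot's side, so a bot that skips it is not stopped by anything.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /private/
"""


def will_request(url: str, agent: str, parser: RobotFileParser,
                 obeys_robots: bool) -> bool:
    """Decide whether a (hypothetical) bot goes on to request url."""
    if not obeys_robots:
        # A bot that ignores robots.txt never consults the rules at all.
        return True
    return parser.can_fetch(agent, url)


parser = RobotFileParser()
parser.parse(RULES.splitlines())

url = "https://example.com/private/report.html"  # placeholder URL
print(will_request(url, "PoliteBot", parser, obeys_robots=True))
print(will_request(url, "RudeBot", parser, obeys_robots=False))
```

Real protection for the /private/ area would have to come from the server itself, for example password protection.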

peter_davis, Mar 15, 2012 IP

kumarkunal Member

#13

Yes, Peter is right. These are text files that instruct crawlers which pages you don't want them to index.

kumarkunal, Mar 15, 2012 IP

peter_davis Peon

#14

Hi, there is a hidden, relentless force that permeates the web and its billions of web pages and files, unbeknownst to the majority of us sentient beings. I'm talking about search engine crawlers and robots here. Every day hundreds of them go out and scour the web, whether it's Google trying to index the entire web or a spam bot collecting any email address it can find for less than honorable intentions. As site owners, what little control we have over what robots are allowed to do when they visit our sites exists in a magical little file called "robots.txt".

peter_davis, Mar 30, 2012 IP

arjunchauhan24 Peon

#15

Robots.txt blocks search engine crawlers from crawling certain pages of the website

arjunchauhan24, Mar 30, 2012 IP

Vitor Hugo Peon

#16

Robots.txt is a small file that tells Google whether it should index certain pages of your site.

Vitor Hugo, Apr 2, 2012 IP

war_machine Active Member

#17

The robots.txt file is a text file that informs search engine crawlers which pages you'd like them NOT to index. For example, if you want to keep them from indexing everything under your private directory, you would include a Disallow: /private/ field. For even more information about robots.txt, check out this guide: A robots.txt File Guide That Won't Put You to Sleep.

war_machine, Jun 3, 2012 IP

prabhjot.singh Active Member

#18

war_machine said: ↑

The robots.txt file is a text file that informs search engine crawlers which pages you'd like them NOT to index. For example, if you want to keep them from indexing everything under your private directory, you would include a Disallow: /private/ field. For even more information about robots.txt, check out this guide: A robots.txt File Guide That Won't Put You to Sleep.

Yes, good answer. If a user wants to disallow a file on his website, he can write the rule in Notepad (the file name should be robots.txt, in lowercase), for example Disallow: /captcha.php, and for a folder on his website he can use Disallow: /classes/. Note: for a folder, the user must place / at the end.
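
As a rough check of the file-versus-folder rules, the sketch below runs them through Python's standard urllib.robotparser. The helper names build_parser and allowed, the bot name "ExampleBot", the example.com host and the /classes-old/ path are assumptions added for illustration. Disallow values are matched as path prefixes, which is why the trailing / on a folder matters.

```python
from urllib.robotparser import RobotFileParser


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


def allowed(parser: RobotFileParser, path: str) -> bool:
    # "ExampleBot" and example.com are hypothetical placeholders.
    return parser.can_fetch("ExampleBot", "https://example.com" + path)


# The two rules from the post: one file, one folder with a trailing slash.
with_slash = build_parser(
    "User-agent: *\nDisallow: /captcha.php\nDisallow: /classes/\n"
)
# The same folder rule written without the trailing slash.
without_slash = build_parser("User-agent: *\nDisallow: /classes\n")

for path in ("/captcha.php", "/classes/user.php",
             "/classes-old/user.php", "/index.php"):
    print(path, allowed(with_slash, path), allowed(without_slash, path))
```

Without the trailing slash, the rule also catches any other path that merely starts with /classes, such as the made-up /classes-old/ folder here.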

prabhjot.singh, Jun 3, 2012 IP


NareshReddy Member

#19

You are completely wrong in the way you are thinking about robots.txt. A robots.txt file plays a major role in SEO. It allows you to restrict the access of search engine robots that crawl the web, and it can prevent these robots from accessing specific directories and pages.
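
Restricting specific robots to specific directories is done with one User-agent group per robot. Here is a small sketch, again with Python's standard urllib.robotparser; the bot names "BadBot" and "GoodBot", the example.com host and the paths are made up for illustration, and the rules only bind robots that choose to honor them.

```python
from urllib.robotparser import RobotFileParser

# One group for a named bot, then a catch-all group for everyone else.
RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

BASE = "https://example.com"  # placeholder host

# BadBot is asked to stay out of the whole site.
print(parser.can_fetch("BadBot", BASE + "/index.html"))
# Any other bot falls through to the "*" group.
print(parser.can_fetch("GoodBot", BASE + "/index.html"))
print(parser.can_fetch("GoodBot", BASE + "/private/report.html"))
```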

NareshReddy, Jul 4, 2012 IP

farooque Greenhorn

#20

Robots.txt is only for search engines, giving them information about which pages to index and which not to.

farooque, Jul 19, 2012 IP
