I want to know how to restrict links via the robots.txt file; I want to restrict the URLs below. We did not create a subdomain, but sina.com.cn is using our URL. I have checked in the domain & hosting panel and there is no such file or subdomain. How can I restrict and fix it? Moreover, does anybody know why sina.com.cn is using another website's URL, or how we can protect our website using a robots.txt file or some other way?

https://tool.mykidslunchbox.com.au/forgot-password.aspx
http://www.sina.com.cn.mykidslunchbox.com.au/forgot-password.aspx
https://www.sina.com.cn.mykidslunchbox.com.au/how-it-works.aspx
https://www.sina.com.cn.mykidslunchbox.com.au/contactus.aspx
http://tool.mykidslunchbox.com.au/contactus.aspx
http://tool.mykidslunchbox.com.au/benefits.aspx
http://tool.mykidslunchbox.com.au/
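Extra hostnames like www.sina.com.cn.mykidslunchbox.com.au resolving to your site usually point to a wildcard DNS record rather than a file you can delete. One common fix, since the .aspx pages suggest an IIS/ASP.NET site, is a permanent redirect to the canonical host. This is only a sketch, assuming the IIS URL Rewrite module is installed and that tool.mykidslunchbox.com.au is your canonical host:

```xml
<!-- web.config fragment: redirect any non-canonical hostname -->
<system.webServer>
  <rewrite>
    <rules>
      <rule name="CanonicalHost" stopProcessing="true">
        <match url="(.*)" />
        <conditions>
          <!-- matches requests whose Host header is NOT the canonical host -->
          <add input="{HTTP_HOST}" negate="true"
               pattern="^tool\.mykidslunchbox\.com\.au$" />
        </conditions>
        <!-- 301 to the same path on the canonical host -->
        <action type="Redirect" url="https://tool.mykidslunchbox.com.au/{R:1}"
                redirectType="Permanent" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```

With this in place, search engines that crawl the spoofed hostname get a 301 and consolidate on the canonical URL. Removing or tightening the wildcard DNS entry at your DNS provider is still the root-cause fix.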
Suppose I want to stop the URLs below from being indexed. Is it OK to write:

Disallow: /tool.mykidslunchbox.com.au/forgot-password.aspx
Disallow: /sina.com.cn.mykidslunchbox.com.au/forgot-password.aspx

Am I right?
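For what it's worth, robots.txt rules are matched against the URL path only, never the hostname; each hostname serves its own robots.txt from its root. So a sketch of the file at https://tool.mykidslunchbox.com.au/robots.txt, if you only wanted to block the forgot-password page there, would be:

```text
User-agent: *
Disallow: /forgot-password.aspx
```

The sina.com.cn.* hostnames would need their own robots.txt (or, better, a redirect), since crawlers request robots.txt separately per host.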
This article has it all, bud: http://www.free-seo-news.com/all-about-robots-txt.htm. I think you're right with the above example, but for files within a whole directory you can just go:

Disallow: /tool.mykidslunchbox.com.au/somedirectory
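One caveat on the directory example: in a standard robots.txt the path is relative to the root of the host serving the file, so a directory block would normally look like this (assuming /somedirectory/ is a real path on the site, used here purely as an illustration):

```text
User-agent: *
Disallow: /somedirectory/
```

The trailing slash keeps the rule scoped to the directory and everything under it.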
This is a really wonderful article and very helpful for me, but in this article the author mentions that you can't use the "Allow" word. Yet when I check Google.com/robots.txt, they are using it. Why? Check this line from the article: "Don't use an "Allow" command in your robots.txt file. Only mention files and directories that you don't want to be indexed. All other files will be indexed automatically if they are linked on your site."
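The short answer is that "Allow" is an extension beyond the original Robots Exclusion Standard, but major crawlers such as Googlebot do support it. It is typically used to carve an exception out of a broader Disallow rule, as in this sketch (the paths are hypothetical):

```text
User-agent: *
Disallow: /private/
Allow: /private/open-page.html
```

Crawlers that don't understand Allow simply ignore it, which is why advice aimed at maximum compatibility says to rely on Disallow alone.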
Please use a robots.txt file with your site so that no search engine crawling is done for the next few weeks, and then verify the site so the error is no longer there.
"Allow" is a nonstandard extension of the protocol. Please use robots.txt only to disallow crawler access.

User-agent: *
Disallow:

equals

User-agent: *
Allow: /

whilst "Allow" is not part of the Robots Exclusion Standard (robots.txt). I have collected a full set of example implementations here: http://rield.com/cheat-sheets/robots-exclusion-standard-protocol
jabz.biz explained it perfectly. But you do not need to put an Allow directive in the robots.txt file; it's not part of the exclusion standard. Robots.txt has never helped webmasters achieve good rankings. This file is used to restrict robots from crawling part or the whole of a website. Remember there are bad robots as well, which do not always follow the directives in robots.txt, so using robots.txt does not amount to a security system either. It is always better to password-protect the folders and directories you do not want crawled. Anyway, this is not what you inquired about. Please keep in mind that robots.txt has nothing to do with ranking on SERPs. Cheers!
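On the point about password-protecting folders: since this is an ASP.NET site, one way to deny anonymous access to a directory is a location element in web.config. A minimal sketch, assuming a folder named "private" (hypothetical) and that some authentication mode is already configured for the site:

```xml
<!-- web.config fragment: require login for everything under /private -->
<location path="private">
  <system.web>
    <authorization>
      <!-- "?" means anonymous (unauthenticated) users -->
      <deny users="?" />
    </authorization>
  </system.web>
</location>
```

Unlike robots.txt, this is enforced by the server, so it also stops the "bad robots" that ignore crawl directives.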
In order to use a robots.txt file, you'll need to have access to the root of your domain (if you're not sure, check with your web host). If you don't have access to the root of a domain, you can restrict access using the robots meta tag.
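For reference, the robots meta tag goes in the head of each page you want kept out of the index, so it works even without root access. A minimal example:

```html
<head>
  <!-- tells compliant crawlers not to index this page or follow its links -->
  <meta name="robots" content="noindex, nofollow">
</head>
```

Note the difference in effect: robots.txt blocks crawling, while the noindex meta tag blocks indexing (and a page blocked by robots.txt can't have its meta tag seen at all).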