Robots.txt question
I have a site, www.widgets.com.
www.site2.com is actually parked at http://www.widgets.com/site2 but is a domain in its own right.
So I have a domain parked on a subfolder of my widgets.com account.
(I'm confusing myself.)
I don't want Google to index "http://www.widgets.com/site2", so can I use Disallow: /site2
in my robots.txt file without affecting www.site2.com?
Hope that makes sense...
Smyrl
Jun 5th 2005, 6:40 am
Yes. Do a Google search for a robots.txt tutorial and follow the directions. I would tell you what to do, but if I made a mistake I would not want you mad at me.
Here is a tutorial and validator.
http://www.searchengineworld.com/robots/robots_tutorial.htm
http://www.searchengineworld.com/cgi-bin/robotcheck.cgi
Remember, not all bots are obedient creatures.
Good luck.
Shannon
No worries, Shannon.
I understand robots.txt syntax; it's just because I have a second domain parked on a subfolder of the first that I'm worried :)
Smyrl
Jun 5th 2005, 7:11 am
I did a similar thing when a client needed a one-page website up the next day in time for a press release. I created a page in the root directory of a site he owned and pointed the second domain to that newly created page. Things rocked along nicely for a couple of years until the owner of the parked domain gave the domain name to an artist, who linked to the page using the second domain name. We were then hit with a duplicate content penalty.
Shannon
J.D.
Jun 5th 2005, 7:52 am
If I understand you correctly, you have this configuration:
website1.com > /physical/path/webroot
website2.com > /physical/path/webroot/site2/
There are two ways to access website2.com (as http://website2.com/ and as http://website1.com/site2/), and you will need to protect it twice. If you want to use robots.txt for this, you'd need two of them: one in the root of website1.com for the /site2/ directory and one in the root of website2.com.
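To illustrate the per-host point, here is a minimal sketch using Python's standard urllib.robotparser. The two robots.txt bodies, the www host names, and the may_fetch helper are assumptions made up for this example and are not taken from anyone's real site. It only shows that a crawler that obeys robots.txt evaluates each host's file separately, so a Disallow on website1.com does not carry over to website2.com.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def parser_for(robots_txt):
    """Build a parser from robots.txt text (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Hypothetical file served at http://www.website1.com/robots.txt
WEBSITE1_ROBOTS = """\
User-agent: *
Disallow: /site2/
"""

# Hypothetical file served at http://www.website2.com/robots.txt
# (physically webroot/site2/robots.txt). An empty Disallow allows everything.
WEBSITE2_ROBOTS = """\
User-agent: *
Disallow:
"""

# A well-behaved crawler looks up robots.txt per host name.
ROBOTS_BY_HOST = {
    "www.website1.com": parser_for(WEBSITE1_ROBOTS),
    "www.website2.com": parser_for(WEBSITE2_ROBOTS),
}


def may_fetch(url, agent="Googlebot"):
    """Apply the rules of the host named in the URL, and only those."""
    host = urlsplit(url).netloc
    return ROBOTS_BY_HOST[host].can_fetch(agent, url)
```

Under these assumed files, may_fetch("http://www.website1.com/site2/index.html") is refused while may_fetch("http://www.website2.com/index.html") is allowed, even though both URLs map to the same file on disk.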
I wouldn't use robots.txt for this, though (because it reveals the protected path). If it's a temporary location that you use just for testing, use .htaccess to return 404 (not found) for everybody except you (say, based on the IP address range or user agent).
J.D.
THT
Jun 5th 2005, 10:33 am
Yeah JD, that's right.
But I don't want to hide it, I just don't want Google to index http://website1.com/site2/
I want them to index http://website2.com
So is robots.txt OK for that?
minstrel
Jun 5th 2005, 2:48 pm
Are you on a *nix server, THT?
If so, use a redirect in your .htaccess file so that all requests for http://website1.com/site2/ are sent to http://website2.com, except that I'd recommend you use http://www.website2.com rather than http://website2.com.
Yeah, I am using the www. version.
What would be the syntax for this?
minstrel
Jun 5th 2005, 3:25 pm
Try this:
Redirect 301 /site2 http://www.website2.com/
minstrel
Jun 5th 2005, 3:26 pm
Note:
The .htaccess file MUST be placed in the root directory of www.website1.com -- i.e., at http://www.website1.com/.htaccess
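Once the rule is in place, it may be worth confirming that the server really answers with a permanent redirect to the www host. The following is a rough sketch, not a definitive check: fetch_redirect and is_expected_redirect are hypothetical helper names, and the host names are the placeholders used in this thread. http.client does not follow redirects, so the raw status code and Location header can be inspected directly.

```python
import http.client
from urllib.parse import urlsplit


def fetch_redirect(host, path="/site2/"):
    """Request path from host without following redirects.

    Returns a (status, location) tuple; location is None if the
    response has no Location header.
    """
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_expected_redirect(status, location, target_host="www.website2.com"):
    """True only for a 301 whose Location points at target_host."""
    if status != 301 or not location:
        return False
    return urlsplit(location).netloc.lower() == target_host


# Example usage (needs network access and a real host name):
# status, location = fetch_redirect("www.website1.com")
# print(status, location, is_expected_redirect(status, location))
```

A 302, or a 301 that points at the non-www host, fails the check, which matches the advice above to send everything to the www. version.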
J.D.
Jun 5th 2005, 4:44 pm
Quote (THT): But I don't want to hide it, I just don't want Google to index http://website1.com/site2/. I want them to index http://website2.com. So is robots.txt OK for that?
Then hide the directory /site2 with something like .htaccess (you can do a similar thing with IIS through configuration). There's no need to redirect /site2 to the second website, since SEs will be able to access the second site through the domain name. Like I said, I wouldn't use robots.txt for this. Both websites will behave as if they are independent. In fact, unless you share some code between these websites, I would place the second website in a separate directory (given that you have to add a virtual website anyway, it shouldn't be a problem to create another directory on this machine).
J.D.
Apart from that, I'm on a *nix server as previously discussed, so no IIS.
ZuraX
Jun 6th 2005, 2:14 am
Would it be wise to also use this 301 redirect for directories used for subdomains?