Robots.txt question

Discussion in 'robots.txt' started by THT, Jun 5, 2005.

  1. #1
    I have a site www.widgets.com

    www.site2.com is actually parked at http://www.widgets.com/site2 but is a domain in its own right.

    So I have a domain parked on a subfolder of my widgets.com account.

    (I'm confusing myself.)

    I don't want Google to index "http://www.widgets.com/site2", so can I use Disallow: /site2
    in my robots.txt file without affecting www.site2.com?

    Hope that makes sense...
     
    THT, Jun 5, 2005 IP
  2. Smyrl

    Smyrl Tomato Republic Staff

    #2
    Smyrl, Jun 5, 2005 IP
  3. THT

    THT Peon

    #3
    No worries, Shannon.

    I understand robots.txt syntax; it's just because I have a second domain parked on a subfolder of the first that I'm worried :)
     
    THT, Jun 5, 2005 IP
  4. Smyrl

    Smyrl Tomato Republic Staff

    #4
    I did a similar thing when a client needed a one-page site up the next day in time for a press release. I created a page in the root directory of a site he owned and pointed a second domain to the newly created page. Things rocked along nicely for a couple of years until the owner of the parked domain gave the domain name to an artist, who linked to the page using the second domain name. We were then hit with a duplicate content penalty.

    Shannon
     
    Smyrl, Jun 5, 2005 IP
  5. J.D.

    J.D. Peon

    #5
    If I understand you correctly, you have this configuration:

    website1.com > /physical/path/webroot
    website2.com > /physical/path/webroot/site2/

    There are two ways to access website2.com (as http://website2.com/ and as http://website1.com/site2/), and you will need to protect it twice. If you want to use robots.txt for this, you'd need two of them: one in the root of website1.com for the /site2/ directory and one in the root of website2.com.

    I wouldn't use robots.txt for this, though (because it reveals the protected path). If it's a temporary location that you use just for testing, use htaccess to return 404 (not found) for everybody except you (say, using the IP address range or user agent).

    J.D.
     
    J.D., Jun 5, 2005 IP
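
The two-robots.txt setup J.D. describes can be checked offline with Python's standard `urllib.robotparser`. This is only a sketch: the file contents below are assumed examples, not anyone's real robots.txt, and `parser_for` is a hypothetical helper. The point it illustrates is that crawlers read robots.txt per host, so a `Disallow` served by website1.com does not apply to URLs requested through website2.com.

```python
from urllib.robotparser import RobotFileParser


def parser_for(lines):
    """Build a parser from the lines of one host's robots.txt (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


# Assumed content of http://website1.com/robots.txt
site1_rules = parser_for([
    "User-agent: *",
    "Disallow: /site2/",
])

# Assumed content of http://website2.com/robots.txt, which lives in the
# /site2/ directory on disk. An empty Disallow blocks nothing.
site2_rules = parser_for([
    "User-agent: *",
    "Disallow:",
])

blocked = site1_rules.can_fetch("Googlebot", "http://website1.com/site2/page.html")
allowed = site2_rules.can_fetch("Googlebot", "http://website2.com/page.html")
home_ok = site1_rules.can_fetch("Googlebot", "http://website1.com/index.html")

print(blocked, allowed, home_ok)
```

Under these assumed rules, the copy reached through website1.com/site2/ is off limits to compliant crawlers while the same files reached through website2.com stay crawlable. As J.D. notes, the rule also advertises the /site2/ path to anyone who reads the file.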
  6. THT

    THT Peon

    #6
    Yeah JD, that's right.

    But I don't want to hide it, I just don't want Google to index http://website1.com/site2/

    I want them to index http://website2.com

    So is robots.txt OK for that?
     
    THT, Jun 5, 2005 IP
  7. minstrel

    minstrel Illustrious Member

    #7
    Are you on a *nix server, THT?

    If so, use a redirect in your htaccess file so that all requests for http://website1.com/site2/ are sent to http://website2.com, except that I'd recommend you use http://www.website2.com rather than http://website2.com
     
    minstrel, Jun 5, 2005 IP
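
The redirect minstrel suggests has one catch in this setup: both host names serve the same directory, so the rule has to look at the requested host, or requests arriving through website2.com would be redirected as well. Below is a minimal Python sketch of that decision, not actual htaccess syntax. The function name `redirect_target` and the set of old host names are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumed host names for the first site; adjust to the real ones.
OLD_HOSTS = {"website1.com", "www.website1.com"}
OLD_PREFIX = "/site2/"
NEW_BASE = "http://www.website2.com/"


def redirect_target(url: str) -> Optional[str]:
    """Return the URL a 301 should point to, or None if no redirect applies."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in OLD_HOSTS:
        # Requests that already use the second domain are left alone,
        # which avoids a redirect loop.
        return None
    path = parts.path
    if path == "/site2":
        path = OLD_PREFIX  # treat the bare directory like /site2/
    if not path.startswith(OLD_PREFIX):
        return None
    target = NEW_BASE + path[len(OLD_PREFIX):]
    if parts.query:
        target += "?" + parts.query
    return target


print(redirect_target("http://website1.com/site2/about.html"))
```

On Apache the same condition is typically written with mod_rewrite, testing the `HTTP_HOST` variable before applying the rewrite rule.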
  8. THT

    THT Peon

    #8
    Yeah, I am using the www. version.

    What would be the syntax for this?
     
    THT, Jun 5, 2005 IP
  9. minstrel

    minstrel Illustrious Member

    #9
    minstrel, Jun 5, 2005 IP
  10. minstrel

    minstrel Illustrious Member

    #10
    minstrel, Jun 5, 2005 IP
  11. J.D.

    J.D. Peon

    #11
    Then hide the directory /site2 with something like htaccess (you can do a similar thing with IIS through configuration). There's no need to redirect /site2 to the second website, since search engines will be able to access the second site through its domain name. Like I said, I wouldn't use robots.txt for this. Both websites will behave as if they are independent. In fact, unless you share some code between these websites, I would place the second website in a separate directory (since you have to add a virtual website anyway, it shouldn't be a problem to create another directory on this machine).

    J.D.
     
    J.D., Jun 5, 2005 IP
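
J.D.'s alternative, hiding /site2 on the first domain and answering 404 to everyone outside a trusted IP range, comes down to a small decision rule. The Python sketch below only models that rule and is not server configuration. The function `status_for`, the host names, and the address range (a block reserved for documentation) are all placeholders.

```python
import ipaddress

# Placeholder range; replace with the addresses that should keep access.
ALLOWED = ipaddress.ip_network("203.0.113.0/24")
# Assumed host names under which /site2/ should be hidden.
HIDDEN_HOSTS = {"website1.com", "www.website1.com"}
HIDDEN_PREFIX = "/site2/"


def status_for(host: str, path: str, client_ip: str) -> int:
    """Return the HTTP status this policy would answer with."""
    hidden = host.lower() in HIDDEN_HOSTS and (
        path == "/site2" or path.startswith(HIDDEN_PREFIX)
    )
    if hidden and ipaddress.ip_address(client_ip) not in ALLOWED:
        return 404  # pretend the directory does not exist
    return 200


print(status_for("website1.com", "/site2/index.html", "198.51.100.7"))
```

Because the check is tied to the first domain's host names, the same files stay reachable through the second domain, so search engines can still index it there.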
  12. THT

    THT Peon

    #12
    Apart from that, I'm on a *nix server, as previously discussed, so no IIS.
     
    THT, Jun 6, 2005 IP
  13. ZuraX

    ZuraX Active Member

    #13
    Would it be wise to also use this 301 redirect for directories that back subdomains?
     
    ZuraX, Jun 6, 2005 IP