We recently deployed a Drupal-based site in which Drupal is the document root. In order to serve some non-Drupal content, I've had to use aliases so that Drupal doesn't try to commandeer everything. When I updated the robots.txt file, which lives under Drupal, it occurred to me that I may have a problem. robots.txt should allow/disallow bots on the basis of the web-facing URL, shouldn't it? So if I disallow something like /faq, the bot should not follow it even if /faq is aliased to a location outside the document root, or do I have this wrong somehow? I'm concerned because I'm seeing Googlebot in places it shouldn't be, and I'm wondering whether these aliases (of which I have many) are going to let bots into everything.
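To make the setup concrete, here's a rough sketch of what I mean (the paths are hypothetical, and I'm assuming an Apache-style Alias since Drupal owns the document root):

```
# Apache configuration: /faq is served from outside the Drupal document root
Alias /faq /var/www/static/faq

# robots.txt (lives under the Drupal document root, served at /robots.txt)
User-agent: *
Disallow: /faq
```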
The bots should be reading robots.txt off the root of the site; you can check that it's there by loading it in a browser at yourdomain.com/robots.txt. As long as it's there, it tells the bots what to ignore under your domain. How the server is set up to provide content for a given folder is irrelevant: the bots simply request a URL unless it matches a Disallow rule in robots.txt. You do have to rely on a bot obeying the rules, though.
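If you want to double-check which URLs a compliant crawler would skip, here's a quick sketch using Python's standard urllib.robotparser (the domain and paths are placeholders for your own):

```python
from urllib.robotparser import RobotFileParser

# Fetch the live robots.txt exactly as a crawler would.
rp = RobotFileParser()
rp.set_url("https://yourdomain.com/robots.txt")
rp.read()

# The check is purely against the requested URL path; whether that path is
# aliased to a directory outside the document root never enters into it.
for path in ("/faq", "/faq/printing", "/node/1"):
    allowed = rp.can_fetch("Googlebot", "https://yourdomain.com" + path)
    print(path, "allowed" if allowed else "disallowed")
```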