Silly question, but I just wanted to check what the general consensus is. If I disallow a certain page for a spider like Googlebot (with robots.txt), but I use a PHP include statement to pull its contents into another page, will a spider still index that other page? I am assuming it will still be indexed, since a spider cannot "see" server-side instructions like that, but maybe someone out there knows better than me?
I don't see how the bot would be aware, since that is a server-side action. But I think the only way to know for sure is to check Google's cache of that page.
The include will not be spidered by the bot unless you actually link to the file you are including. In other words, the PHP statement include "myfile.php"; does not expose the file to the bot, whereas an HTML link such as <a href="myfile.php">Link to include file.</a> does. Without a link it won't be spidered, and it will appear as if there was only one web page on the server. Your page + included page = your page (with more content).
I am no expert, but as far as I can see it will not index that "other page". If you still have doubts, you could add "nofollow" to your link to the other page, though that should not be necessary. You can also check your robots.txt in Google Sitemaps.
Hi, it is not a matter of consensus, just logic. Page A (the disallowed page) will not be indexed; page B (the page that includes it) will be indexed. Jean-Luc
Hmm. This is a bit of a brain buster. PHP includes aren't seen by users. If I include a page within a page, it will just look like one page. Both are indexed, not as separate pages but as one, usually under the "mother" page. Now, if you exclude a page from Googlebot, it won't be indexed by itself, but since it is included and the result isn't seen as a new page, its content will be indexed. It's weird. Say you have pageA.php and pageB.php (excluded in robots.txt). If the spider goes to pageB.php, it won't be indexed as a page, but if it is included in pageA.php, the content of pageB.php will be indexed and will only show up as pageA.php. You might get something like this in a Google search result: "Page A title blah blah. Page A content blah blah blah. Page B content blah blah. http://yourdomain.com/pageA.php". Hope that made sense.
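For anyone who wants to see the robots.txt side of this in action, here is a rough sketch in Python using the standard library's robots.txt parser. The pageA.php / pageB.php names and the yourdomain.com host are taken from the post above; the robots.txt contents and the can_crawl helper are made up for illustration. Note that this only shows which URLs a well-behaved bot may request. It says nothing about what Google actually puts in its index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that excludes only pageB.php for Googlebot.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /pageB.php
"""


def can_crawl(path, agent="Googlebot"):
    """Return True if the given agent may request the path (illustrative helper)."""
    parser = RobotFileParser()
    # Feed the rules in directly instead of fetching them over the network.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "http://yourdomain.com" + path)


# pageA.php includes pageB.php on the server, so the bot receives the
# combined HTML when it requests pageA.php, which it is allowed to do.
print(can_crawl("/pageA.php"))
# Requesting pageB.php directly is what robots.txt forbids.
print(can_crawl("/pageB.php"))
```

The rule matches on the requested URL only, so the bot has no way of knowing that part of the HTML it got for pageA.php came from an excluded file.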