For the last few weeks my directory has been losing indexed pages on Google. Out of 3500+ pages indexed previously, only around 125 are indexed now. I have tried a lot to get the pages indexed, from getting quality links to submitting an XML sitemap through Google Webmaster Tools. And yes, the directory looks penalized, as it does not rank even for its own title. But I have checked many other penalized and non-penalized directories: many penalized directories still have a good number of indexed pages, and some non-penalized directories have also lost indexed pages. My question is: why is Googlebot doing this? I am currently using phpLD 3.1. Here is a screenshot, taken from my latest visitors list, of Googlebot crawling a few hours ago.
Join the club: http://72.14.209.104/search?sourceid=navclient-ff&ie=UTF-8&q=cache:http://www.zorg-links.com/ http://72.14.209.104/search?sourcei....zorg-links.com/bid_for_position_directories/ Did you implement the PHP code to stop Google from indexing proxied URLs on your site? I think that was the problem. I removed it from mine, and Google seems to be slowly coming back.
Those URLs are, and should be, 404s. Obviously the bot has crawled an incorrect URL, so look at the 404 and don't worry about it. Also, check your sitemap to be sure you aren't giving those URLs to Google.
If you have made any recent changes (a new mod or something), check whether that's causing the trouble. Take a backup, undo the changes, and see whether anything improves.
This is actually a flaw in either phpLD or the template design. I can't remember which, but I do remember having the same problem and fixing it on Directory Share. I will use Allinfodir as an example to explain what happens. Let's say someone gives the owner a link to his Games category. The link should look like this:
allinfodir.com/Games/
But the linker makes a mistake and links to:
allinfodir.com/Game/
Now copy and paste that URL into your browser. You will get a 404 header and will be shown the homepage of the site, but mouse over any of the category links and you will see that they have /Game/ included, like so:
allinfodir.com/Game/
allinfodir.com/Game/Arts_and_Humanities/
allinfodir.com/Game/Arts_and_Humanities/Fashion_Houses/
allinfodir.com/Game/Arts_and_Humanities/Animation/
And so on, creating a whole subset of 404s which Google crawls through and reports. This will work for anything you type in. Try:
allinfodir.com/this-dont-exist/
allinfodir.com/anything-here/
So bad inlinks, renaming a category that has already been crawled, and so on will all cause these 404s to appear in Google Webmaster Tools. Note: I deliberately didn't link to these URLs. Please do not quote this post and activate these 404s! This problem exists on most directories: Alive, Aviva, DirJournal, Directory Dump, etc. As I said, I can't remember exactly how I fixed it, but I will see if I can find what I did and post the solution here.
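The propagation described above is ordinary relative-URL resolution, and it can be reproduced offline. This is only a sketch: it assumes the category links in the template are relative hrefs such as Arts_and_Humanities/, and it uses the placeholder host directory.example and a made-up helper name, resolve_links, so that no real 404 URLs are activated.

```python
from urllib.parse import urljoin


def resolve_links(page_url, hrefs):
    """Resolve each href against the page URL, the way a browser or crawler would."""
    return [urljoin(page_url, href) for href in hrefs]


# Relative hrefs as they might appear in the homepage template (assumed).
category_hrefs = ["Arts_and_Humanities/", "Arts_and_Humanities/Animation/"]

# The real homepage: links resolve to the intended category URLs.
good = resolve_links("http://directory.example/", category_hrefs)

# A mistyped URL that is served the homepage markup: every link inherits /Game/.
bad = resolve_links("http://directory.example/Game/", category_hrefs)

for url in good + bad:
    print(url)
```

The second list is the "whole subset of 404s": each category link now carries the bogus /Game/ prefix, and every one of those pages in turn produces links with the same prefix.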
Thanks, Rob (an0n) and Silky, for your advice. I reckon it's another problem set up by Google to disturb directories. I started making some corrections yesterday, and from the way I see Googlebot crawling my pages in cPanel, I hope to have 6K+ indexed pages by the end of November. Please wish me luck.
If you were referring to Zorg Directory, then I'm afraid that it does: zorg-directory.com/sdfs/ Mouse over the category structure. I have seen this problem on most versions of phpLD, including 3.2.
Hi Silky. Can you please check my PR6 directory, if you know it? Otherwise, please tell me whether I should send the URL by PM. This is an industry-wide problem, not only mine. If we can solve it, the industry will be a better place again.
As long as those pages return a 404, I don't think there is a problem. I don't think Google follows URLs in the 404 page. One old version of phpLD had a problem with the setting to force 404/200 OK headers; I don't remember the exact issue. The problem with this propagation is that the URLs are relative. They should be absolute; then, if there is a problem with one URL, you at least won't generate all those wrong URLs. There is a more serious flaw, as the URLs in this case return 200 OK headers and can be followed by the crawlers: http://forums.digitalpoint.com/showthread.php?t=295586
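Whether a bogus URL answers with 404 or with 200 OK can be checked directly instead of guessed. A minimal sketch using only the Python standard library; the helper names status_of and is_soft_404 are made up for illustration and are not part of phpLD.

```python
import urllib.error
import urllib.request


def status_of(url, timeout=10):
    """Return the final HTTP status code for url, without raising on 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for error statuses; the code is still available here.
        return err.code


def is_soft_404(url):
    """True when a URL that should not exist answers 200 OK instead of an error."""
    return status_of(url) == 200
```

Call is_soft_404 with a URL on your own directory that you know does not exist. Note that urlopen follows redirects, so a nonexistent URL that redirects to the homepage is also reported as 200.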
OK, I just tested this on another directory and it is a template issue. Open up your main.tpl (this file is usually the same for most templates with only a few minor changes, which explains why the problem is so widespread). Locate the lines which start:
<a href="{if $smarty.const.ENABLE_REWRITE}
Just add a slash in them, like so:
<a href="/{if $smarty.const.ENABLE_REWRITE}
Then retry a 404 URL and you shouldn't have the problem anymore. Sorry I can't give line numbers, as they vary from template to template. HTH
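Since the line numbers differ per template, a small script can locate and patch the affected lines. This is a rough sketch, not a tested phpLD tool: the function names are made up, the sample line uses ... as a stand-in for whatever your template has after the {if}, and you should back up main.tpl before writing anything back.

```python
import re

# Matches href=" only when it is directly followed by the Smarty tag from the
# post, i.e. links that do not yet have the leading slash.
PATTERN = re.compile(r'<a href="(?=\{if \$smarty\.const\.ENABLE_REWRITE\})')


def find_unpatched_lines(template_text):
    """Return 1-based line numbers whose links still lack the leading slash."""
    return [
        number
        for number, line in enumerate(template_text.splitlines(), start=1)
        if PATTERN.search(line)
    ]


def add_leading_slash(template_text):
    """Insert '/' right after href=" on every matching link."""
    return PATTERN.sub('<a href="/', template_text)


# Stand-in for the contents of main.tpl (illustrative only).
sample = (
    "<h1>Categories</h1>\n"
    '<a href="{if $smarty.const.ENABLE_REWRITE}...{/if}">Category</a>\n'
)

print(find_unpatched_lines(sample))
print(add_leading_slash(sample))
```

Running the patch twice is harmless, because an already patched line no longer matches the pattern. One caveat: a leading slash points at the domain root, so this only fits a directory installed at the root of its domain, not one living in a subfolder.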
Umm, not strictly true, because the 404 page is a replica of the homepage, including meta tags, and a vast proportion of them have the following:
<meta name="robots" content="index, follow" />
Anyway, the fix worked for me. I haven't had any 404s reported by Google for months.
Yes, but I think that when there is a 404 header the crawler won't continue reading the file at all. Anyway, that's not the issue, as absolute paths do the job.
Thanks, Silky mate, it was exactly as you mentioned. I have added the /. I hope it won't disturb my ongoing project of restructuring all the URLs for uniqueness.