Google changed up their sitemap setup a bit. For one, they have a verification option where you put a blank file in the same spot as your sitemap, and you can then get statistics and feedback on any problems they have with your sitemap. It also gives you information about pages it found outside your sitemap that it had issues with.
Once you place the right file on your server, it will let you see a list of any errors Google encountered while spidering your site (even pages that aren't in your sitemap). I have to say I think it's a great touch.
I noticed it too and gave it a try. It gives an HTTP error for the following file: /%5Cindex.html. But what does %5C mean?
Thank you. How can I solve this? Now it shows as /\index.html, but I cannot see anywhere where this is located.
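For what it's worth, %5C is just the URL-encoded form of a backslash, which is why the report now reads /\index.html. One plausible source is a link somewhere on the site that was written with a backslash; that is an assumption, the thread does not confirm it. A minimal PHP sketch that decodes the path and lists any href values containing a backslash (the function name findBackslashLinks and the sample markup are made up for illustration):

```php
<?php
// %5C is the percent-encoded backslash (ASCII 0x5C).
$decoded = rawurldecode('/%5Cindex.html');
echo $decoded, "\n";

// Hypothetical helper: return every href value in $html that contains a backslash.
function findBackslashLinks(string $html): array
{
    // Match href="..." or href='...' and capture the value between the quotes.
    preg_match_all('/href\s*=\s*(["\'])(.*?)\1/i', $html, $matches);
    $bad = [];
    foreach ($matches[2] as $href) {
        if (strpos($href, '\\') !== false) {
            $bad[] = $href;
        }
    }
    return $bad;
}

// Invented sample markup: the second link uses a backslash by mistake.
$sample = '<a href="/about.html">About</a> <a href="\index.html">Home</a>';
print_r(findBackslashLinks($sample));
```

Running something like this over the site's HTML files (for example with file_get_contents on each one) should narrow down which page carries the bad link, if that is indeed the cause.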
I gave this a try and got the message: I have custom 404 pages on most of my sites. An example is... http://matchtales.com/html/welcome_to_match_tales.html I checked with the host's chat help but don't necessarily trust their answer that I can change something in the HTML of the page to show a 404 status. Anyone care to enlighten me?
It would be good info to know, and I don't know. I don't see anything in the page source on legitimate 404 pages, so I'm not sure. I will see if I can find out and drop a note, because I am curious too. Edit: I found this link, http://www.thesitewizard.com/archive/custom404.shtml but it does not seem to specifically mention anything that would make it a 404 versus anything else. Maybe it is the .htaccess setup that does it, but I am really clueless.
The more the merrier! I suspect those status codes are generated by the server and my friendly chat help brushed me off. I'll try the next rung up the host support ladder: a trouble ticket.
I also got this error for one of my sites. If someone knows how to fix it .. it would be great. All the best, SWD
If you do the check, Google sends two probes to your server. These are from my logs:
crawl-66-249-65-173.googlebot.com - - [30/Aug/2005:17:32:05 +0200] "HEAD /GOOGLE**********.html HTTP/1.1" 200 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
crawl-66-249-65-173.googlebot.com - - [30/Aug/2005:17:32:05 +0200] "HEAD /GOOGLE404probe*******.html HTTP/1.1" 404 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
For the first, your server needs to return a 200 code, the normal HTTP "OK" code. For the second, your server needs to return a 404 "page not found" code; the request looks like /GOOGLE404probe***randomstuff***.html
If you have custom 404 pages, the second probe can go wrong, and you will get that error if those custom 404s don't give a 404 HTTP code. These codes are in the HTTP headers, something not visible in a browser. There are tools to check them, like http:// www. seoconsultants .com /tools/headers.asp
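To make the two-probe check concrete, here is a rough PHP sketch that reads access-log lines like the ones above and compares the status the server gave with the status each probe needs. The helper names parseLogLine and expectedStatus are invented for this example, and the pattern assumes the Apache combined log format shown in the post:

```php
<?php
// Hypothetical helper: pull the request path and status code out of one
// combined-format log line. Returns null if the line does not match.
function parseLogLine(string $line): ?array
{
    // Looks for: "METHOD path PROTOCOL" status
    if (preg_match('/"[A-Z]+ (\S+) [^"]*" (\d{3}) /', $line, $m) !== 1) {
        return null;
    }
    return ['path' => $m[1], 'status' => (int) $m[2]];
}

// The GOOGLE404probe request must get a 404; the verification file must get a 200.
function expectedStatus(string $path): int
{
    return stripos($path, 'GOOGLE404probe') !== false ? 404 : 200;
}

// The two log lines quoted in the post above.
$lines = [
    'crawl-66-249-65-173.googlebot.com - - [30/Aug/2005:17:32:05 +0200] "HEAD /GOOGLE**********.html HTTP/1.1" 200 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    'crawl-66-249-65-173.googlebot.com - - [30/Aug/2005:17:32:05 +0200] "HEAD /GOOGLE404probe*******.html HTTP/1.1" 404 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
];

foreach ($lines as $line) {
    $hit = parseLogLine($line);
    if ($hit === null) {
        continue; // not a request line we understand
    }
    $verdict = $hit['status'] === expectedStatus($hit['path']) ? 'ok' : 'WRONG';
    printf("%s returned %d: %s\n", $hit['path'], $hit['status'], $verdict);
}
```

A server with a custom error page that answers 200 for everything would show the second line as WRONG.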
Just did the check with your site:
SEO Consultants Directory Check Server Headers - Single URI Results
Current Date and Time: 2005-08-30T17:53:58-0800
#1 Server Response: http://www. matchtales .com/html/some_grabage_here_to_have_a_404
HTTP Status Code: HTTP/1.1 200 OK
Date: Wed, 31 Aug 2005 00:53:55 GMT
Server: Apache/1.3.31 (Unix) PHP/4.3.11 mod_ssl/2.8.18 OpenSSL/0.9.6b FrontPage/5.0.2.2635 mod_throttle/3.1.2
X-Powered-By: PHP/4.3.11
Connection: close
Content-Type: text/html
HTTP Status Code: HTTP/1.1 200 OK
Should have been HTTP Status Code: HTTP/1.1 404 Page not found, at least for all requests starting with /GOOGLE404probe
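The same check can be done from PHP instead of an online tool. A small sketch, assuming all you care about is the first status line of the response; statusCode and returns404 are made-up helper names, and the live call at the bottom uses a placeholder URL and is left commented out:

```php
<?php
// Hypothetical helper: extract the numeric status code from a status line
// such as "HTTP/1.1 200 OK". Returns null if the text is not a status line.
function statusCode(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m) !== 1) {
        return null;
    }
    return (int) $m[1];
}

// True only when the server really answered 404 for a missing page.
function returns404(string $statusLine): bool
{
    return statusCode($statusLine) === 404;
}

var_dump(returns404('HTTP/1.1 200 OK'));        // the result reported for the site above
var_dump(returns404('HTTP/1.1 404 Not Found')); // what the probe needs to see

// Against a live server (not run here): get_headers() returns the response
// headers as an array whose first element is the status line.
// $headers = get_headers('http://www.example.com/some_missing_page');
// var_dump(returns404($headers[0]));
```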
Just checked it. In PHP it is done like this; just add this to your passthru.php:
if(eregi('GOOGLE404probe',$_SERVER['REQUEST_URI'])){
    header('HTTP/1.1 404 File not found');
    exit;
}
I noticed this new feature today, too. The error they gave me was a page not found when trying to access the robots.txt file. I didn't have one up at that time, but maybe they want you to have one when using sitemaps.
I have the same problem with one of my sites. I think mine is because Mambo handles all page requests. Not sure how to fix it.
Well, I added the code to the existing code like this:
<?php
if (!function_exists('file_get_contents')) {
    function file_get_contents($url) {
        $handle = fopen($url, 'r');
        $string = fread($handle, 4096000);
        fclose($handle);
        return $string;
    }
}
include ('ad_network_222.php');
echo preg_replace ("/<\/body>/i", '<br><div class="main" style="padding-left:12px; padding-right:12px">'. $ad_network . '</body>', file_get_contents(str_replace ('../', '', $_REQUEST['file'])));
if(eregi('GOOGLE404probe',$_SERVER['REQUEST_URI'])){
    header('HTTP/1.1 404 File not found');
    exit;
}
?>
I still get the error on verify from Google. The host doesn't seem to understand. This is the host's response: ... Not really the issue I'm trying to address, but I've wondered before if there might be a fix for this one too.
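One possible reason the added check has no effect (an assumption, since the thread doesn't show how passthru.php is wired up): in PHP, header() only works before any output has been sent, and in the script above the echo runs first. A sketch with the probe test moved to the top. The name isGoogle404Probe is made up, the rest of the script is left out, and stripos() plus the ?? operator are used here because eregi() was removed in PHP 7, so this version needs PHP 7 or later:

```php
<?php
// Hypothetical helper: true when the request is Google's 404 probe.
function isGoogle404Probe(string $requestUri): bool
{
    return stripos($requestUri, 'GOOGLE404probe') !== false;
}

// This must run before anything is echoed; once output has started,
// the status line can no longer be changed.
// REQUEST_URI is not set on the command line, hence the fallback to ''.
$requestUri = $_SERVER['REQUEST_URI'] ?? '';
if (isGoogle404Probe($requestUri)) {
    header('HTTP/1.1 404 Not Found');
    exit;
}

// ... the existing include and echo of the page would follow here ...
```

This only helps if passthru.php is actually the script that answers requests for missing URLs; if the server's error page is something else, the check would have to go there instead.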