I logged into Google Webmaster Tools for my new site, http://theguitarresource.com, and got a notice that says "URL restricted by robots.txt" for the URL "http://theguitarresource.com", which is my home page. I assume this is bad. Here is what my robots.txt file looks like:

    User-agent: *
    # disallow all files in these directories
    Disallow: /cgi-bin/
    Disallow: /z/j/
    Disallow: /z/c/
    Disallow: /stats/
    Disallow: /dh_
    Disallow: /about/
    Disallow: /contact/
    Disallow: /tag/
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /contact
    Disallow: /manual
    Disallow: /manual/*
    Disallow: /phpmanual/
    Disallow: /category/

    User-agent: Googlebot
    # disallow all files ending with these extensions
    Disallow: /*.php$
    Disallow: /*.js$
    Disallow: /*.inc$
    Disallow: /*.css$
    Disallow: /*.gz$
    Disallow: /*.wmv$
    Disallow: /*.cgi$
    Disallow: /*.xhtml$
    # disallow all files with ? in url
    Disallow: /*?*

    # disable duggmirror
    User-agent: duggmirror
    Disallow: /

    # allow google image bot to search all images
    User-agent: Googlebot-Image
    Disallow:
    Allow: /*

    # allow adsense bot on entire site
    User-agent: Mediapartners-Google*
    Disallow:
    Allow: /*

So I'm not sure what to change, or why it says my home page is restricted. Please help!
First off, I'd recommend removing the line "Disallow: /*.php$". Everything in WordPress is PHP, so that could disallow most pages from being indexed.
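If you want to see which URLs a pattern like "Disallow: /*.php$" would actually catch, here is a rough Python sketch. It assumes Google-style wildcard matching, where "*" stands for any run of characters and a trailing "$" pins the rule to the end of the URL. The function names and the sample paths are made up for illustration; this is not Google's own matcher.

```python
import re


def rule_to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn a robots.txt path pattern into a regex (assumed Google-style semantics)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then let each '*' match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"  # a trailing '$' means the rule must match to the end of the path
    return re.compile(regex)


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if the rule pattern applies to the given URL path."""
    if not pattern:
        return False  # an empty "Disallow:" value matches nothing
    # re.match anchors at the start, so rules behave as prefix matches.
    return rule_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    # Hypothetical sample paths, chosen only to exercise the rule.
    for path in ["/", "/some-post/", "/index.php", "/index.php?p=1"]:
        print(path, rule_matches("/*.php$", path))
```

Under those assumptions the rule only catches paths that literally end in ".php", so it would hit "/index.php" but not "/" or a rewritten permalink such as "/some-post/".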
This line is a really bad one too: "Disallow: /category/". All the links in your right-hand navigation menu for your subpages are in the /category/ tree. As for the .php line, I see what they have done, and it's not too bad: the URLs are rewritten to just domain.com/subdir/page, so they added the .php exclusion to stop domain.com/subdir/page/index.php from being indexed. All up, it's a pretty sloppy robots file. I'd remove it all and just exclude crucial things like includes, admin, etc., then use .htaccess to 301-redirect the index.php URLs to the subfolder. That isn't done at the moment: if you go to any subpage and add index.php to the URL, the same page reloads with index.php still in the browser.
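To illustrate the /category/ point, here is a small sketch using Python's standard urllib.robotparser module. It feeds a cut-down copy of the rules to the parser and asks whether a crawler may fetch a category URL and the home page. The path "/category/lessons/" and the bot name "AnyBot" are hypothetical, picked only for the example.

```python
from urllib.robotparser import RobotFileParser

# Cut-down copy of the "User-agent: *" group from the original file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
Disallow: /wp-admin/
"""

SITE = "http://theguitarresource.com"


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical category URL: blocked by the plain prefix rule.
    print(is_allowed(ROBOTS_TXT, "AnyBot", SITE + "/category/lessons/"))  # False
    # The home page is not covered by either rule.
    print(is_allowed(ROBOTS_TXT, "AnyBot", SITE + "/"))  # True
```

Because "Disallow: /category/" is a plain prefix rule, everything under that tree is off limits to any crawler that falls back to the "*" group, which would include the navigation links mentioned above.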
I agree with sweetfollow. It's a really sloppy robots file. Is that the one that came with WordPress? If it is, I'm going to pay better attention to it.
WP did not come with one, but I got it as an example from the WP site. The thing is, I really don't know anything about robots.txt files, so I just used the one they provided. Does anyone know a better place to get one? Also, I have a feeling they disallowed /category/ so that the site won't have duplicate content between the category pages and the individual post pages, but I could be wrong.
Is this better?

    User-agent: *
    Disallow: /wp-content/
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /wp-
    Disallow: /feed/
    Disallow: /trackback/
    Disallow: /cgi-bin/

    User-agent: Googlebot
    # disallow all files ending with these extensions
    Disallow: /*.php$
    Disallow: /*.js$
    Disallow: /*.inc$
    Disallow: /*.css$
    Disallow: /*.gz$
    Disallow: /*.wmv$
    Disallow: /*.cgi$
    Disallow: /*.xhtml$

    # allow google image bot to search all images
    User-agent: Googlebot-Image
    Disallow:
    Allow: /*

    # allow adsense bot on entire site
    User-agent: Mediapartners-Google*
    Disallow:
    Allow: /*
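One thing worth checking in a file laid out like this: a crawler that has its own named group generally follows only that group and ignores the "User-agent: *" rules. The sketch below uses Python's standard urllib.robotparser on a shortened copy of the file to show the effect. The bot name "SomeOtherBot" is made up. Note also that the stdlib parser treats "*" and "$" in paths literally, so the wildcard lines do nothing here; the example is only about which group gets picked.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the revised file: a generic group plus a Googlebot group.
REVISED = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

User-agent: Googlebot
Disallow: /*.php$
Disallow: /*.js$
"""

parser = RobotFileParser()
parser.parse(REVISED.splitlines())

url = "http://theguitarresource.com/wp-admin/"

# A crawler with no group of its own falls back to "User-agent: *".
generic_ok = parser.can_fetch("SomeOtherBot", url)

# Googlebot has its own group, so the /wp-admin/ rule above is not consulted.
googlebot_ok = parser.can_fetch("Googlebot", url)

print("SomeOtherBot may fetch /wp-admin/:", generic_ok)
print("Googlebot may fetch /wp-admin/:", googlebot_ok)
```

If the intent is to keep Googlebot out of /wp-admin/ and the other directories too, one option would be to repeat those Disallow lines inside the Googlebot group.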