OK, I have set up a WP blog on a domain I purchased that seems to have been parked in the past. Google has indexed URLs such as www.domain.com/?q=somerandomcategory, and all of these indexed pages load index.php. For example, mydomain.com/?q=test still shows the home page, so I think this is causing a duplicate content penalty. I'm wondering whether I should block those URLs in .htaccess, if that can be done, or whether this would be OK in robots.txt:

Disallow: /?q=*
According to Google, you can add this line to your robots.txt to keep Google from crawling any URL on your domain that has a ? in it:

Disallow: /*?*
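To see what that rule would and would not catch, here is a rough sketch of the matching logic in Python. It assumes Google-style matching (rules match from the start of the path, * stands for any run of characters, a trailing $ anchors the end); rule_matches is a made-up helper name, not part of any library, and other crawlers may treat wildcards differently.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query.

    Assumption: Google-style matching, where '*' is a wildcard and a
    trailing '$' anchors the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which mirrors prefix matching.
    return re.match(regex, path) is not None


# Sample paths: two with a query string, one without.
for path in ["/?q=test", "/index.php?p=12", "/2009/05/some-post/"]:
    print(path, rule_matches("/*?*", path))
```

With this logic, /*?* catches every URL that has a query string, while the narrower /?q=* only catches query strings on the home page that start with q=, which may be all that is needed for the leftover parked-domain URLs.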
Here's the robots.txt file that I use on my WordPress blog:

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /wp-includes/
Disallow: /wp-admin/
Disallow: /page/
Disallow: /comments/
Disallow: /wp-
Disallow: /xmlrpc.php
Disallow: /contact.php
Disallow: /category/
Disallow: /?p
Disallow: /about/trackback/
Disallow: /wp-register.php
Disallow: /wp-login.php
Disallow: /*?*
Here's the robots.txt file that I use on my WordPress blog:

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /wp-includes/
Disallow: /wp-admin/
Disallow: /page/
Disallow: /comments/
Disallow: /wp-
Disallow: /xmlrpc.php
Disallow: /contact.php
Disallow: /index.php
Disallow: /category/
Disallow: /about/trackback/
Disallow: /wp-register.php
Disallow: /wp-login.php
Disallow: /*?*
Thanks. Two questions, though. Wouldn't Disallow: /category/ disallow all URLs with "category" in them, e.g. domain.com/category/randompost/? If so, why would you want to do that? And is disallowing index.php basically so that domain.com and domain.com/index.php don't count as duplicates?
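On the first question: a plain Disallow rule is a prefix match, so Disallow: /category/ covers everything whose path starts with /category/, including domain.com/category/randompost/. A common reason to do that on WordPress is that category archives repeat content that already lives on the post pages, though whether that is worth blocking is a judgment call. Here is a minimal sketch that checks this with Python's standard urllib.robotparser, using only two rules from the file above; /some-post/ is a made-up URL for illustration.

```python
from urllib.robotparser import RobotFileParser

# Two prefix rules taken from the robots.txt posted above. The wildcard
# line is left out because this sketch only demonstrates prefix rules.
rules = """\
User-agent: *
Disallow: /category/
Disallow: /index.php
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)


def allowed(url: str) -> bool:
    """Return True if a generic crawler ("*") may fetch the URL."""
    return parser.can_fetch("*", url)


# /some-post/ is a hypothetical permalink, not one from the thread.
for url in [
    "http://www.domain.com/",
    "http://www.domain.com/index.php",
    "http://www.domain.com/category/randompost/",
    "http://www.domain.com/some-post/",
]:
    print(url, allowed(url))
```

In this check the bare domain stays crawlable while /index.php is blocked, which fits the reading that the index.php rule is there to keep only one copy of the home page in play.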