Hey DP, I was wondering if someone could post their robots.txt file, or give me an idea of the best robots.txt to use for our WordPress blogs. Below is what I use, and I would appreciate any tips or suggestions.
User-agent: *
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
Disallow: /rss/
Disallow: /comments/feed/
Disallow: /comments/
Disallow: /category/*
Disallow: /tag/*
But when I do site:www.domain.com in Google, it shows both my tags and my categories, and I don't want that. There is really no reason for those to be indexed, so please let me know what to do, or share any other useful tips. It also indexed my archives, like domain.com/2008/09. Any idea how to prevent this?
Mine goes like this:
# This rule means it applies to all user-agents
User-agent: *
Disallow: /wp-content/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-
Disallow: /trackback/
Disallow: /cgi-bin/
# Disallow all monthly archive pages
Disallow: /2005/0
Disallow: /2005/1
Disallow: /2006/0
Disallow: /2006/1
Disallow: /2007/0
Disallow: /2007/1

# The Googlebot is the main search bot for Google
User-agent: Googlebot
# Disallow all files ending with these extensions
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.wmv$
Disallow: /*.tar$
Disallow: /*.tgz$
Disallow: /*.cgi$
Disallow: /*.xhtml$
# Disallow Google from parsing individual post feeds and trackbacks
Disallow: */feed/
Disallow: */trackback/
# Disallow all files with ? in the URL
Disallow: /*?*
Disallow: /*?
# Disallow all archived monthlies
Disallow: /2006/0*
Disallow: /2007/0*
Disallow: /2006/1*
Disallow: /2007/1*

# The Googlebot-Image is the image bot for Google
User-agent: Googlebot-Image
# Allow everything
Allow: /*

# This is the ad bot for Google
User-agent: Mediapartners-Google*
# Allow everything
Allow: /*
Hope this is useful.
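One thing worth checking with a file laid out like this: as far as I know, Google's crawlers obey only the single most specific User-agent group that matches them, not the `*` group plus their own. If that holds, Googlebot would follow only the Googlebot section and ignore the /wp-admin/ and /wp-content/ rules at the top. Here is a rough Python sketch of that selection logic. The function name `pick_group` and the simple prefix matching are my own assumptions for illustration, not how any crawler is actually implemented.

```python
def pick_group(groups, crawler):
    """Return the User-agent token of the group a crawler would follow.

    groups:  dict mapping User-agent tokens to their rule lists
    crawler: the crawler's name, e.g. "Googlebot-Image"
    Assumption: the longest token that prefixes the crawler name wins,
    and "*" is used only when nothing more specific matches.
    """
    crawler = crawler.lower()
    best = "*"
    for token in groups:
        name = token.rstrip("*").lower()  # "Mediapartners-Google*" -> prefix
        if name and crawler.startswith(name) and len(name) > len(best.rstrip("*")):
            best = token
    return best


# The four groups from the file above, with rules shortened for brevity.
groups = {
    "*": ["Disallow: /wp-admin/"],
    "Googlebot": ["Disallow: /*.php$"],
    "Googlebot-Image": ["Allow: /*"],
    "Mediapartners-Google*": ["Allow: /*"],
}

print(pick_group(groups, "Googlebot"))        # Googlebot
print(pick_group(groups, "Googlebot-Image"))  # Googlebot-Image
print(pick_group(groups, "bingbot"))          # *
```

If you want Googlebot to stay out of /wp-admin/ as well, the safest approach is probably to repeat those Disallow lines inside the Googlebot group.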
I am using permalinks on my WordPress blog, reviewgooglechrome.com. If Disallow: /category/* is added to robots.txt, will it block all my posts? I ask because all my posts fall under the category folder.
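To see what a rule like that would catch, here is a small Python sketch of the wildcard matching Google documents for robots.txt paths: `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is a prefix match. The helper name `rule_matches` and the sample paths are made up for illustration, and other crawlers may interpret wildcards differently.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches a URL path.

    Follows Google-style rules: '*' matches any sequence of characters,
    a trailing '$' means the path must end there, otherwise the pattern
    only needs to match the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


# Posts whose permalink starts with /category/ are caught by the rule.
print(rule_matches("/category/*", "/category/news/my-post/"))  # True
# Posts that live elsewhere are not affected.
print(rule_matches("/category/*", "/my-post/"))                # False
# The trailing '*' adds nothing: a plain prefix behaves the same way.
print(rule_matches("/category/", "/category/news/my-post/"))   # True
# '$' anchors the end, so a query string escapes this rule.
print(rule_matches("/*.php$", "/index.php?x=1"))               # False
```

So if your permalink structure really puts every post under /category/, that Disallow line would block crawling of the posts too, not just the category listing pages.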
Sitemap: http://www.yoursite.com/sitemap.xml

User-agent: *
Disallow: /wp-content/
Disallow: /wp-includes/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /archives/
Disallow: /category/
Disallow: /tag/
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.php$

User-agent: All
Allow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: duggmirror
Disallow: /
This worked perfectly for me.
No buddy, not like that. If you don't want Google to index a page, you have to add a noindex, nofollow robots meta tag to that page, like this: <meta name="robots" content="noindex, nofollow" />
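If you go the meta tag route, it helps to confirm the tag is really in the HTML your theme outputs. Below is a minimal Python sketch using only the standard library's html.parser. The class and function names are hypothetical, and it only inspects markup you pass in as a string. Keep in mind that a crawler has to be able to fetch the page to see the tag, so a page that is also disallowed in robots.txt will never have its noindex read.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also arrive here by default.
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<head><meta name="robots" content="noindex, nofollow" /></head>'
print(has_noindex(page))                              # True
print(has_noindex("<head><title>x</title></head>"))   # False
```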
It can hurt you really badly. Googlebot considers these pages duplicate content, and Google may de-index your site whenever it wants, so why risk it?
Here is a new optimized robots.txt for WordPress blogs:
Sitemap: http://your-site.com/sitemap.xml

User-agent: *
Disallow: /wp-content/
Disallow: /wp-includes/
Disallow: /cgi-bin
Disallow: /cgi-bin/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /archives/
Disallow: /category/
Disallow: /category/*/*
Disallow: /tag/*
Disallow: /tag/
Disallow: /wp-*
Disallow: /login/
Disallow: */trackback
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.php$
Disallow: /*?*
Disallow: /*?
Disallow: /20*

User-agent: All
Allow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: duggmirror
Disallow: /
Hi, we are also getting duplicate content. Please help me fix this. We are using WordPress and getting two types of URLs for the same post. Samples:
/samsung-has-launched-st5000-and-st5500-digital-pocket-cameras
/samsung-has-launched-st5000-and-st5500-digital-pocket-cameras/-0346
The reason may be that at the start we used the www prefix, and afterwards WordPress defaulted to the address without www. We didn't notice this, so the pages may have been duplicated. Now we are using this form:
/samsung-has-launched-st5000-and-st5500-digital-pocket-cameras/-0346
The second problem is that our blog title is being added to the end of each URL:
http://xyz.com/sony-ericsson-new-concept-phone-with-pivotal-point/surfpk | All about Apple, Microsoft, Gadgets and more.
The part after the last slash is our blog title. It shows up at the end of the URL and gives a 404 error in webmaster forum. Please guide me on how to fix this.