I had never built a site on the WordPress platform before. But now I am on a project where WordPress plus my own hosting server is the best combo, and it is going well. The site I am building is less than a week old.

My problem is not that the site isn't indexed by Google; the problem is that more of it has been indexed than needs to be. If I create a page or post, it gets indexed within 3-5 hours of being posted. This is awesome, isn't it? But I don't know how the WordPress SEO plugins are working, because all my tag and category pages are also indexed, and they carry the same content. This creates duplicate content, which Google doesn't like, and I think it will lower the value of my content compared to my competitors.

So, to solve this, I blocked some content from being indexed in robots.txt. You can have a look at my robots.txt below. I also removed the tag and category pages from the Google index with the Remove URL tool, and that worked as I wanted.

But now Webmaster Tools shows bad health for my site. It shows messages like "Some important pages have been removed by request" and "Some important pages have been blocked by robots.txt". Guys, is this normal? I haven't seen such messages before, even though I have been doing marketing for a long time. Is there any error in my robots.txt file? Need urgent suggestions!

# Added by Link Alias Generator (LAG) module
User-agent: *
Disallow: /go/
# End LAG

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /wp-login.php
Disallow: /*wp-login.php*
Disallow: /comments
Disallow: /author
Disallow: /contact/
Disallow: */comments
Disallow: /category/*
Disallow: /category/
Disallow: /login/
Disallow: /tag/
Disallow: /tag/*
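For anyone who wants to see which URLs a file like the one above actually blocks, here is a rough sketch using Python's standard `urllib.robotparser`. A few assumptions to flag: the rules are a trimmed copy of the posted file merged into a single `User-agent: *` group (Python's parser only honors the first `*` group it sees, whereas Google combines them), the wildcard lines are left out because this parser does plain prefix matching, and the `example.com` URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Trimmed, merged copy of the posted rules (assumption: one group, no wildcards).
RULES = """\
User-agent: *
Disallow: /go/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /category/
Disallow: /tag/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Hypothetical URLs, only here to show the check.
urls = [
    "https://example.com/my-first-post/",
    "https://example.com/tag/seo/",
    "https://example.com/category/news/",
]

for url in urls:
    status = "allowed" if parser.can_fetch("Googlebot", url) else "blocked"
    print(f"{status}: {url}")
```

This only tells you what a crawler is allowed to fetch. It does not tell you what Google will keep in its index, which is a separate question.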
Yeah... don't screw with it, Google will sort this out by itself. If you want to make the pages unique, install a plugin that lets you add descriptions to the tags, categories and taxonomies. Other than that, dude, it's completely normal for your site to be indexed like this. Just let Google dance on your site; I wouldn't block anything or any robots/spiders. Google won't penalize you for it. It will just remove the content at its own pace and as it sees fit. For now those pages just count as extra links to your other pages and posts.
Thanks for this reply, buddy. So it won't affect the ranking in the SERPs, and they won't penalise me for the same content being indexed? And would it be a better decision to re-include the deindexed pages?
Nope, won't hurt a bit. They may send suggestions in your Google Webmaster Tools, but I ignore them. They tend to remove my tag pages from Google search results, but that's OK because those pages steal places in Google from my posts. Generally Google will only allow so many pages from each site, depending on variables like site age, subdomains etc.

The best way to tell Google what to index is to use a sitemap generator and then submit the sitemap in your Google Webmaster Tools. I use these WordPress plugins (search for the exact name from the Plugins install option in admin):

BWP Google XML Sitemaps - I tweak it to exclude stuff from the sitemap. I exclude Gallery Tags, Galleries and Image tags on the options page.
Udinra All Image Sitemap - use this as an image sitemap, it's gold.

Make sure you add a link in your footer or somewhere so Google and other search engines can find the sitemaps.

Note: if you're worried about tags etc. being duplicates, remove them in your options for the sitemap.
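To make the "exclude tags from the sitemap" idea concrete, here is a minimal sketch in Python of what such a filter does. It is not how either plugin works internally; the function name `build_sitemap`, the excluded prefixes and the `example.com` URLs are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, excluded_prefixes=("/tag/", "/category/")):
    """Return sitemap XML for `urls`, skipping paths under the excluded prefixes."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        path = urlparse(url).path
        if path.startswith(tuple(excluded_prefixes)):
            continue  # leave tag/category archives out of the sitemap
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical site URLs, for illustration only.
pages = [
    "https://example.com/hello-world/",
    "https://example.com/about/",
    "https://example.com/tag/seo/",
    "https://example.com/category/news/",
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Leaving a URL out of the sitemap doesn't block it; it only stops you from actively suggesting it to search engines, which fits the "don't block anything" advice above.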