Hi, can anyone please tell me how to stop Google from indexing unwanted, automatically script-generated pages? Also, how do I de-index such pages that are already indexed? Any help will be appreciated. Regards WM
Add a special character to the link and disallow it in robots.txt. Since our shopping cart links all contain a question mark, I use:

Disallow: /*?

wiz
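For anyone copying that, the full robots.txt block would look something like this (a minimal sketch; User-agent: * targets all crawlers, and the * wildcard in the path is supported by Googlebot):

User-agent: *
# Block any URL that contains a query string (e.g. cart links)
Disallow: /*?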
A 301 redirect is definitely the way to go, especially if you have inbound links to the unwanted pages. That will preserve most of your rank and transfer it to the actual page.
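If you go the 301 route on Apache, a minimal .htaccess sketch would be something like this (the paths here are hypothetical; swap in your own unwanted URL and its real counterpart):

# Send one generated page permanently to the real page
Redirect 301 /old-script-page.php /real-page/

# Or, for pages generated with query strings, match them with mod_rewrite
# (sessionid and product.php are made-up names for illustration)
RewriteEngine On
RewriteCond %{QUERY_STRING} ^sessionid=
RewriteRule ^product\.php$ /product/? [R=301,L]

The trailing ? in the RewriteRule substitution drops the original query string so the redirect target stays clean.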
You can do that in two ways:
1. With robots.txt.
2. With a "noindex" robots meta tag on any page you don't want indexed (and "nofollow" on links or images you don't want crawled).
Identify the pages in Google Webmaster Tools and use the "noindex" and "nofollow" robots meta tags to completely remove them from Google's index.
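If the script-generated pages are awkward to edit, the same directive can also be sent as an HTTP header instead of a meta tag. A minimal Apache sketch, assuming mod_headers is enabled and a hypothetical print.php as the generated page:

# Serve a noindex, nofollow header for this file instead of editing its HTML
<Files "print.php">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>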
You have two easy ways to do this:
1. Go to your root folder (there should be a robots.txt file there) and add this line, using the page or folder name you want kept out of search engines (a concrete sketch follows below):
Disallow: /your-page-or-folder-name
2. In your page's meta section, write:
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
Hope this will help you out. Enjoy...
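To make option 1 concrete, here is a sketch of a complete robots.txt, using a hypothetical /cart/ folder and print.php page as the things to hide:

User-agent: *
# Keep crawlers out of the cart directory and the print view
Disallow: /cart/
Disallow: /print.php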
Two methods I know for de-indexing pages and stopping crawlers from crawling them:

1. Robots.txt
User-agent: * (or the name of a specific bot; see the example below)
Disallow: /file-name
If you want to block a particular directory, Disallow: /directory-name/ will work.

2. Meta robots tag
<meta name="robots" content="noindex, nofollow" />
<meta name="robots" content="index, nofollow" />
<meta name="robots" content="noindex, follow" />
(The first blocks both indexing and link-following; the second allows indexing but not link-following; the third keeps the page out of the index while still following its links.)
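As an example of naming a specific bot in method 1, this sketch (with a hypothetical /search-results/ directory) blocks only Google's crawler while leaving other bots alone:

# Applies only to Googlebot; other crawlers ignore this block
User-agent: Googlebot
Disallow: /search-results/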
How can you call yourself "WEBMASTER" if you can't answer your own basic questions? As said above, a 301 is definitely the way to go if you have inbound links to the unwanted pages; it will preserve most of your rank and transfer it to the actual page. As far as blocking goes, you can use robots.txt or a robots meta tag on your page with NOINDEX, NOFOLLOW.
Here are two easy and fast ways to stop search engines from indexing your site:

1st - Use a specific meta tag
For each page that you don't want to appear in search engine results, add a single <meta> tag for robots (not a description, not keywords, just the robots tag):
<meta name="robots" content="noindex,nofollow,noarchive" />
Put that in the <head> of each page, and you're telling search engines not to index the page, not to follow any links on it, and not to archive it.

2nd - Create a robots.txt file
If the pages are in a separate directory, you can also block search engines with a robots.txt file. Create a text file and list all of the directories that you want to protect:
User-agent: *
Disallow: /nameofdirectory
Disallow: /anothernameofdirectory
Do that for every directory you want blocked, then save the file as robots.txt and upload it to the root directory. Search engine bots will hit your robots.txt file, see which directories you don't want them in, and skip them.

So there you go. Two little things that can save you from the problem. Pick whichever you prefer and have fun!