Hello, folks! I have a page that I don't want indexed. I added <meta name="robots" content="no index,no follow"> to the page, but it still got indexed... What would be the easiest way to remove it from Google and prevent it from getting indexed again? Thanks, Alex
Robots.txt

User-Agent: *
Disallow: /page-name.html
Disallow: /page-name2.html
Allow: /

That would probably be the best way. Add another Disallow: rule on a new line for every page you don't want indexed! Or use Disallow: /subdirectory/ to block a whole directory.

Cheers
James
Sorry, but I am a complete idiot when it comes to... I am ashamed to admit, even basic HTML... So where exactly do I add this? Or do I create a robots.txt file with the content:

User-Agent: *
Disallow: /page-name.html
Allow: /

and then upload it to the server? Thank you!
There has to be a file named robots.txt at the root of the server. If there isn't one, you will have to create it with a text editor. Basically: start a new file, list the pages or directories you do not want indexed, and copy the file to the root of the server. Well-behaved search engines follow the instructions in robots.txt.
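If it helps, the file-creation step above can be sketched in a few lines of Python (just a sketch: "page-name.html" is a placeholder for whichever page you want kept out of the index):

```python
# Write a minimal robots.txt like the one described in this thread.
# "page-name.html" is a placeholder; substitute your real page.
rules = (
    "User-Agent: *\n"
    "Disallow: /page-name.html\n"
    "Allow: /\n"
)

with open("robots.txt", "w") as handle:
    handle.write(rules)  # then upload this file to the web root
```

The result has to end up reachable at the top level of your site (yoursite.com/robots.txt), not in a subfolder, or crawlers won't find it.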
Yep! You got it. Here ya go, right-click and save as: http://www.uvrx.com/alexs464.txt Change page-name.html to the page you want blocked, rename the file to robots.txt, then upload it to the root directory. Cheers James
Yeah, me too, and I'm watching this thread with interest. I had the same question and I am just as untechie as you are.
You should only do it this way if you are sure you do not want the page indexed anytime within the next 6 months. You have to be very careful doing that: if you accidentally remove a main folder, you end up wiping your whole site out of the index for a very long time. "Not recommended." James
One more question: for the page name, do I have to type http://www.xyz.com/mypage.html, or can I simply type mypage.html?
Ah? From the first post, it sounds like he really doesn't want that page indexed... Maybe it's a download page or something. But the removal tool is per URL, isn't it? It only removes http://domain.com/theurl, not all of http://domain.com. Correct me if I'm wrong... I've never actually removed a URL. Why would we? If the page is something like a download page, we can password-protect it, can't we?
I do use password protection for other pages. But this particular page I don't want to password-protect. I just want it not indexed for a few weeks, because later on I will change the content and then I WOULD want it indexed.
Ah, I see... Then you SHOULD NOT remove it via Google Webmaster Tools... But if it's already indexed, people will still find it anyway, "accidentally"...
Thanks! If the page is in the root folder, then it's just like in the robots.txt example:

Disallow: /page-name.html

If the page is in a subfolder, then:

Disallow: /subfolder-name/page-name.html

If you wrote Disallow: /subfolder/ and did not supply a page-name.html, you would block all pages in that subfolder. You can also include this on the last line (include the full address):

Sitemap: http://www.yourwebsite.com/sitemap.xml

That will tell the robots where to find your sitemap. So in the end it will look like this:

User-Agent: *
Disallow: /page-name.html
Allow: /
Sitemap: http://www.yourwebsite.com/sitemap.xml

Cheers
James
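If you want to double-check the finished file before uploading it, Python's standard library can parse robots.txt rules directly. A small sketch (the rules mirror the example above; example.com is just a stand-in domain):

```python
from urllib.robotparser import RobotFileParser

# The robots.txt contents from the example in this thread.
rules = """\
User-Agent: *
Disallow: /page-name.html
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The blocked page should be disallowed for any crawler ("*"),
# while every other page stays fetchable.
print(parser.can_fetch("*", "http://www.example.com/page-name.html"))   # False
print(parser.can_fetch("*", "http://www.example.com/other-page.html"))  # True
```

Getting a False for the page you meant to block is a quick sanity check that the path in the Disallow: line actually matches.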
If it's already indexed, you could add a redirect until it is ready for primetime: in the head section of the page, add:

<meta http-equiv="refresh" content="0; url=http://www.domain.com/otherpage.html">
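A quick way to sanity-check that the redirect tag actually made it into the page's head is to parse the page source. A sketch using Python's standard-library HTML parser (the page source here is just a stand-in):

```python
from html.parser import HTMLParser

# Hypothetical page source containing the meta refresh from the example above.
page = """<html><head>
<meta http-equiv="refresh" content="0; url=http://www.domain.com/otherpage.html">
</head><body></body></html>"""

class MetaRefreshFinder(HTMLParser):
    """Collect the url= target of every <meta http-equiv="refresh"> tag."""
    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and attr_map.get("http-equiv", "").lower() == "refresh":
            content = attr_map.get("content", "")
            if "url=" in content:
                self.targets.append(content.split("url=", 1)[1])

finder = MetaRefreshFinder()
finder.feed(page)
print(finder.targets)  # ['http://www.domain.com/otherpage.html']
```

If the list comes back empty, the tag didn't land inside the page the way you intended.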