Hi, one of my websites has 500+ pages, and all of them are currently indexed in Google and other search engines. For certain reasons I want only the home page (index page) to appear in Google; I don't want any other pages indexed. What do I need to write in the robots.txt file? Please make sure no page other than the main index page shows up.
I am not sure, but you can try this and see if it works out:

User-agent: *
Allow: /index.html (or your home page's URL/extension)
Disallow: /
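You can sanity-check rules like these before deploying them. A minimal sketch using Python's standard-library `urllib.robotparser` (the `example.com` URLs are placeholders; note that this parser applies rules in order, while Google uses longest-match precedence, so results for edge cases such as the bare "/" URL may differ between crawlers):

```python
import urllib.robotparser

# The robots.txt proposed above: allow only the index page, block the rest.
rules = """\
User-agent: *
Allow: /index.html
Disallow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# The index page matches the Allow line and stays crawlable.
print(rp.can_fetch("*", "https://example.com/index.html"))   # True
# Any other page falls through to "Disallow: /" and is blocked.
print(rp.can_fetch("*", "https://example.com/other-page.html"))  # False
```

One caveat: the bare root URL "https://example.com/" matches only "Disallow: /" here, so if your home page is also served at "/" you may need an extra rule (Google, for instance, supports "Allow: /$").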
It is safer, but it needs some extra work: add this tag inside the head of every page that you don't want indexed:

<title>...</title>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
</head>

But use it with caution: if your inner pages get deindexed, your main page will lose the link juice it receives from them, and that can hurt its ranking.
Do it like this:

User-Agent: *
Disallow: /the-page-you-don't-want-crawled
Disallow: /another-page-you-don't-want-crawled

Put the path right after the "/". For example, if you have abc.com/aap.php and you don't want it indexed, write "Disallow: /aap.php", and so on for each page. (Note that "Disallow: /" on its own blocks the entire site, so don't combine it with page-specific rules.)
If you have .html pages:

User-agent: Googlebot
Disallow: /*.html

If you have .php pages:

User-agent: Googlebot
Disallow: /*.php

If you want to block all search engines, including those that don't support wildcards, you will have to list every starting character in Disallow:

User-Agent: *
Disallow: /0
...
Disallow: /9
Disallow: /a
...
Disallow: /z

Not very succinct, but it should work in all search engines.
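The per-character trick above is easy to generate and verify in a few lines of Python (a sketch with placeholder `example.com` URLs; note robots.txt paths are case-sensitive, so paths starting with an uppercase letter would need their own rules too):

```python
import string
import urllib.robotparser

# One Disallow line per leading digit/lowercase letter, as suggested above,
# so only the bare "/" home page remains crawlable.
rules = ["User-Agent: *"]
for ch in string.digits + string.ascii_lowercase:
    rules.append(f"Disallow: /{ch}")

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules)

# The root "/" matches no rule, so it stays allowed.
print(rp.can_fetch("*", "https://example.com/"))            # True
# Any path starting with a digit or lowercase letter is blocked.
print(rp.can_fetch("*", "https://example.com/about.html"))  # False
```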