AlfaNet
Apr 29th 2008, 3:25 am
Hi all,
I run a script that generates lots of pages at mtdomain.tld/?g_XXXXX, which are duplicate content.
I think these could be blocked with the following lines in the robots.txt file:
User-agent: *
Disallow: /*?
But I need to allow bots to crawl only the pages that start with /?g_page=XX. So I'm thinking of a robots.txt like this:
User-agent: *
Disallow: /*?
Allow: /?g_page=*
Is the above order correct, or do I need to put "Allow" first?
Will the above lines tell bots to crawl ONLY URLs containing "/?g_page" and no other URLs with /?g ?
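To make the question concrete, here is a rough Python sketch of how I understand the "longest match wins" rule that Google documents for its crawler (and that RFC 9309 also describes). This is only my assumption about how the rules get evaluated; other bots may simply take the first matching line, in which case the order would matter. The function names pattern_to_regex and is_allowed are made up for illustration, and the Allow pattern is written without the trailing *, which should be equivalent.

```python
import re

def pattern_to_regex(pattern):
    # Translate a robots.txt path pattern into a regex:
    # '*' matches any run of characters, a trailing '$' anchors the end,
    # everything else is treated literally.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path):
    # rules: list of ("allow" | "disallow", pattern) pairs from one group.
    # Assumed precedence: the longest matching pattern wins; on a tie,
    # "allow" wins. The order of the rules is not consulted at all.
    best = None
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            key = (len(pattern), kind == "allow")
            if best is None or key > best:
                best = key
    # A path that no rule matches is crawlable by default.
    return True if best is None else best[1]

rules = [("disallow", "/*?"), ("allow", "/?g_page=")]

for path in ["/?g_page=12", "/?g_other=5", "/index.html"]:
    print(path, is_allowed(rules, path))
```

If this reading is right, /?g_page=12 stays crawlable because the Allow pattern is longer than /*?, /?g_other=5 is blocked, and swapping the two lines changes nothing.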
Any suggestions will be appreciated.