Hi all, I run a script that generates plenty of pages at mtdomain.tld/?g_XXXXX, which are duplicate content. I think these could be blocked with the following lines in the robots.txt file:

User-agent: *
Disallow: /*?

But I need to allow bots to crawl only pages that start with /?g_page=XX. So I'm thinking of a robots.txt like the one below:

User-agent: *
Disallow: /*?
Allow: /?g_pages=*

Is the above order correct, or do I need to put the Allow first? Will the above lines tell bots to follow ONLY URLs having "/?g_page" in them and not any other URLs with /?g ? Any suggestion will be appreciated.
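For anyone trying to reason about which rule wins, here is a minimal Python sketch of the matching that Google documents for its own crawler: the longest (most specific) matching pattern decides, and if an Allow and a Disallow of equal length both match, Allow wins, so the order of the lines doesn't matter to Googlebot. This is only an illustration -- the pattern list mirrors the rules in the question, but it assumes the parameter is g_page= (the Allow line above says g_pages=, which looks like a typo), and the sample URLs are made up.

import re

def rule_matches(pattern: str, path: str) -> bool:
    # Translate a robots.txt pattern (supporting * and a trailing $)
    # into a regex anchored at the start of the path, then test it.
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in pattern)
    if anchored_end:
        regex += "$"
    return re.match(regex, path) is not None

# Rules from the question, assuming g_page= is the parameter the script uses.
rules = [
    ("Disallow", "/*?"),
    ("Allow", "/?g_page="),
]

def is_allowed(path: str) -> bool:
    # Googlebot-style evaluation: the longest matching pattern wins;
    # when an Allow and a Disallow of equal length both match, Allow wins.
    matching = [(len(p), kind) for kind, p in rules if rule_matches(p, path)]
    if not matching:
        return True  # no rule matches -> crawling is allowed
    length, kind = max(matching, key=lambda m: (m[0], m[1] == "Allow"))
    return kind == "Allow"

for url in ["/?g_page=12", "/?g_something=5", "/page.html"]:
    print(url, "->", "allowed" if is_allowed(url) else "blocked")

Running this prints that /?g_page=12 stays crawlable while other query-string URLs are blocked, which is the behaviour you describe -- but it's only a sketch of Google's documented semantics, so it's worth confirming in Google's own tester.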
In Google Webmaster Tools there is a feature called Analyze robots.txt. You can paste your robots.txt code there and check whether the crawler would crawl a given page or not.
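If you prefer checking from a script rather than the web UI, Python's standard library ships urllib.robotparser; a minimal sketch is below. One caveat: the standard-library parser follows the original robots.txt convention with plain prefix matching, so it does not interpret the * wildcard the way Googlebot does -- for wildcard rules like Disallow: /*? the Webmaster Tools tester is the more reliable check. The rules and URLs below just mirror the ones discussed in this thread.

from urllib.robotparser import RobotFileParser

# robots.txt content under discussion, parsed from a string for illustration.
robots_txt = """\
User-agent: *
Disallow: /*?
Allow: /?g_page=
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Ask whether a given user agent may fetch a given URL.
for url in ["http://mtdomain.tld/?g_page=2", "http://mtdomain.tld/?g_other=1"]:
    print(url, "->", rp.can_fetch("*", url))

# Note: because this parser does literal prefix matching, it will not treat
# "/*?" as "any URL with a query string" the way Googlebot does, so the
# printed results can differ from what Google's tester reports.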
From what I understand, wildcards cannot be used in robots.txt files. If I am wrong please let me know, but that is what I have read.
Zelo
Wildcards can be used in robots.txt; Google's and Yahoo's bots support and follow wildcards in robots.txt, but MSN's bot doesn't support them.