Advanced Robots.txt to Avoid Duplicate Content

Discussion in 'Search Engine Optimization' started by twalters84, Oct 16, 2007.

  1. #1
    Hey there,

    SEO consultants usually advise avoiding long URL strings when it comes to dynamic queries.

    Having said that, sometimes it is hard to avoid. For example, if you have a coder search that can return thousands of coders, you will want to split this up into hundreds of pages.

    For instance, here is one that I am making:

    http://www.codebuyers.com/find-a-coder.cfm

    The search features on the left may cause duplicate content to appear in the results. That got me thinking about what I could do about it.

    I came across a Google article that talks about pattern matching in the robots.txt file:

    http://www.google.com/support/webmasters/bin/answer.py?answer=40367

    So here is what I did in the robots.txt file on my website:

    Disallow: /find-a-coder.cfm?onlineCoders=*
    Disallow: /find-a-coder.cfm?newestCoders=*
    Disallow: /find-a-coder.cfm?search=*
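
    Two notes on that snippet, based on my reading of the same Google article (worth double-checking): directives only take effect inside a group that starts with a User-agent line, and since Disallow rules already match URL paths by prefix, the trailing * is redundant. An equivalent sketch would be:

```
User-agent: *
Disallow: /find-a-coder.cfm?onlineCoders=
Disallow: /find-a-coder.cfm?newestCoders=
Disallow: /find-a-coder.cfm?search=
```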

    If I understand the article correctly, this should let me avoid duplicate content. Googlebot should then crawl pages of find-a-coder.cfm only when the URL variable is set to "topCoders=1". Is this assumption correct?
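
    One way to sanity-check that assumption is to translate the rules into regular expressions. This is just my own rough sketch of the wildcard semantics the Google article describes ('*' matches any run of characters, rules match by prefix), not Googlebot's actual implementation, and the helper names are mine:

```python
import re

# Disallow rules as they appear in the robots.txt above.
DISALLOW = [
    "/find-a-coder.cfm?onlineCoders=*",
    "/find-a-coder.cfm?newestCoders=*",
    "/find-a-coder.cfm?search=*",
]

def rule_to_regex(rule):
    # Escape regex metacharacters in the rule, then restore '*'
    # as "match anything" per the wildcard extension.
    escaped = re.escape(rule).replace(r"\*", ".*")
    return re.compile("^" + escaped)

def is_blocked(path):
    # A URL is blocked if any Disallow rule matches it from the start.
    return any(rule_to_regex(r).match(path) for r in DISALLOW)

print(is_blocked("/find-a-coder.cfm?search=coldfusion"))  # True
print(is_blocked("/find-a-coder.cfm?topCoders=1"))        # False
```

    Note that by this reading, any parameter not listed in a Disallow rule (not just topCoders) would still be crawlable, as would the bare find-a-coder.cfm URL.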

    Furthermore, the Google article states that pattern matching is an extension of the robots.txt standard. Google follows it, so that's great. What about Yahoo and MSN? Do they support pattern matching in the robots.txt file?

    Any advice will be appreciated. Thanks in advance.

    Sincerely,
    Travis Walters
     
    twalters84, Oct 16, 2007 IP
  2. Alexander the Great

    #2
    Just curious about the quality of your service, but shouldn't you be finding a coder to tell you the answer?
     
    Alexander the Great, Oct 16, 2007 IP
  3. twalters84

    #3
    Hey there,

    Well, I figured I would give the SEO consultants here a shot to see if they have encountered anything like this before.

    The reason I ask is the duplicate content issue. If I block certain parts of the search, Google should crawl the more relevant parts of my website more often.

    As far as quality goes, I want it to be the best. That is why I am taking my time before releasing the product. It will probably not be released for another five months or so.

    Sincerely,
    Travis Walters
     
    twalters84, Oct 16, 2007 IP