How to Use robots.txt to Block Googlebot, but Not AdsBot

Discussion in 'Search Engine Optimization' started by hemanthjava, Feb 17, 2019.

  1. #1

    I am managing an eCommerce brand that has thousands of products. Some of these products have multiple SKUs (colour variants). These SKUs use a URL query-string parameter to distinguish the colour variants. Since they are the same product and vary only by colour, they are all canonicalised to the non-colour version for SEO.
    Example setup of products:

    Hugo Boss T-Shirt product page (/product/hugo-boss-red-t-shirt) with the variants below:

    /product/hugo-boss-red-t-shirt?colour=red - canonicalised to the non-colour version
    /product/hugo-boss-red-t-shirt?colour=green - canonicalised to the non-colour version
    /product/hugo-boss-red-t-shirt?colour=blue - canonicalised to the non-colour version

    The problem is that Googlebot wastes a lot of crawl budget on these colour variants. I could add a Disallow: /*?colour rule to robots.txt to stop Googlebot from crawling these URL variants, but I think that would cause problems for PPC PLAs (product listing ads).

    Can you please advise how to block these variants from being crawled by search bots only, and not by ad bots, so that PLAs are not affected by the robots.txt rules?

    hemanthjava, Feb 17, 2019 IP
  2. mmerlinn

    mmerlinn Notable Member

    Just add Googlebot, but not AdsBot, to the list of disallowed user agents in your robots.txt file.
    mmerlinn, Feb 17, 2019 IP
  3. Manish.ebiztrait

    Manish.ebiztrait Active Member

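    Use this code in robots.txt to block Googlebot without blocking AdsBot (note that Disallow: / blocks Googlebot from the whole site; narrow the path to /*?colour= to block only the colour variants):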

    User-agent: Googlebot
    Disallow: /

    User-agent: AdsBot-Google
    Allow: /
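
    For the original question, where only the colour variants should be kept away from organic crawling, a narrower version might look like the sketch below. It assumes the parameter is always named colour and always comes first in the query string; a URL such as ?size=m&colour=red would need an extra Disallow: /*&colour= line. It also relies on Google's wildcard path matching, which other crawlers may not honour.

    ```
    # Googlebot (organic search): skip only the colour-variant URLs
    User-agent: Googlebot
    Disallow: /*?colour=

    # AdsBot-Google is named explicitly and allowed everywhere
    # (it ignores "User-agent: *" groups, so it must be addressed by name)
    User-agent: AdsBot-Google
    Allow: /
    ```

    Because the Googlebot group does not apply to AdsBot-Google, the Allow group is mostly there to make the intent obvious. Whether PLAs are completely unaffected is worth confirming before deploying, for example by checking for landing-page crawl errors in the ads and merchant tooling.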
    Manish.ebiztrait, May 17, 2019 IP