Is my robots.txt working fine?

Discussion in 'robots.txt' started by aleale, Sep 10, 2014.

  1. #1
    I have a subdomain, e.g. blog.example.com, and I do not want it to be indexed by Google or any other search engine. I put my robots.txt file in the 'blog' folder on the server with the following configuration:

    User-agent: *
    Disallow: /

    Is this enough to keep Google from indexing it?
    A few days ago, a site:blog.example.com search showed 931 results, but now it shows 1320 pages. If my robots.txt file is correct, why is Google still indexing my subdomain?

    If I am doing anything wrong, please correct me.
     
    aleale, Sep 10, 2014
  2. Maninder Pal Singh

    #2
    That should do the trick. The best practice is to have your robots.txt file set up this way before you launch the site. Google may not have de-indexed it yet because it can find links to the website from other places and is still crawling and indexing its pages. However, it should read the robots.txt file and gradually remove the pages from search results.

    What I would recommend is adding the meta noindex tag, as it is more effective at telling the crawler that a page should not be indexed. Even if it does not work instantly, it should at least help get the site de-indexed faster.

    Here's what Google support says about it:

    To prevent most search engine web crawlers from indexing a page on your site, place the following meta tag into the <head> section of your page:

    <meta name="robots" content="noindex">

    To prevent only Google web crawlers from indexing a page:

    <meta name="googlebot" content="noindex">
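
    As a rough way to confirm that a page actually carries one of these tags, the Python sketch below scans an HTML string for them using only the standard library. The names RobotsMetaParser and has_noindex are made up for this illustration. Keep in mind that a crawler can only see the tag on pages it is allowed to fetch, so a robots.txt rule that blocks the page would hide it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # e.g. {"robots": {"noindex", "nofollow"}}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            # The content attribute may hold several comma-separated directives.
            values = {
                part.strip().lower()
                for part in attr.get("content", "").split(",")
                if part.strip()
            }
            self.directives.setdefault(name, set()).update(values)


def has_noindex(html, crawler="robots"):
    """Return True if the HTML has a noindex directive addressed to `crawler`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives.get(crawler, set())


page = '<html><head><meta name="robots" content="noindex"></head><body></body></html>'
print(has_noindex(page))
```

    Passing crawler="googlebot" checks the Google-only variant of the tag instead.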
     
    Maninder Pal Singh, Sep 11, 2014
  3. maddenitrous

    #3
    The code you have used will deindex all of your website's pages, folders, etc., so what you did is not OK. Also, your robots.txt file should be in the root folder.

    This link should be helpful for you: http://www.robotstxt.org/robotstxt.html
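
    To see where a crawler would look for the file and what the two rules from the first post permit, here is a small Python sketch using the standard library's urllib.robotparser. The helpers robots_url and is_blocked are names invented for this example, and the rules are parsed from a string rather than fetched from a live server. It only models crawl permission and does not show what a search engine has already indexed.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Build the robots.txt location for a page: always the root of its host."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def is_blocked(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text forbids `user_agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# The rules quoted in the first post.
RULES = "User-agent: *\nDisallow: /\n"

print(robots_url("http://blog.example.com/2014/09/post.html"))
print(is_blocked(RULES, "http://blog.example.com/any/page", "Googlebot"))
```

    For a subdomain this means the file has to be reachable at the root of that subdomain's own host, not only inside a folder of the main site.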
     
    maddenitrous, Oct 3, 2014