Sitemap issue - Removed URLs still indexed!

Discussion in 'Google Sitemaps' started by SMseo, Apr 23, 2009.

  1. #1
    I had a static website (45 pages) until a week ago. I redeveloped it in PHP + MySQL, and it is now a dynamic website with around 11,000 pages. I have around 400 categories and 3,500 products, where each product is available in many categories and each combination generates its own URL. Hence roughly 11,000 URLs. My questions are:

    a. My robots.txt lists around 7,500 URLs and is about 650 KB in size. Is this acceptable? Is there a limit (on URL count or file size)?
    I did this because I did not want to face duplicate-content issues with Google, though I have also added the meta noindex tag to all of these pages. Will the meta tag work as well as the robots.txt entries?
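    (As a point of comparison: Google supports wildcard patterns in robots.txt, so a single rule can often stand in for thousands of individual Disallow lines. A minimal sketch, where the category path is hypothetical and not taken from the actual site:

    ```
    User-agent: Googlebot
    # One wildcard rule covers every URL containing a query string
    Disallow: /*?
    # Hypothetical example: block one duplicate-generating path pattern
    Disallow: /category/*/print/
    ```

    Wildcards like `*` are a crawler extension rather than part of the original robots.txt standard, but Googlebot honors them.)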

    b. I deleted all 45 of my previously indexed pages (they were not sending me any traffic anyway, so I did not worry). Did I do something wrong? These URLs now show up under "Errors for URLs in Sitemaps", "Not found", and "URLs not followed" in Webmaster Tools. I used the URL removal tool to remove them, but apparently they are still there: I can still see them in search results, and Webmaster Tools keeps reporting them!

    c. I also tried removing /sitemap.html (I deleted that file from the root directory too), but the removal tool reports the request as Denied.

    d. I have submitted and re-submitted the sitemap, but even though Webmaster Tools shows it as "successfully crawled", I don't see the pages indexed yet!

    e. MOST IMPORTANT: the "Analyze robots.txt" tool shows me "Googlebot is blocked from http://www.mywebsite.com/"! What could be the possible reason for this? Apart from the listed URLs, I am using the matching pattern Disallow: /*?
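    (For what it's worth, `Disallow: /*?` on its own matches only URLs that contain a "?", so by itself it should not block the bare homepage. A leftover site-wide rule is one common culprit worth checking for; the example below is a hypothetical sketch, not the site's actual file:

    ```
    User-agent: Googlebot
    # Matches only URLs containing a query string; the homepage is unaffected
    Disallow: /*?
    # But a stray line like this anywhere in the same record blocks the whole site:
    # Disallow: /
    ```

    It is worth re-reading the live robots.txt for any bare `Disallow: /` line or a pattern that effectively expands to it.)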

    Please help, as I am really worried about all these problems. I have spent many $$$ having this website developed, but it seems I did something terribly wrong.

    Thanks in advance.

    My website is www(dot)sameday-flowerdelivery(dot)com
     
    SMseo, Apr 23, 2009 IP
  2. themedseoservices

    #2
    Hi,

    There is an option in Google Webmaster Tools to block the old URLs, so you can avoid using robots.txt for your top landing pages.
     
    themedseoservices, Apr 24, 2009 IP
  3. Lpe04

    #3
    Wow, lots of questions here.

    You aren't even close to the maximum number of URLs (a single sitemap file can list up to 50,000), so no worries there.

    Give Webmaster Tools time to catch up with you, a week or so.

    What are you trying to block with this?
    Disallow: /*?

    The meta tag should work just as well as robots.txt.
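    (For reference, the meta tag being discussed looks like this; it goes in the `<head>` of each page that should stay out of the index:

    ```html
    <head>
      <!-- Tells compliant crawlers not to index this page but still follow its links -->
      <meta name="robots" content="noindex, follow">
    </head>
    ```

    One caveat: Googlebot has to be able to crawl a page to see the tag, so if the same URL is also blocked in robots.txt, the noindex may never be read.)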

    Cheers,
     
    Lpe04, Apr 28, 2009 IP