URL restricted by robots.txt

Discussion in 'Search Engine Optimization' started by swappy_18, Aug 9, 2009.

  1. #1
    Every time I submit my Sitemap to Google, I get this error:

    We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit.

    Any help?
     
    swappy_18, Aug 9, 2009 IP
  2. luckymurari

    luckymurari Active Member

    #2
    Can you give your sitemap's URL? I can check what the problem is.

    In addition, check your site's <yoursitename>.com/robots.txt file.
     
    Last edited: Aug 9, 2009
    luckymurari, Aug 9, 2009 IP
  3. adithya

    adithya Well-Known Member

    #3
    Are you using WordPress or any other CMS?
    Please give us the link to the sitemap so we can look for the problem.
     
    adithya, Aug 9, 2009 IP
  4. jitendraag

    jitendraag Notable Member

    #4
    If you can share your URL, we might be able to give you exact solutions rather than shooting arrows in the dark.
     
    jitendraag, Aug 9, 2009 IP
  5. chevchelios

    chevchelios Active Member

    #5
    Yeah, post the URL of your sitemap.
     
    chevchelios, Aug 9, 2009 IP
  6. abluegrape

    abluegrape Peon

    #6
    I have had this with AdSense. I followed the instructions and it still shows as an error, but I still get credited if someone clicks an advert.
     
    abluegrape, Aug 9, 2009 IP
  7. swappy_18

    swappy_18 Peon

    #7
    swappy_18, Aug 9, 2009 IP
  8. luckymurari

    luckymurari Active Member

    #8
    As I expected, the problem is in your robots.txt file: you are disallowing Google through it. Go to robots.txt on your domain and remove Disallow: /
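
    To see why a blanket Disallow: / also stops the Sitemap from being fetched, here is a minimal sketch using Python's standard urllib.robotparser. The example.com URL and the rule text are made-up placeholders, not the original poster's actual file.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical file that blocks every crawler from the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Placeholder URL; substitute your own Sitemap location.
SITEMAP_URL = "http://example.com/sitemap.xml"

print(is_allowed(BLOCK_ALL, "Googlebot", SITEMAP_URL))  # prints False
```

    Once the Disallow: / line is removed (or left empty as Disallow:), the same check should come back True.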
     
    luckymurari, Aug 9, 2009 IP
  9. PhotonElectron

    PhotonElectron Peon

    #9
    Yep, that's a nasty error to make, but an easy one. I've made it before. Always be careful with Disallow.
     
    PhotonElectron, Aug 9, 2009 IP
  10. jitendraag

    jitendraag Notable Member

    #10
    That's just the default privacy setting in your WordPress installation. Go to Settings -> Privacy and allow search engines to crawl your blog.

    Right now all your pages also have a meta nofollow,noindex tag.
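
    If you want to check a page for that meta tag yourself, a rough sketch with Python's built-in html.parser could look like this. The sample HTML is invented for illustration; in practice you would feed in the source of your own page.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_directives(html: str) -> list:
    """Return the list of robots meta directives present in the HTML text."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


# Invented sample page carrying the tag described above.
SAMPLE_PAGE = '<html><head><meta name="robots" content="nofollow,noindex"></head><body></body></html>'

print(robots_directives(SAMPLE_PAGE))  # prints ['nofollow', 'noindex']
```

    An empty list means no robots meta tag was found on the page.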
     
    jitendraag, Aug 9, 2009 IP
  11. swappy_18

    swappy_18 Peon

    #11
    swappy_18, Aug 9, 2009 IP
  12. luckymurari

    luckymurari Active Member

    #12
    It all looks fine now. You need not change all that. Of course, if you want to, copy and paste that code into robots.txt in your WordPress folder.
     
    luckymurari, Aug 10, 2009 IP
  13. mwoeppel

    mwoeppel Peon

    #13
    I'm having the same issue, but after I fixed the sitemap and the WordPress settings and uploaded the new robots.txt, Google still says robots.txt is blocking my domain. Umm, no, I'm not!

    Any ideas why?

    Here's what's in the file:

    User-agent: *
    Allow: /*
    Allow: /wp-content/uploads

    # Google Image
    User-agent: Googlebot-Image
    Disallow:
    Allow: /*

    # Google AdSense
    User-agent: Mediapartners-Google*
    Disallow:
    Allow: /*

    # Internet Archiver Wayback Machine
    User-agent: ia_archiver
    Disallow: /

    # digg mirror
    User-agent: duggmirror
    Disallow: /


    Sitemap: [http stripped due to low post count]data-protectiononline.com/sitemap.xml
    Sitemap: [http stripped due to low post count]data-protectiononline.com/sitemap.xml.gz
     
    mwoeppel, Aug 10, 2009 IP
  14. luckymurari

    luckymurari Active Member

    #14
    Change all Allow: /* to Allow: / . The original robots.txt standard has no wildcard character (as far as I know).
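
    To illustrate the difference, here is a small sketch of two ways a crawler might read a rule path. A parser that follows only the original standard compares the path as a literal prefix, while Google documents * and $ as extensions. Both functions are made up for illustration and are not any crawler's actual code.

```python
import re


def literal_prefix_match(rule_path: str, url_path: str) -> bool:
    """Original-standard behaviour: the rule is a plain prefix, '*' is just a character."""
    return url_path.startswith(rule_path)


def wildcard_match(rule_path: str, url_path: str) -> bool:
    """Extension behaviour: '*' matches any run of characters, a trailing '$' anchors the end."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces and join them with '.*' where each '*' was.
    pattern = ".*".join(re.escape(part) for part in rule_path.split("*"))
    if anchored:
        pattern += "$"
    return re.match(pattern, url_path) is not None


print(literal_prefix_match("/*", "/sitemap.xml"))  # prints False
print(literal_prefix_match("/", "/sitemap.xml"))   # prints True
print(wildcard_match("/*", "/sitemap.xml"))        # prints True
```

    Allow: / is understood the same way under both readings, which is why it is the safer choice.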
     
    luckymurari, Aug 10, 2009 IP
  15. mwoeppel

    mwoeppel Peon

    #15
    That did the trick! thanks!
     
    mwoeppel, Aug 10, 2009 IP
  16. margulies

    margulies Peon

    #16
    My solution for allowing search engines to crawl was in my Privacy settings in WordPress.
    I switched it to ALLOW and all is well. :cool:
     
    Last edited: Aug 27, 2009
    margulies, Aug 27, 2009 IP
  17. sydzapp

    sydzapp Well-Known Member

    #17

    Don't copy that, you'll make things worse. Just leave your robots.txt as it is, i.e.:

    User-agent: *
    Disallow:
    
    Sitemap: http://celebfickle.com/sitemap.xml.gz

    These are the standard robots.txt lines used by 90%+ of sites. If you're not too well versed in SEO, just install an SEO plugin; All in One SEO will be good enough.
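
    As a quick sanity check, those lines can be parsed with Python's urllib.robotparser. This is only a sketch; site_maps() needs Python 3.8 or newer, and "Googlebot" is just the user agent picked for the test.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt lines quoted above.
MINIMAL_ROBOTS = """\
User-agent: *
Disallow:

Sitemap: http://celebfickle.com/sitemap.xml.gz
"""

parser = RobotFileParser()
parser.parse(MINIMAL_ROBOTS.splitlines())

# An empty Disallow: means nothing is blocked for any crawler.
home_allowed = parser.can_fetch("Googlebot", "http://celebfickle.com/")

# Sitemap lines are collected separately from the crawl rules.
listed_sitemaps = parser.site_maps()

print(home_allowed)     # prints True
print(listed_sitemaps)  # prints ['http://celebfickle.com/sitemap.xml.gz']
```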
     
    sydzapp, Sep 5, 2009 IP
  18. karpok

    karpok Active Member

    #18
    karpok, Sep 5, 2009 IP
  19. sydzapp

    sydzapp Well-Known Member

    #19
    It doesn't matter. In fact, .xml.gz sitemaps are better than conventional .xml sitemaps.
     
    sydzapp, Sep 5, 2009 IP
  20. adithya

    adithya Well-Known Member

    #20
    Can you justify that answer?
    A .gz file has to be decompressed before search engines can read the XML, so how come they are better than the conventional ones?
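
    For anyone who wants to weigh the two sides, here is a rough sketch of what the extra step amounts to. It builds a made-up sitemap of 200 placeholder example.com URLs, gzips it, and reads it back. The sitemaps.org protocol does allow gzip-compressed files; whether the smaller download makes them "better" is a judgment call.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Build a bare-bones sitemap document (URLs must not need XML escaping)."""
    entries = "".join(f"<url><loc>{u}</loc></url>" for u in urls)
    doc = f'<?xml version="1.0" encoding="UTF-8"?><urlset xmlns="{SITEMAP_NS}">{entries}</urlset>'
    return doc.encode("utf-8")


def read_locs(xml_bytes):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_bytes)
    return [loc.text for loc in root.iter(f"{{{SITEMAP_NS}}}loc")]


# Placeholder URLs, invented for the example.
urls = [f"http://example.com/post-{i}" for i in range(200)]

raw = build_sitemap(urls)
compressed = gzip.compress(raw)         # what a sitemap.xml.gz would contain
restored = gzip.decompress(compressed)  # the one extra step a reader performs

print(len(raw), len(compressed))
print(read_locs(restored) == urls)
```

    The content read back is identical either way; the trade-off is one decompression call against fewer bytes sent over the wire.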
     
    adithya, Sep 6, 2009 IP