404 Crawler Access Error - Google is not indexing my site

Discussion in 'Google Sitemaps' started by ffgallery, Sep 22, 2009.

  1. #1
    Hey all, it's been a month and my site has still not been indexed. I submitted my sitemap and it downloaded successfully in Google Webmaster Tools, but under Webmaster Tools > Crawler Access I am getting a "404 (Not found)" error.

    Here is the exact error message:

    This site is not located at the top level for the domain. A robots.txt file is only valid when located in the highest-level directory and applies to all directories within the domain. The robots.txt file that applies to your site (if one exists) is located at http://albulm.net/robots.txt. This page provides information on that file.

    This is true: my website is at http://albulm.net/videogames, not at the root level, which is http://albulm.net.

    There is a robots.txt file inside http://albulm.net/videogames, where my site is hosted.

    If I run a test, Webmaster Tools successfully reaches my robots.txt file using the URL I specified, which is http://albulm.net/videogames.
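To make the error message concrete: crawlers look for robots.txt only at the root of the host, whatever subdirectory the site itself lives in. A minimal Python sketch of that lookup follows; the helper name `robots_url` is made up for illustration and is not part of any Google tool.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(site_url: str) -> str:
    """Return the robots.txt URL at the root of the host serving site_url."""
    parts = urlsplit(site_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://albulm.net/videogames"))
# prints http://albulm.net/robots.txt
```

A robots.txt stored under /videogames/ can still be fetched by URL, but this lookup never reaches it, which matches the message quoted above.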

    Any ideas?

    On a side note, my robots.txt file disallows access to every directory on my website by default. Is this a problem for indexing?
     
    ffgallery, Sep 22, 2009
  2. ffgallery

    #2
    Is it OK for the robots.txt file to allow access to all files within your website? Right now everything is disallowed.

    Here is what my robots.txt file looks like:

    User-agent: *
    Disallow: /*site name*/*subfolder*/
    Disallow: /*site name*/*subfolder*/
    Disallow: /*site name*/*subfolder*/
    Disallow: /*site name*/*subfolder*/
    Disallow: /*site name*/*subfolder*/
    Disallow: /*site name*/*subfolder*h/
    Disallow: /*site name*/*subfolder*/
    Disallow: /*site name*/*.php
    Disallow: /*site name*/*.php
    Disallow: /*site name*/*.php
    Disallow: /*site name*/*.php
    Disallow: /*site name*/*.php
     
    ffgallery, Sep 22, 2009
  3. lakkineni

    #3
    The robots.txt file should be placed at the root of the site, e.g. domain.com/robots.txt.

    From that root-level file you can allow or disallow individual subfolders.
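As a rough illustration of controlling subfolders from the root, the sketch below parses a hypothetical root-level robots.txt with Python's standard `urllib.robotparser`. The folder name `/videogames/private/` and the page names are invented for the example, and this parser only approximates how Googlebot interprets rules (it does prefix matching and does not handle wildcards).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical root-level robots.txt; "/videogames/private/" is an invented folder.
ROOT_ROBOTS = """\
User-agent: *
Disallow: /videogames/private/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROOT_ROBOTS)
# Paths are written relative to the host root, so the subfolder name is part of the rule.
can_private = parser.can_fetch("*", "http://albulm.net/videogames/private/page.php")
can_index = parser.can_fetch("*", "http://albulm.net/videogames/index.php")
print(can_private, can_index)
# prints False True
```

Because the file sits at the root, every rule path has to start from `/`, including the `/videogames/` prefix.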
     
    lakkineni, Sep 23, 2009
  4. ffgallery

    #4
    Thanks - I have created a new robots.txt file and placed it in the root of my website. Here are the contents of my new robots.txt file:

    User-agent: *
    Allow: /

    I have also removed my old robots.txt file from the subdirectory.

    Is allowing crawlers full access to my site with the above rules OK, or does it pose a security risk?

    It's just that Google does not seem to be able to index my site, and I hope this was the cause.
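One way to sanity-check the new file locally is to parse the two-line rule set quoted above with Python's standard `urllib.robotparser` and ask whether a few URLs may be fetched. This is only a sketch: the page names in the list are assumptions, and the result says nothing about whether Google will actually index the pages. Note also that robots.txt is advisory for well-behaved crawlers rather than an access-control mechanism, so anything private needs real protection either way.

```python
from urllib.robotparser import RobotFileParser

# The new root-level file from this post.
ALLOW_ALL = """\
User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ALLOW_ALL.splitlines())

# Example URLs; the page names are assumptions for illustration.
urls = [
    "http://albulm.net/",
    "http://albulm.net/videogames",
    "http://albulm.net/videogames/index.php",
]
results = {url: parser.can_fetch("Googlebot", url) for url in urls}
for url, ok in results.items():
    print(url, "allowed" if ok else "blocked")
```

Every URL should come back as allowed, since `Allow: /` matches any path on the host.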
     
    ffgallery, Sep 23, 2009