Google Indexing Nonexistent Files

Discussion in 'robots.txt' started by Tandem, Apr 11, 2009.

  1. #1
    My error logs show quite a few 404 entries.
    The referer is usually google.gr, google.ee, google.com.br, etc.

    Please keep in mind, this is not a case of broken links, renamed or removed files. The files in question never existed on the sites (as far as I know).

    Also, the sites and the directories that the index entries point to all have robots.txt files containing the following:
    User-agent: *
    Disallow: /
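As a quick sanity check on those two lines, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_blocked`, the example path, and the choice of "Googlebot" as the user agent are assumptions made for illustration; they do not come from the sites in question.

```python
from urllib.robotparser import RobotFileParser

# The exact rules quoted in the post above.
RULES = """\
User-agent: *
Disallow: /
"""

def is_blocked(path, user_agent="Googlebot", rules=RULES):
    """Return True if a robots.txt-compliant crawler may NOT fetch `path`.

    `is_blocked` is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(rules.splitlines())
    return not parser.can_fetch(user_agent, path)
```

With these rules, `is_blocked("/anything.html")` should come back `True` for any user agent. Keep in mind this only tells you what a compliant crawler is allowed to fetch. A Disallow rule does not by itself keep a URL out of a search index if other pages link to that URL.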

    The sites are for private use and are not indexed by search engines. I am aware that bots can ignore the robots.txt files.

    What concerns me is that the file names are usually something along the lines of:
    ....serial-free.html
    ....CD-key-changer.html
    ...something-sex.html and so on.

    Does anyone have any ideas about what is going on? How do these end up in Google's index?
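To see the pattern described in this post at a glance, the 404 entries can be tallied by referer host and by requested path. This is only a sketch, and it assumes the requests are also recorded in an access log in the Apache/Nginx "combined" format. The post does not say which server or log format is in use, so the regular expression and the function name `count_404_referers` are assumptions.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed "combined" log format:
# host ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "[^"]*"'
)

def count_404_referers(lines):
    """Return (referer-host counts, requested-path counts) for 404 responses."""
    hosts = Counter()
    paths = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Skip lines that do not parse or are not 404s.
        if not match or match.group("status") != "404":
            continue
        # A missing referer is logged as "-", which has no hostname.
        host = urlsplit(match.group("referer")).hostname or "(no referer)"
        hosts[host] += 1
        paths[match.group("path")] += 1
    return hosts, paths
```

Passing an open log file (any iterable of lines works) returns two counters, and `hosts.most_common(10)` would list the referer hosts responsible for the most 404s. A referer header is supplied by the client, so a Google hostname here shows what the client claimed, not proof that the URL is actually in Google's index.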
     
    Tandem, Apr 11, 2009 IP
  2. yenerich

    yenerich Active Member

    #2
    Fake referrers probably.
     
    yenerich, Apr 16, 2009 IP
  3. manish.chauhan

    manish.chauhan Well-Known Member

    #3
    As you mentioned, you have blocked your whole website from search engines, so it should not be possible for someone to find it through a search. This might be happening because someone is playing with your website's URLs, possibly spammers...:)
     
    manish.chauhan, Apr 17, 2009 IP
  4. Lpe04

    Lpe04 Peon

    #4
    These could be backlinks to your site that Google is following.
     
    Lpe04, Apr 28, 2009 IP
  5. pitagora

    pitagora Peon

    #5
    Do these files appear to be in folders chmodded to 777? Check those folders and see if there is a .htaccess file doing some clever rewriting. I've seen some worms that exhibit the behavior you are describing.
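The check suggested here can be roughly automated. The sketch below walks a directory tree, lists directories that are writable by "others" (which includes anything chmodded to 777), and lists .htaccess files that mention rewrite directives. The function name `find_suspicious` and the idea of matching on the `RewriteRule` / `RewriteCond` keywords are assumptions for illustration. It does not detect any specific worm, and a legitimate site may well have rewrite rules of its own.

```python
import os
import stat

def find_suspicious(root):
    """Return (world-writable directories, .htaccess files containing rewrite directives)."""
    world_writable = []
    rewriting_htaccess = []
    for dirpath, dirnames, filenames in os.walk(root):
        mode = os.stat(dirpath).st_mode
        # S_IWOTH is the "others may write" bit; mode 777 directories have it set.
        if mode & stat.S_IWOTH:
            world_writable.append(dirpath)
        if ".htaccess" in filenames:
            path = os.path.join(dirpath, ".htaccess")
            # errors="replace" avoids crashing on odd bytes in a tampered file.
            with open(path, "r", errors="replace") as handle:
                text = handle.read().lower()
            if "rewriterule" in text or "rewritecond" in text:
                rewriting_htaccess.append(path)
    return world_writable, rewriting_htaccess
```

Run it against the web root and review each reported .htaccess by hand, paying particular attention to ones you do not remember creating or that sit inside a world-writable folder.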
     
    pitagora, May 8, 2009 IP