G crawler indexed backoffice pages

Discussion in 'Search Engine Optimization' started by SaleemY, Sep 8, 2005.

  1. #1
    hello there

    The G crawler has indexed pages from my backoffice! They were in a directory called /affiliate in the main dir for the site.

    Should this folder have been within the /private dir?

    And what can I do to fix this without getting penalized by G for loads of pages disappearing?

    Do I need to customize the 404 page?

    Should I contact G to remove that dir?

    What about robots.txt or htaccess?

    Thanks!
     
    SaleemY, Sep 8, 2005 IP
  2. fryman

    fryman Kiss my rep

    Messages:
    9,604
    Likes Received:
    777
    Best Answers:
    0
    Trophy Points:
    370
    #2
    Disallow that folder in your robots.txt, and you can use Google's URL removal console to get them out of the index.
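
    For example, a robots.txt in the site root (assuming the directory is /affiliate, as in the original post) could look like this:

    ```
    # robots.txt — placed in the web root, e.g. http://example.com/robots.txt
    # Tells all well-behaved crawlers not to fetch anything under /affiliate/
    User-agent: *
    Disallow: /affiliate/
    ```

    Note that Disallow only stops compliant crawlers from fetching the pages; it is not an access control, so anyone who knows the URL can still open them.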
     
    fryman, Sep 8, 2005 IP
  3. SaleemY

    SaleemY Peon

    Messages:
    64
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Google's console? Where is that?
     
    SaleemY, Sep 8, 2005 IP
  4. fryman

    fryman Kiss my rep

    Messages:
    9,604
    Likes Received:
    777
    Best Answers:
    0
    Trophy Points:
    370
    #4
    fryman, Sep 8, 2005 IP
  5. SaleemY

    SaleemY Peon

    Messages:
    64
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Thanks. And if I don't do that, Google will drop the pages automatically when it comes across the 404, right?
     
    SaleemY, Sep 8, 2005 IP
  6. fryman

    fryman Kiss my rep

    Messages:
    9,604
    Likes Received:
    777
    Best Answers:
    0
    Trophy Points:
    370
    #6
    Just don't link to those pages and disallow Google from accessing the folder, and they should drop out in a while.
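
    If the pages have been moved or deleted, one way to speed this up is to serve an explicit "gone" status instead of a plain 404. A sketch, assuming an Apache server with mod_alias enabled and the /affiliate path from this thread:

    ```
    # .htaccess in the site root — answer 410 Gone for the old back-office URLs,
    # which signals to crawlers the removal is permanent (vs. a 404 "not found")
    Redirect gone /affiliate
    ```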
     
    fryman, Sep 8, 2005 IP
  7. lorien1973

    lorien1973 Notable Member

    Messages:
    12,206
    Likes Received:
    601
    Best Answers:
    0
    Trophy Points:
    260
    #7
    If the directory shows up in log files that are public on other sites, it can still be crawled. Using the robots.txt file or specifically setting an htpasswd login on the directory is the best way to go (I'd recommend htpasswd or another login function).
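
    A minimal sketch of the htpasswd approach, assuming Apache with mod_auth configured; the password file path and user name are placeholders — the password file should live outside the web root:

    ```
    # .htaccess inside the /affiliate directory — require a login for everything here
    AuthType Basic
    AuthName "Back office"
    AuthUserFile /home/example/.htpasswd
    Require valid-user
    ```

    The password file itself is created with the htpasswd command line tool, e.g. `htpasswd -c /home/example/.htpasswd admin`. Unlike robots.txt, this actually blocks access, so crawlers get a 401 and the pages drop out of the index even if other sites link to them.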
     
    lorien1973, Sep 8, 2005 IP