What Should You Make Sure Google Can't See?

Discussion in 'Google' started by pjman, May 31, 2010.

  1. #1
    I have a folder/directory on one of my sites where I try to sell digital downloads. It has something like 1,000 different sample pages. In robots.txt, I stopped Google from indexing it. Because:

    1. Every sample page in that directory is basically the same template for a different sample. The only thing that changes is a 3-4 sentence description of the sample.

    2. Every page has an outgoing link to a different domain where they can purchase the download. I didn't want this to be seen as excessive cross-linking (that would be 1,000 outgoing links to the same site).

    3. The samples are all images. So there is really little content.

    Is blocking these pages from being indexed the right move?
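
    For reference, this is the sort of robots.txt rule I mean (the directory name here is made up for illustration):

```
# Hypothetical rule blocking the samples directory for all crawlers
User-agent: *
Disallow: /samples/
```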
     
    pjman, May 31, 2010 IP
  2. webcosmo

    webcosmo Notable Member

    Messages:
    5,840
    Likes Received:
    153
    Best Answers:
    2
    Trophy Points:
    255
    #2
    If it's just a little summary page, I wouldn't even bother.
     
    webcosmo, May 31, 2010 IP
  3. fishmania

    fishmania Peon

    Messages:
    388
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Shouldn't be a problem.
     
    fishmania, May 31, 2010 IP
  4. looking4vps

    looking4vps Peon

    Messages:
    1,495
    Likes Received:
    6
    Best Answers:
    0
    Trophy Points:
    0
    #4
    It should cause no issues at all.
     
    looking4vps, May 31, 2010 IP
  5. pjman

    pjman Active Member

    Messages:
    145
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    58
    #5
    Cool thanks. That makes me feel better.
     
    pjman, May 31, 2010 IP
  6. longcall911

    longcall911 Peon

    Messages:
    1,672
    Likes Received:
    87
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Yes, IMO you should disallow the entire folder. Remember, disallowing really means 'do not index', so I wouldn't try to hide any text, links, spam or otherwise within those pages. Also, I would make sure the page's meta tag for robots reads NOINDEX NOFOLLOW. That leaves you squeaky clean. :)
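
    For example, the robots meta tag would look something like this in the head of each sample page (a sketch, not anyone's exact markup):

```html
<!-- Placed inside <head> on every page in the blocked folder -->
<meta name="robots" content="noindex,nofollow">
```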
     
    longcall911, May 31, 2010 IP
  7. pjman

    pjman Active Member

    Messages:
    145
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    58
    #7
    Thanks. Yeah, meta tags would lock it up. Adding them now.
     
    pjman, May 31, 2010 IP
  8. DoDo Me

    DoDo Me Peon

    Messages:
    2,257
    Likes Received:
    27
    Best Answers:
    0
    Trophy Points:
    0
    #8
    Blocking in robots.txt can keep Googlebot from seeing the pages, but Google may still crawl them as part of performance checks and anti-fraud measures.
     
    DoDo Me, May 31, 2010 IP
  9. social-media

    social-media Member

    Messages:
    311
    Likes Received:
    9
    Best Answers:
    0
    Trophy Points:
    35
    #9
    Actually, disallowing means "do not crawl"... That is MUCH different than "do not index". Pages that have been disallowed can still show in Google's index and search results if enough other sites link to them and the link text makes Google feel they are a relevant result. The best way to prevent indexing is <meta name="robots" content="noindex">.
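
    The crawl-vs-index distinction is easy to check with Python's standard urllib.robotparser. A quick sketch (the rules and URLs here are hypothetical):

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that disallows the samples directory.
rules = """\
User-agent: *
Disallow: /samples/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Disallow only controls crawling: Googlebot may not fetch the page,
# but the URL itself can still end up indexed via external links.
print(rp.can_fetch("Googlebot", "https://example.com/samples/page1.html"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/index.html"))          # True
```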
     
    social-media, May 31, 2010 IP
  10. microman007

    microman007 Peon

    Messages:
    317
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #10
    I agree with what social-media said.
     
    microman007, Jun 1, 2010 IP
  11. point001

    point001 Peon

    Messages:
    10
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #11
    Meta tags would lock it up, I hope.
     
    point001, Jun 1, 2010 IP
  12. edith hadiansyah

    edith hadiansyah Active Member

    Messages:
    191
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    75
    #12
    I think robots.txt is enough.
     
    edith hadiansyah, Jun 1, 2010 IP
  13. pjman

    pjman Active Member

    Messages:
    145
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    58
    #13
    Just wanted to follow this one up. We had the correct answer, but I want to be sure we have the whole answer.

    If you stop Google from indexing and following a page in robots.txt, that's not enough. You also need to add noindex/nofollow meta tags on the pages themselves to completely stop any PR juice loss from the pages that link to them.
     
    pjman, Jun 3, 2010 IP
  14. SearchBliss

    SearchBliss Well-Known Member

    Messages:
    1,899
    Likes Received:
    70
    Best Answers:
    2
    Trophy Points:
    195
    Digital Goods:
    1
    #14
    Disallow in the robots.txt file (Google caches the robots.txt and updates it about every 24 hours, so be sure to add it a day before you make the content live). If it is already live, also add the NOINDEX and NOFOLLOW to the robots meta tag. This will keep googlebot from indexing AND following the links on these pages.
    <meta name="robots" content="NOINDEX,NOFOLLOW">.
     
    SearchBliss, Jun 3, 2010 IP