How To Prevent Search Engines From Spidering A Page & PDF File?

Discussion in 'Site & Server Administration' started by bad_bob00, Feb 28, 2011.

  1. #1
    Hi there,

    Say I've got a website at http://www.xyz.com with a webpage that gives away free material, plus a PDF file, and I don't want other people finding either of them through the search engines. Is there a way of preventing the search engines from finding those 2 pages?

    I had a look and guessed that this might work in the .htaccess file but I'm not completely sure:
    I want the search engines to spider my index page, just not the files listed above.


    Thanks for any help :)
     
    bad_bob00, Feb 28, 2011 IP
  2. blackdata

    blackdata Greenhorn

    Messages:
    18
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    13
    #2
    Yep, this should work.
    You can also host the file on another server and lock it in a RAR archive.
     
    blackdata, Feb 28, 2011 IP
  3. bad_bob00

    bad_bob00 Active Member

    Messages:
    3,473
    Likes Received:
    56
    Best Answers:
    0
    Trophy Points:
    90
    #3
    Ah okay, cheers, I'll upload the .htaccess file as it is then. Do you know if there's any way of checking that it's worked?

    Thanks
     
    bad_bob00, Mar 3, 2011 IP
  4. bad_bob00

    #4
    I'm having a few problems, if anyone can help: I uploaded the file above, with a slight modification, and now the download-ebook.html page won't open.

    This is the file I'm trying to upload:
    (I removed the line linking to the ebook because I zipped it, so I'm guessing search engines can't access it now?)

    Am grateful for any help...
     
    bad_bob00, Mar 3, 2011 IP
  5. designer23

    designer23 Peon

    Messages:
    17
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Listen... that doesn't matter. Search engines don't crawl PDFs anyway until there's a link to them. Good that you've zipped it.

    The best way to do this is to make a directory in the root of your server and name it something like "downloads" or whatever. Then keep an "index.html" page and everything else in it. Now say you want to cut access to this directory: just rename the directory and that's it. All the links to it will then land on a 404 error page, and nobody other than you can access it. Simple! ;)

    I wouldn't tinker with .htaccess too much. I had a very bad experience with it: all my sites (12 of them) ended up returning 500 errors :p
     
    designer23, Mar 3, 2011 IP
  6. shend923

    shend923 Peon

    Messages:
    29
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Pretty much what the other posters recommended. I have a domain that is used for nothing other than hosting downloads for other sites. The index for this domain is blank: no content, no links, nothing. Various directories are set up, each storing a particular download in a zip file. This, however, will not keep others from sharing the link to your file; for that, something like DL Guard is advised.
     
    shend923, Mar 3, 2011 IP
  7. bad_bob00

    #7
    I'm not bothered about protecting the PDF file now, as it's zipped up so should be fine. It's just the download page that I'd like search engines not to access, if that's possible, because people might search for "download xyz pdf", get taken straight to the download page and bypass the payment page...

    Would it not be easier to use the .htaccess file?


    Thanks for the help
     
    bad_bob00, Mar 4, 2011 IP
  8. ACME Squares

    ACME Squares Peon

    Messages:
    98
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    0
    #8
    This really sounds like it needs to be behind a login screen. With a cookie session, you could control who you give access to the download page from the buy page, and there is zero chance of a search engine stumbling in.
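
A lighter-weight relative of this idea can be sketched in Python. Note that this is a swapped-in technique, not the cookie session described above: the buy page would hand out a download link carrying a signed token that expires, and the download page would refuse any request without a valid one. The names `SECRET_KEY`, `make_token` and `is_valid`, and the order ID format, are all hypothetical, and a real site would still need to wire this into whatever serves the page.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical secret; in practice it would live outside the web root.
SECRET_KEY = b"change-me"


def make_token(order_id: str, expires_at: int, key: bytes = SECRET_KEY) -> str:
    """Build 'order_id:expires_at:signature' for a paid order."""
    message = f"{order_id}:{expires_at}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{order_id}:{expires_at}:{signature}"


def is_valid(token: str, now: Optional[int] = None, key: bytes = SECRET_KEY) -> bool:
    """Return True only if the signature matches and the token has not expired."""
    try:
        order_id, expires_text, signature = token.rsplit(":", 2)
        expires_at = int(expires_text)
    except ValueError:
        return False  # malformed token
    message = f"{order_id}:{expires_at}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False  # tampered with, or signed with another key
    current = int(time.time()) if now is None else now
    return current <= expires_at


if __name__ == "__main__":
    # Token that stays valid for one hour after purchase.
    token = make_token("ORDER-1", int(time.time()) + 3600)
    print(token, is_valid(token))
```

A crawler that stumbles on the bare download URL has no token, so it would be turned away; a shared link would also stop working once it expires.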
     
    ACME Squares, Mar 4, 2011 IP
  9. bad_bob00

    #9
    To be honest that sounds a bit too complicated for me though :(
    At the moment I've just set up a PayPal page providing a link to pay, and then buyers get redirected to a download page. Should I be doing things differently?...


    Thanks
     
    bad_bob00, Mar 4, 2011 IP
  10. ACME Squares

    #10
    It depends on how much of a problem people sharing the download link could be for you.
    If you don't see that as a problem, there probably isn't a need for a more secure but more complicated solution.
     
    ACME Squares, Mar 4, 2011 IP
  11. bad_bob00

    #11
    There shouldn't be much of a problem, I just don't want it to be overcomplicated.
     
    bad_bob00, Mar 4, 2011 IP
  12. bad_bob00

    #12
    Wondering if anyone can help with my current problem: I can't get any pages on my website to work and keep getting a 500 Internal Server Error. It's because of my .htaccess file, which contains the following:

    I'm not sure why it won't let me access any of the pages.


    Thanks for any more help
     
    bad_bob00, Mar 7, 2011 IP
  13. designer23

    #13
    Told you, friend, do not mess with the .htaccess file. If you can, delete the file and restore a previous version, or just erase the code you have written.

    Now here is an idea:

    Password protect the RAR file.
    Let PayPal direct users to a download page where the RAR file is. Now break this action into two steps:

    1) file download
    2) password download

    For the password download, set the terms as follows:

    Tell the users to mail you at your e-mail address, confirming the "unique PayPal transaction code".

    (As soon as that matches the PayPal info, send them the key via mail ;) )
    Always tell users to allow 24-48 hrs from form submission.

    This is the manual method, but if you want to do it as an automated process you will have to keep a cookie session tied to user login, plus database integration.
     
    Last edited: Mar 7, 2011
    designer23, Mar 7, 2011 IP
    bad_bob00 likes this.
  14. ACME Squares

    #14
    This is a robots exclusion directive, and ONLY goes in robots.txt:
    User-Agent: *
    Disallow: /download-ebook.html

    .htaccess is for Apache configuration. It has a very strict syntax, and anything out of place will cause a 500 error.
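
One way to sanity-check rules like these before uploading them is Python's standard-library `urllib.robotparser`. The sketch below feeds it the two directives (with the path written without a space after the slash) and asks whether a crawler may fetch each page. The `index.html` URL is an assumption for illustration. This only shows how a rule-following parser reads the file; robots.txt is a request to well-behaved crawlers and does not stop anyone from opening the URL directly.

```python
from urllib.robotparser import RobotFileParser

# The rules as they would appear in robots.txt at the site root.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /download-ebook.html
"""


def allowed(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above let this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for page in ("index.html", "download-ebook.html"):
        url = f"http://www.xyz.com/{page}"
        print(url, "allowed" if allowed(url) else "blocked")
```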
     
    ACME Squares, Mar 9, 2011 IP
    bad_bob00 likes this.
  15. bad_bob00

    #15
    Hi again,

    Sorry to bump up an old thread. I was just wondering if ACME Squares (or anyone else) knows if this code is okay for the robots.txt file. I added another file in but wasn't sure if it was correctly formatted:
     
    bad_bob00, Apr 13, 2011 IP