How to stop Google Spidering my Javascript...

Discussion in 'robots.txt' started by misohoni, Aug 31, 2005.

  1. #1
    I put in a robots tag, but Google still seems to spider my JavaScript after three months of waiting. Any ideas? I also put a redirect page on the directory index.

    misohoni, Aug 31, 2005
  2. simplexity
    #2
    I thought Google couldn't spider JavaScript. Well, shows how much I know ;)
    Does your robot.txt look like this?

    Because I think this is what you need :)

    simplexity, Sep 1, 2005
  3. misohoni
    #3
    I thought it's robots.txt? Yes, my setup looks like that.

    misohoni, Sep 1, 2005
  4. DangerMouse
    #4
    You can do it using .htaccess...

    
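    RewriteEngine    on
    # Skip requests that send no referrer at all
    RewriteCond    %{HTTP_REFERER} !^$
    # Skip requests referred from yourdomain.com or any of its subdomains
    RewriteCond    %{HTTP_REFERER} !^http://([-a-z0-9]+\.)?yourdomain\.com [NC]
    # Everything else gets 403 Forbidden for .js and .vbs files
    RewriteRule    \.(js|vbs)$ - [F,NC,L]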
    
    This blocks any request for a .js or .vbs file whose referrer is set but doesn't point to your domain. Requests that send no referrer at all are still let through. :cool:

    (This is useful because not all robots obey the robots.txt guidelines, even Googlebot!)
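
    To make the logic of those rules easier to follow, here is a rough Python sketch of the same decision. It is only a model of the conditions, not how Apache evaluates them internally; the function name `is_forbidden` and the example URLs are made up for illustration, and `yourdomain.com` is the same placeholder used in the rules.

```python
import re

# Same patterns as the rewrite rules; "yourdomain.com" is a placeholder.
OWN_SITE = re.compile(r"^http://([-a-z0-9]+\.)?yourdomain\.com", re.IGNORECASE)
SCRIPT_FILE = re.compile(r"\.(js|vbs)$", re.IGNORECASE)


def is_forbidden(path: str, referer: str) -> bool:
    """Return True when the rules would answer 403 Forbidden (hypothetical helper)."""
    if not SCRIPT_FILE.search(path):
        return False  # RewriteRule pattern does not match this file
    if referer == "":
        return False  # first RewriteCond fails: no referrer was sent
    if OWN_SITE.search(referer):
        return False  # second RewriteCond fails: referrer is your own site
    return True       # script file requested with a foreign referrer


print(is_forbidden("/js/menu.js", "http://www.yourdomain.com/index.html"))  # False
print(is_forbidden("/js/menu.js", "http://example.org/page.html"))          # True
print(is_forbidden("/js/menu.js", ""))                                      # False
print(is_forbidden("/index.html", "http://example.org/page.html"))          # False
```

    Note the third case: a client that sends no referrer at all, which a crawler may well do, would not be blocked by these rules.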
     
    DangerMouse, Sep 1, 2005
  5. minstrel
    #5
    No, it isn't. That tells spiders that everything is disallowed, so they ignore the whole site. That's generally the last thing you want or need.

    It IS robots.txt, and if yours actually looks like that, it needs work fast.

    Which site has the robots.txt file in question? Post a URL and the relevant lines from your .htaccess file.

    minstrel, Sep 1, 2005
  6. simplexity
    #6
    Oops, sorry misohoni, please forgive me. That'll teach me to do multiple things at once! :eek:

    DangerMouse, that's quite interesting, thanks.

    simplexity, Sep 2, 2005
  7. draculus
    #7
    You can place your JavaScript in an external file, put that file in a folder, and then disallow access to that folder in your robots.txt.

    This has two benefits:

    1. It reduces the code-to-content overhead;
    2. Any bots that can and do read JavaScript will never see it, provided they honor robots.txt.

    draculus, Sep 18, 2005
  8. scottj
    #8
    I prevent any robot from spidering my CSS and JS files by using this:

    User-agent: *
    Disallow: /css
    Disallow: /js

    I put all CSS and JS files in the /css and /js directories and voila, no more spiders on those files. :)
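
    If you want to check what those rules mean to a crawler that follows the standard, Python's built-in `urllib.robotparser` can evaluate them. This is just a sketch: the `www.example.com` host, the file names, and the `allowed` helper are invented for the example, and it says nothing about bots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The three rules from the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /css
Disallow: /js
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url: str, agent: str = "Googlebot") -> bool:
    """Would a rule-following crawler be allowed to fetch this URL? (hypothetical helper)"""
    return parser.can_fetch(agent, url)


print(allowed("http://www.example.com/js/menu.js"))      # False
print(allowed("http://www.example.com/css/site.css"))    # False
print(allowed("http://www.example.com/index.html"))      # True
# Disallow values are path prefixes, so /js also covers this page:
print(allowed("http://www.example.com/json-feed.html"))  # False
```

    Because the match is on the path prefix, writing `Disallow: /js/` and `Disallow: /css/` with trailing slashes limits the rules to the directories themselves.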

    -Scott
     
    scottj, Nov 10, 2005
  9. DangerMouse
    #9
    This assumes that spiders adhere to the robots.txt standard... I have found this not to be the case!

    DangerMouse, Nov 11, 2005