403 forbidden

Discussion in 'Link Development' started by svetomir, Oct 3, 2005.

  1. #1
    Should I be linking to a site that consistently returns a "403 forbidden" result in my broken link check tool? Will it have any negative effect on my site?
     
    svetomir, Oct 3, 2005 IP
  2. soul-healer

    soul-healer Peon

    Messages:
    1,459
    Likes Received:
    145
    Best Answers:
    0
    Trophy Points:
    0
    #2
    The site is currently blocking links that are coming from your domain name.

    Some sites use this technique to block certain websites and save bandwidth. If they are blocking links from your site, your visitors will not be able to reach their site, so there is no point in linking to it. Logically, you should remove the link.
     
    soul-healer, Oct 3, 2005 IP
  3. svetomir

    svetomir Peon

    Messages:
    68
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #3
    I can actually navigate to that site from my directory. I was just a bit worried about the word "forbidden". I also get 404 error results, but when I click on the links the pages are there. So I am just a little bit confused.
    I think I'll get rid of the links that return 403. If they don't want people linking to their site, I am not going to waste my links.
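
A page that loads in a browser but fails in a link checker may be responding differently depending on the client's User-Agent header. The following is a minimal Python sketch for comparing the two cases, not something taken from this thread: the function names `status_for` and `compare_agents` are hypothetical, and the User-Agent strings you pass in are your own choice.

```python
import urllib.error
import urllib.request


def status_for(url, user_agent, opener=urllib.request.urlopen):
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still
        # the value we want to report.
        return err.code


def compare_agents(url, agents, opener=urllib.request.urlopen):
    """Map each User-Agent string to the status code it receives."""
    return {agent: status_for(url, agent, opener) for agent in agents}
```

If the same URL returns 200 for a browser-like string and 403 for the link checker's string, the 403 is probably a user-agent block rather than a dead page. The `opener` parameter exists only so the function can be exercised without network access.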
     
    svetomir, Oct 3, 2005 IP
  4. minstrel

    minstrel Illustrious Member

    Messages:
    15,082
    Likes Received:
    1,243
    Best Answers:
    0
    Trophy Points:
    480
    #4
    I keep seeing people who post robots.txt files blocking link-checkers like Xenu.

    In my opinion, that's just plain dumb. I've used Xenu for years. If it tells me it can't connect to a page, I delete the link to that page. When I see people recommending blocking link-checkers, I ask them if that's what they are hoping to have happen...

    I would recommend you delete the link and email the webmaster to tell him why - educate the poor sod.
     
    minstrel, Oct 3, 2005 IP
  5. Storebuilder

    Storebuilder Peon

    Messages:
    13
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Xenu Link Sleuth is just one of a number of robots, spiders, harvesters, etc. that are denied access by default in the .htaccess file of phpNuke.

    If you can think of a good reason why it shouldn't be there then I'll listen.

    #The next lines check for Email Spammers Robots and redirect them to a fake page
    RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^asterias [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BackDoorBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Black.Hole [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BlowFish [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BotALot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BuiltBotTough [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bullseye [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BunnySlippers [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Cegbfeieh [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CheeseBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CopyRightCheck [OR]
    RewriteCond %{HTTP_USER_AGENT} ^cosmos [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DittoSpyder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EroCrawler [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Foobot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FrontPage [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Googlebot-Image [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Harvest [OR]
    RewriteCond %{HTTP_USER_AGENT} ^hloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
    RewriteCond %{HTTP_USER_AGENT} ^httplib [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HTTrack [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^humanlinks [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Indy\ Library [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^InfoNaviRobot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JennyBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Kenjin.Spider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Keyword.Density [OR]
    RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LexiBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^libWeb/clsHTTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkextractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkScan/8.1a.Unix [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^lwp-trivial [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mata.Hari [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIIxpc [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister.PiX [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
    RewriteCond %{HTTP_USER_AGENT} ^moget [OR]
    #RewriteCond %{HTTP_USER_AGENT} ^Mozilla/2 [OR]
    #RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3.Mozilla/2.01 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NPBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline.Explorer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ProPowerBot/2.14 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ProWebWalker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^QueryN.Metasearch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RepoMonkey [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RMA [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SpankBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^spanner [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^suzuran [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Szukacz/1.4 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Telesoft [OR]
    RewriteCond %{HTTP_USER_AGENT} ^The.Intraformant [OR]
    RewriteCond %{HTTP_USER_AGENT} ^TheNomad [OR]
    RewriteCond %{HTTP_USER_AGENT} ^TightTwatBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Titan [OR]
    RewriteCond %{HTTP_USER_AGENT} ^toCrawl/UrlDispatcher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^True_Robot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^turingos [OR]
    RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot/1.5 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^URLy.Warning [OR]
    RewriteCond %{HTTP_USER_AGENT} ^VCI [OR]
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebBandit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebEnhancer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web.Image.Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebmasterWorldForumBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website.Quester [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webster.Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZip [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WWW-Collector-E [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xenu's [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus
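
To see which clients a list like this would catch, the conditions can be approximated in Python. This is only a sketch: it copies four patterns from the list above, treats `[NC]` as a case-insensitive match, and uses made-up User-Agent strings in the usage note. The name `is_blocked` is hypothetical.

```python
import re

# (pattern, case_insensitive) pairs copied from the list above.
# The second field mirrors the [NC] flag where the original line has it.
BLOCK_PATTERNS = [
    (r"^EmailSiphon", False),
    (r"^FrontPage", True),
    (r"^Download\ Demon", False),
    (r"^Xenu's", False),
]


def is_blocked(user_agent: str) -> bool:
    """Return True if any pattern matches, like a chain of [OR] conditions."""
    for pattern, nocase in BLOCK_PATTERNS:
        flags = re.IGNORECASE if nocase else 0
        # re.search is used because the patterns carry their own ^ anchor.
        if re.search(pattern, user_agent, flags):
            return True
    return False
```

With these patterns, an illustrative string such as `"Xenu's Link Sleuth 1.2"` is blocked, while `"emailsiphon"` in lower case is not, because that line has no `[NC]` flag.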
     
    Storebuilder, Oct 17, 2005 IP
  6. minstrel

    minstrel Illustrious Member

    Messages:
    15,082
    Likes Received:
    1,243
    Best Answers:
    0
    Trophy Points:
    480
    #6
    I thought I'd already done that. Like many other people, I use Xenu Link Checker to scan for dead links on my main site (100+ pages of categorized links). If I have a link to one of your pages and you block Xenu, I'll get a report from Xenu saying it couldn't access that page. Since I'm not about to individually check every page that yields an error, in most cases I'll simply delete that link from my site.

    That means if you've blocked Xenu you've just lost a backlink to one of your pages.
     
    minstrel, Oct 17, 2005 IP
  7. Storebuilder

    Storebuilder Peon

    Messages:
    13
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #7
    Ok I understand. I've removed it from the .htaccess file. Thanks.
     
    Storebuilder, Oct 17, 2005 IP
  8. Storebuilder

    Storebuilder Peon

    Messages:
    13
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #8
    Thanks for looking.
     
    Storebuilder, Oct 17, 2005 IP
  9. aeiouy

    aeiouy Peon

    Messages:
    2,876
    Likes Received:
    275
    Best Answers:
    0
    Trophy Points:
    0
    #9
    Minstrel,

    On your link checking, are you one and done? I.e., if a link fails, do you remove it the first time?
     
    aeiouy, Oct 17, 2005 IP
  10. minstrel

    minstrel Illustrious Member

    Messages:
    15,082
    Likes Received:
    1,243
    Best Answers:
    0
    Trophy Points:
    480
    #10
    Not always, no. It depends on the site... and on how busy I am that day... and on whether I want to do more exploration.

    If it's a link to a site on a topic that is already well-represented on that page, I'd probably just dump it. If it's more unique, I might go to the trouble of checking it again in a day or two...

    Xenu also tells you what the "error" was, i.e., page not found, access forbidden, request timed out, etc. So it would depend again on how busy I am that day, whether I felt like investigating further, and what error was returned for the link.
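
The kind of triage described here could be sketched in Python roughly as follows. This is one possible policy written for illustration, not Xenu's behavior and not a rule stated by anyone in the thread; the function name `triage` and the action labels are assumptions.

```python
def triage(status, is_unique_resource):
    """Suggest an action for a link that failed an automated check.

    status: HTTP status code, or None if the request timed out.
    is_unique_resource: True if the topic is not already well covered.
    """
    if status is None or 500 <= status < 600:
        # Timeouts and server errors are often transient.
        return "recheck later"
    if status == 403:
        # Forbidden may mean the checker is blocked, not that the page is gone.
        return "verify in browser"
    if status in (404, 410):
        # A unique resource may be worth a second look in a day or two.
        return "recheck later" if is_unique_resource else "remove"
    return "keep"
```

For example, `triage(404, False)` suggests removing the link, while `triage(403, True)` suggests opening the page in a browser first.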

    However, on a busy day, for a site that isn't unique, it would be frankly a lot simpler for me to just delete the link. That's my point - if you block Xenu or similar link-checkers, you run the risk of losing back links. Why would you want to take that chance for a request from a benign probe?

    I don't recommend EVER using one of those one-size-fits-all htaccess blockers. I'm amazed at some of the things I see in those files, like SE spiders. At the very least, check each entry and decide for yourself whether it's something YOU want to block.

    I'm especially surprised that Xenu is blocked by default for phpNuke - I've never used phpNuke but that seems to me to be a bad idea.
     
    minstrel, Oct 17, 2005 IP