So how often do other directory owners remove dead URLs? How frequently do you clean up dead and redirected URLs? I've been doing it pretty much daily lately. I would say it comes to about 75 per week in one of my directories. Seems like I spend as much time keeping things clean as I do reviewing new sites. Others?
With the US economy as it is, we are losing a lot of websites every day. Keeping up with the removals is becoming a bit of a chore. I do mine every day on all directories and hate to see so many websites going down. Some have been with me since 2001.
I've seen a lot disappear, but then again, I've still got a steady number of listings coming in... Seems turnover is a bit higher lately, though. Yes, but I still manually review them. The automatic script does not catch the ones that go down, then magically reappear as registrar landing pages. Or just zero-content pages, or sites that change content and change hands. Our script will check the site several times, and if it does become unreachable it is not deleted immediately. There are several checks to ensure there are no false positives. Site owners are also given time to correct whatever problem they may be having that got the site temporarily delisted... What I am talking about is the final phase in the process, where we manually review the site to make sure 100% that it is dead, gone, whatever... Hope that explains it for you.
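For anyone curious how you'd even start to catch those "magically reappearing" registrar pages automatically, here's a very rough sketch in Python. It is not what our script does; the keyword list and the length cutoff are just illustrative guesses that would need tuning, and anything it flags should still go to manual review.

```python
import urllib.request

# Phrases that commonly show up on registrar landing / parked pages.
# Illustrative only; a real list would be much longer and tuned over time.
PARKED_PHRASES = [
    "this domain is for sale",
    "domain parking",
    "parked free, courtesy of",
    "buy this domain",
    "the domain owner has not yet put up a website",
]

def looks_parked(url, timeout=15):
    """Return True if the page answers 200 but reads like a parked page."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            html = resp.read(100_000).decode("utf-8", errors="ignore").lower()
    except Exception:
        return False  # unreachable sites are caught by the normal dead-link check
    # A 200 response with almost no content, or with parking phrases,
    # gets flagged for manual review rather than deleted outright.
    if len(html.strip()) < 200:
        return True
    return any(phrase in html for phrase in PARKED_PHRASES)
```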
At my directory, a script checks the URLs weekly (looking for 404 pages, bandwidth exceeded, parked domains, forbidden access, hacked sites, etc.) and temporarily 'mothballs' those listings. Each directory listing has a counter where the number of recurring failures is saved. After a good few recurring failures the listing is manually deleted. Another script looks for updates; if it finds updates on a mothballed listing, the listing is automatically turned back to active, resetting its counter to 0. As mothballed listings aren't displayed on my directory, I keep them until they are de-indexed by Google. Parked domains are deleted immediately.
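The counter-and-mothball idea could look roughly like this in Python (a minimal sketch only: the field names, the threshold of 5, and folding the separate update-checking script into one function are my simplifications, not the actual setup):

```python
import urllib.request
import urllib.error

FAIL_LIMIT = 5  # assumed number of recurring failures before manual deletion is queued

def check_listing(listing):
    """One pass of the weekly check: bump or reset the failure counter."""
    try:
        req = urllib.request.Request(listing["url"], method="HEAD")
        urllib.request.urlopen(req, timeout=15)
        ok = True
    except (urllib.error.HTTPError, urllib.error.URLError, OSError):
        ok = False

    if ok:
        # Site answers again: reactivate a mothballed listing and reset its counter.
        listing["fails"] = 0
        listing["status"] = "active"
    else:
        listing["fails"] += 1
        # Mothballed listings stay hidden from the directory but aren't deleted yet.
        listing["status"] = "mothballed"
        if listing["fails"] >= FAIL_LIMIT:
            listing["status"] = "pending_manual_delete"
    return listing
```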
However, as time passes many high-quality websites are sold, or they simply die for one reason or another. Also, many well-maintained websites go down temporarily, and quality makes no difference there. When that happens, it's advisable to (temporarily) keep 'out of service' websites out of the directory.
Because he actually has a good directory where he cares about the quality of the links, not PR and other stuff.
I've been wondering that as well. A lot of surprisingly big names are being affected by this crisis. I think the finance, construction and property categories are due for more frequent scrutiny.
But that is not going to help you find sites that may be dead, but still appear active because they were either sold, moved or parked. You're still going to have dead sites no matter what the quality. I guess it just depends on the volume. This is pretty much exactly what we do. But as you said, the recurrent failures are manually deleted. There are way too many chances for false positives. Bingo! Even the best of sites will die off for any number of reasons. Exactly!
I'd be interested in knowing that. We coded our own to do this. It's pretty involved and I'd rather not repeat it if I don't have to. I bought a directory based on phpLD and would like to know as well.
I do mine with a script... it finds all the 404s, 302s and 301s as well as the parked pages. They get de-listed on the first failure and automatically dumped on the fifth. And no, sorry, it's not for phpLD.
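Not the poster's actual script, and not phpLD-specific, but the status-code side of a check like that might look something like this; the "de-list on the first failure, dump on the fifth" rule is taken straight from the post above, everything else is assumed:

```python
import urllib.request
import urllib.error

def fetch_status(url, timeout=15):
    """Return the raw HTTP status code without following redirects."""
    class NoRedirect(urllib.request.HTTPRedirectHandler):
        def redirect_request(self, req, fp, code, msg, headers, newurl):
            return None  # stop here so 301/302 are reported as-is
    opener = urllib.request.build_opener(NoRedirect)
    try:
        return opener.open(url, timeout=timeout).getcode()
    except urllib.error.HTTPError as e:
        return e.code   # covers 301/302 (blocked redirects), 404, and other errors
    except (urllib.error.URLError, OSError):
        return None     # DNS failure, timeout, connection refused

def apply_policy(listing, status):
    """De-list on the first bad result, dump on the fifth."""
    bad = status is None or status in (301, 302, 404)
    if not bad:
        listing["fails"] = 0
        listing["listed"] = True
        return listing
    listing["fails"] = listing.get("fails", 0) + 1
    listing["listed"] = False           # de-listed on the first failure
    if listing["fails"] >= 5:
        listing["action"] = "delete"    # dumped on the fifth
    return listing
```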
OK, what I do is run the validator, since my sites at this time are phpLD. Then I go through all the links it says are not active. The ones that are still active I keep; the ones that are dead or have changed to spam pages I delete.