Here are my thoughts on the similarity checker: it can work, but only if your editors are prepared to check the sites flagged by the program, so it is a lot of added work. A better similarity checker could be one that crawls a page, or multiple pages, to work out its topics and checks whether they match the category and title text of the submission. Comparing to see whether the site has changed does not work. After all, should you penalize a site that goes through a complete restructure? One of my sites is about to get a whole new face and new content this week. It's better looking and has way more to it than the old one did. So basically: definitely not auto-deleting, but auto-flagging could be good.

an0n, I agree with you in part about "suggest a category" in the submit form generating spam. However, it is the job of the script to prevent as much spam as possible while creating a better experience for both editors and users.

dkessaris added some good ideas which I'd say are important. I'd like some of my categories to be cheaper than others, and perhaps I'd like to allow the first 2 submissions to be free. This would be nice.

One of the keys to adding all these features would be a very easy-to-use, intuitive admin system. Installation would also need to be very simple.

dvduval really hit some good points: optimizing code for security and speed is essential. It's the difference between a mature script and a first edition.

Here are a couple of add-ons I have in mind:

Page Strength - A tool that looks at many of the same factors as the Page Strength tool at SEOmoz.
Title in Description - You submit a title and a description; however, the description must contain one instance of the title. This will be the link. Something similar to ask-dir.com.
Easy Mod Panel - A mod panel that allows you to easily install, remove, and modify mods.
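For the pricing idea above (some categories cheaper than others, first 2 submissions free), here is a rough sketch of how the rule could be expressed. Everything in it is an assumption for illustration: the `PricingPolicy` name, the prices, and the idea that "first 2 submissions" is counted per submitter are not taken from any existing directory script.

```python
from dataclasses import dataclass, field


@dataclass
class PricingPolicy:
    """Hypothetical pricing rule: per-category prices plus a few free submissions."""
    default_price_cents: int = 1000            # assumed fallback price
    category_prices_cents: dict = field(default_factory=dict)
    free_submissions: int = 2                  # "first 2 submissions free"

    def price_for(self, category: str, prior_submissions: int) -> int:
        # A submitter who has not used up the free allowance pays nothing.
        if prior_submissions < self.free_submissions:
            return 0
        # Otherwise use the category's own price, or the default if none is set.
        return self.category_prices_cents.get(category, self.default_price_cents)


policy = PricingPolicy(category_prices_cents={"Blogs": 500})
print(policy.price_for("Blogs", prior_submissions=0))     # free allowance
print(policy.price_for("Blogs", prior_submissions=2))     # cheaper category
print(policy.price_for("Shopping", prior_submissions=5))  # default price
```

Prices are kept as integer cents to avoid floating-point rounding; an admin panel would only need to edit the `category_prices_cents` mapping.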
And with regard to David speaking about upping security, I've recently come across a few things I was not happy with seeing in a few directories. Anywho, long story short, those types of submissions won't be getting through anymore, and should I find any other types of quirky submissions, those will be readily dealt with and taken care of as well. Boby is and always will be on the up and up to secure things when needed.
Of course the sites need to be reviewed by a human. And, IMO, it's something that every directory should be doing anyway (validating that existing listings have not pulled a bait & switch or changed significantly). The checker just helps narrow the pool.

I don't agree here. I don't trust semantic algorithms to do my editing work for me.

I think you missed the point. The point is to identify sites that need to be re-reviewed. They could be just fine, or they may need to be re-categorized or deleted.

Page Strength - I would not use it, but then, I don't sort listings by PR either.
Title in Description - sounds like you'd like to be able to embed links in the description. I think it's a great idea, but should be expanded to allow for multiple (deep) links too.
Easy Mod Panel - vB sets the standard as far as I'm concerned.
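The "Title in Description" idea could look roughly like the sketch below. It only covers the single-link version (exactly one instance of the title in the description); the multiple deep links suggestion is not attempted. The function name and the choice to reject zero or repeated matches are assumptions, not how ask-dir.com or any script actually does it.

```python
import html
import re


def link_title_in_description(title: str, description: str, url: str) -> str:
    """Return the description as HTML with its single title occurrence linked."""
    title = title.strip()
    if not title:
        raise ValueError("title is empty")
    # Case-insensitive, literal search for the title inside the description.
    matches = list(re.finditer(re.escape(title), description, flags=re.IGNORECASE))
    if len(matches) != 1:
        raise ValueError("description must contain the title exactly once")
    m = matches[0]
    before, hit, after = description[:m.start()], m.group(0), description[m.end():]
    # Escape everything so submitted text cannot inject markup.
    return (
        f"{html.escape(before)}"
        f'<a href="{html.escape(url, quote=True)}">{html.escape(hit)}</a>'
        f"{html.escape(after)}"
    )


print(link_title_in_description(
    "Acme Widgets",
    "Buy from Acme Widgets today & save.",
    "http://example.com/",
))
```

A submission form would call this at validation time and show the `ValueError` message to the submitter instead of accepting the listing.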
It was vaguely mentioned today, and will require an extensive amount of time and effort to accomplish. Who knows what the future will bring though! phpLD has been setting and raising the bar for quite some time now, and I don't see things slowing down anytime soon.
Apexoo does something like this - it crawls to check that the metas of a site contain the same keyword as the category you submit to. It's sometimes tedious when you can't get a site in a precise enough category, but I can certainly see its value.
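A check along those lines might be sketched as follows. This is not Apexoo's actual code, just an illustration under assumptions: the page HTML has already been fetched, only the `keywords` and `description` meta tags are read, and a single shared word counts as a match. `MetaCollector` and `category_matches_meta` are made-up names.

```python
import re
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects the content of <meta name="keywords"> and <meta name="description">."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if name in ("keywords", "description"):
            self.meta[name] = attrs.get("content") or ""


def words(text: str) -> set:
    # Lowercased alphanumeric tokens; punctuation such as ">" or "," is dropped.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def category_matches_meta(page_html: str, category: str) -> bool:
    """True if any word of the category name appears in the page's meta tags."""
    parser = MetaCollector()
    parser.feed(page_html)
    meta_words = words(" ".join(parser.meta.values()))
    return bool(meta_words & words(category))


page = '<html><head><meta name="keywords" content="gardening, roses, tools"></head></html>'
print(category_matches_meta(page, "Home > Gardening"))
print(category_matches_meta(page, "Finance"))
```

Exact word matching is what makes this kind of check tedious with narrow categories; a listing that fails would more sensibly be flagged for an editor than rejected outright.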
I think you misread what I said, or perhaps I wrote it badly. First off, I said that when a similarity checker finds a problem it would flag the site, and then a reviewer would need to come by. This isn't to say that a review doesn't need to take place at first.

Who said it's doing your editing work? This is the same idea as a similarity checker, just a different principle than looking for big changes in the code or the words on a page.

That is what I said, I think...

I didn't say Page Strength; I said your own Page Strength-like thing, so this would be a conglomeration of different factors created for yourself.

Yeah, that could be a good addition to it.
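To make the "flag, then a reviewer comes by" flow concrete, here is a minimal sketch using Python's standard `difflib`. It assumes the directory stored the page text at approval time and has a fresh copy to compare against; the `needs_rereview` name and the 0.6 threshold are arbitrary illustrations. Nothing is deleted, the function only says whether a listing should go back into the review queue.

```python
from difflib import SequenceMatcher


def needs_rereview(old_text: str, new_text: str, threshold: float = 0.6) -> bool:
    """Flag a listing when the page text has drifted far from the approved snapshot."""
    # Compare word sequences rather than characters so small edits barely move the ratio.
    ratio = SequenceMatcher(None, old_text.split(), new_text.split()).ratio()
    return ratio < threshold


approved = "cheap flights and hotel deals for europe"
print(needs_rereview(approved, approved))                      # unchanged page
print(needs_rereview(approved, "online casino poker bonus"))   # bait and switch
```

A legitimate redesign would be flagged too, which is exactly why the result should feed a human review queue rather than an automatic delete.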
You don't need to; we've implemented this in phpLynx, which is due for release very shortly. As someone with big ideas as well as modest ones, don't waste them: come over and share them with us. They will be listened to, that's for certain, and if viable we will implement them.

phpLynx - The ShareSource Script
Rob made a good point. That's why phpLynx implemented BOTH 'suggest a category' and the submission together, but protected with a TEXT captcha that is not very easy for a human to beat, let alone bots or directory submission experts. If the directory owner is diligent, he can create or alter any number of questions for the captcha, which in effect can make getting the same answer correct more than once or twice as difficult as winning the lottery.
^^^ If the questions are done right, a 5 year old can answer the questions correctly, but a bot won't be able to solve it.
That sounds ominous, with regard to the trouble a lot of people are having getting the image captcha right. It isn't supposed to tax your intelligence, unless you're running a niche directory for Mensa members! Writing a good text captcha is quite an art.
What, like how many legs has a spider got? Is the moon red or yellow? What is a car supposed to do at a red light? Sorry for it to sound ominous, Obelia, but what I was getting at is that it depends on the directory owner: the text captcha we have in place can be made kindergarten level or Mensa's top 100 Club! The point is that no matter what question you put in, you have to type in the answer regardless, and that's vital IMHO. Oh, we also give you the option to use graphics, so you can make it a little easier, like: is this a picture of an apple? Yes or No!
How good are screen readers at this point? Can they even defeat letters that are rotated just a bit? How about an arrow pointed to the captcha on the other side of the screen? Or how about a captcha like the following:

234234
What is red? [ ]
As long as you accept both 8 and eight as the correct answer.

Depends on atmospheric conditions. It's more of a pale grey, but sometimes it's bright orange.

Okay, I give you that one, it's international.

My point is just that you have to find easy questions that allow for no ambiguity in the answer, and it's easy to go wrong on that point.

One of the problems you might encounter if you ship a captcha with all the question/answer combinations prefilled is that nobody bothers to modify it. That is worse than useless, because it makes it simple to exploit. One thing I have noticed whilst submitting recently is the popularity of a text captcha that takes this form: "How do you spell "John" backwards?" There is no point in a trivia captcha if everybody uses the exact same version. One thing that might be worth considering is removing all of your prefilled questions and just leaving guidelines for good captcha question writing.

Using graphics negates one of the benefits of the text captcha, namely that it is accessible to blind and partially sighted users. And anything with a yes/no answer will be wide open to spam.
And Eight.

Our script takes into account up to five variants of an answer, or one if that's all you want.

Agreed, and you are the master of this as the directory owner and question setter. The point I was making was that, from the simplest to the most complex of questions, you simply HAVE TO TYPE an answer.

Good point. This is why ours is 100% editable from the ACP, so it will NEVER be the same version, as the script is provided with a naked database. If it gets beaten, it will be due to the complacency of the directory owner, not the script.

Repeating the above, there are no pre-filled questions; this is up to the individual directory owner. We really have tried to cover all bases here.

Not necessarily. If one is blind they would have problems with ANY captcha, so I think you're stretching it a bit here, but it is food for thought.

Nothing is impossible, only improbable!! Wait until it comes out in a couple of days and you'll see what I mean.
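The "8 / eight / Eight" point and the up-to-five-variants rule can be illustrated with a small sketch. This is not phpLynx code (which has not been shown in this thread); the function names and the normalization rules (trim, lowercase, collapse spaces) are assumptions.

```python
MAX_VARIANTS = 5  # "up to five variants of an answer, or one"


def normalize(answer: str) -> str:
    # Ignore case and stray whitespace so "Eight", " eight " and "EIGHT" all match.
    return " ".join(answer.strip().lower().split())


def is_correct(submitted: str, accepted_variants: list) -> bool:
    """Check a typed captcha answer against the admin's accepted variants."""
    if not 1 <= len(accepted_variants) <= MAX_VARIANTS:
        raise ValueError("a question needs between one and five accepted answers")
    return normalize(submitted) in {normalize(v) for v in accepted_variants}


spider_answers = ["8", "eight"]
print(is_correct(" Eight ", spider_answers))
print(is_correct("7", spider_answers))
```

Because the answer has to be typed and is compared against a short list, a yes/no style question would leave only two guesses, which is the weakness raised earlier in the thread.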
Obelia, the system works similarly to the phpBB anti-bot mod that I referenced earlier. You can define a set of questions, and any given submission is presented with a question chosen randomly from the set. Bot owners will have a hard time solving them programmatically or keeping an up-to-date database of current questions (because the admin can change them as soon as s/he suspects a bot is targeting the current set). When you consider multiple directories using the same system, it would be impractical for anyone to maintain an effective bot. I do agree with you that good instructions would be an excellent idea.
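As a rough picture of that mechanism (a sketch under assumptions, not the phpBB mod or any directory script), the question pool can be a plain mapping the admin edits, with one question drawn at random per submission. The pool contents, `pick_question`, and `check_answer` are hypothetical.

```python
import secrets

# Hypothetical admin-defined pool: question id -> text and accepted answers.
QUESTION_POOL = {
    "spider": {"text": "How many legs has a spider got?", "answers": ["8", "eight"]},
    "light": {"text": "What should a car do at a red light?", "answers": ["stop"]},
}


def pick_question(pool: dict) -> tuple:
    """Return (question_id, question_text) chosen at random from the pool."""
    if not pool:
        raise ValueError("the admin has not defined any questions yet")
    question_id = secrets.choice(sorted(pool))
    return question_id, pool[question_id]["text"]


def check_answer(pool: dict, question_id: str, submitted: str) -> bool:
    """Validate the answer for the question that was actually shown."""
    question = pool.get(question_id)
    if question is None:
        # The admin removed or replaced this question since the form was served.
        return False
    return submitted.strip().lower() in {a.lower() for a in question["answers"]}


qid, text = pick_question(QUESTION_POOL)
print(text)
print(check_answer(QUESTION_POOL, "spider", "Eight"))
```

The question id would be kept server-side (for example in the session) between showing the form and checking the answer, so that swapping the pool invalidates whatever a bot has memorized.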
The funniest part of all this CAPTCHA stuff is that the major submitter(s) to directories won't be submitting to your director(y/ies) if you have it turned on anyway. el oh el