In this CMS, each user can insert HTML into pages, but I can restrict any tag that poses a security threat. Aside from <script>, what other tags should I take out?
There are a lot of tags. Leaving a <noscript>, <textarea> or <style> tag unclosed (these are a few of many) will swallow the rest of the page. If I remember correctly, you can also have <img onload="javascript:badcode;"> - or just put the handler on a <body> tag. The options are endless. Jay
<noscript>, <input>, <big> (if they use too much of these they can mess up the whole page), <textarea>, and some marquee code too. BTW, iframes are the worst =/
You should clean img tags too. Most tags need to be cleaned, and that's why allowing HTML is a bad idea. Better to create BBCode for your site.
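To make the BBCode suggestion a bit more concrete, here is a minimal sketch in Python (the thread itself doesn't name a language, so that choice is mine). The tag set `SIMPLE_TAGS` and the function name `bbcode_to_html` are hypothetical, and this is not how any particular BBCode library works. The idea is just: escape everything the user typed first, then expand a small fixed set of BBCode pairs into HTML you generate yourself.

```python
import html
import re

# Hypothetical tag set for illustration: BBCode name -> HTML element name.
SIMPLE_TAGS = {"b": "strong", "i": "em", "u": "u"}


def bbcode_to_html(text: str) -> str:
    """Escape all user input, then expand a fixed set of BBCode pairs."""
    # Escaping first means any raw HTML the user typed is rendered as inert text.
    out = html.escape(text, quote=True)
    for bb, tag in SIMPLE_TAGS.items():
        # Non-greedy match of [b]...[/b]; unmatched tags stay as literal text.
        pattern = re.compile(
            r"\[%s\](.*?)\[/%s\]" % (bb, bb), re.IGNORECASE | re.DOTALL
        )
        out = pattern.sub(r"<%s>\1</%s>" % (tag, tag), out)
    return out
```

For example, `bbcode_to_html("[b]hi[/b] <script>alert(1)</script>")` gives bold text followed by the script tag as visible, escaped text. Note that this sketch does not repair badly nested BBCode such as `[b][i]x[/b][/i]`; a real parser would need to handle that.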
Plain <img> tags can be exploited even without JavaScript. For instance, if someone put an image in a forum that looked like this: <img src="/logout.php" /> it would log out anybody who viewed that page, because each viewer's browser requests that URL with the viewer's own cookies. The easiest way to handle this might be to use the PEAR BBCode extension and allow your users to use BBCode.
I wouldn't recommend allowing HTML. Allow BBCode if you must and know what you're doing, nothing more. If you really, really must allow HTML, use a whitelist rather than a blacklist. That is, don't try to filter out everything you consider harmful; only allow basic tags that you consider harmless. There are just too many ways to trigger the execution of potentially malicious JavaScript that you wouldn't even remotely think of, ever. Just a quick example to make my point: did you know the <marquee> tag has an onbounce event handler when its behavior attribute is set to alternate? (IE) .. Didn't think so.
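A rough sketch of the whitelist idea, using only Python's standard `html.parser` module (the language choice is mine, not the thread's). The names `ALLOWED_TAGS`, `WhitelistSanitizer` and `sanitize` are made up for illustration, and the tag list is only an assumption about what a site might consider harmless. It keeps whitelisted tags, strips every attribute (so `onload`, `onbounce` and friends never survive), drops all other tags, escapes text, and closes anything left open so an unclosed tag cannot swallow the rest of the page. Treat it as a starting point, not a vetted sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration; pick the tags your site actually needs.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br", "ul", "ol", "li"}
VOID_TAGS = {"br"}  # whitelisted tags that never take a closing tag


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of sanitized output
        self.open_tags = []  # stack of allowed tags still open

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # anything not whitelisted is dropped, <marquee> included
        self.out.append("<%s>" % tag)  # all attributes are discarded
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray or non-whitelisted closing tag
        # Close everything opened after the matching tag, then the tag itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))  # text is always re-escaped

    def close(self):
        super().close()  # flush any buffered text first
        while self.open_tags:  # then close whatever the user left open
            self.out.append("</%s>" % self.open_tags.pop())


def sanitize(fragment: str) -> str:
    parser = WhitelistSanitizer()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)
```

With this sketch, `sanitize('<marquee behavior="alternate" onbounce="evil()">x</marquee>')` comes back as plain `x`, and an `<img>` tag is dropped entirely because it is not on the list. If you later decide to allow attributes such as `href`, each one needs its own whitelist and value check.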