I'm writing an HTML parser to extract information from web pages, but I haven't found a fast, decent one to work with. Any language is OK. Right now I'm testing PHP and Java. Any ideas?
What are you trying to get from the web pages? I think this is close to what you are looking for: http://www.phpclasses.org/browse/package/1754.html Look around that site for plenty more related classes; I'm sure one will do the job.
That site again. I suppose I'll need to sign up, then. You can see an example of what I want to do on my blog: the Technorati Tool in the top menu was built by hand. I'd like something more general-purpose for that kind of thing. Just imagine a tree in which you can look up elements by specific ids.
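To make the "tree where you can look for specific ids" idea concrete, here is a minimal sketch in Python that uses only the standard library's html.parser module. The names Node, TreeBuilder, parse_html and by_id are made up for illustration, and the sample markup is invented; this is not how the Technorati Tool was built, just one way such a lookup could work. It makes no claims about speed and only does rough recovery from unclosed tags.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Node:
    """One element of the tree (hypothetical helper class)."""

    def __init__(self, tag, attrs, parent=None):
        self.tag = tag
        self.attrs = dict(attrs)
        self.parent = parent
        self.children = []  # child Node objects and text strings, in document order

    def text(self):
        """Concatenate the text of this node and all of its descendants."""
        return "".join(c if isinstance(c, str) else c.text() for c in self.children)


class TreeBuilder(HTMLParser):
    """Builds a Node tree and an id -> Node index while parsing."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document", [])
        self.stack = [self.root]  # currently open elements
        self.by_id = {}           # first element seen for each id

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, parent=self.stack[-1])
        self.stack[-1].children.append(node)
        node_id = node.attrs.get("id")
        if node_id:
            self.by_id.setdefault(node_id, node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray closing tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def parse_html(markup):
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder


# Invented sample page, including an unclosed <li> to mimic sloppy markup.
page = ('<div id="menu"><ul><li id="technorati">Technorati <b>Tool</b></li>'
        '<li>About</ul></div>')
doc = parse_html(page)
print(doc.by_id["technorati"].text())  # prints "Technorati Tool"
```

Looking up an element is then a plain dictionary access on by_id, and from the returned node you can walk up through parent or down through children.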
I'm testing JTidy right now. I wonder if there is a speed comparison somewhere on the net. A simple PHP parser would also be helpful for my blog, because I'd like to explain to my readers what can be done with a parser, but I don't want to teach Java.
I used the Internet Explorer COM object for parsing. It has performance problems, but because this COM object is part of the browser, it is the most decent parsing tool I have found.