Hi folks, I need some advice on matching and extracting the alt= and src= attributes of <img> tags. The source text is this

My first approach extracts the src= portion, and that is working fine:

[CODE]
preg_match_all("/\< *[img][^\>]*src *= *[\"\']{0,1}([^\"\'\ >]*)/", $html, $matches);
[/CODE]

Question: I also need the matching alt= attribute. I tried the following, but it didn't do the job
I made some progress with the previous code, but it still needs improving. Here is the text again:

[CODE]
<p align="center"><a title="Gato con carrito de compra de supermercado invisible" class="imagelink" rel="attachment" id="p203" href="http://www.ecnc.com/blog/2007/07/23/voy-al-supermercado-me-compras-unas-toallas-femeninas-mi-amor/gato-con-carrito-de-compra-de-supermercado-invisible/"> <img alt="Gato con carrito de compra de supermercado invisible" id="image203" src="http://www.lecnc.com/blog/wp-content/uploads/2007/07/carrito_de_compra_invisible.jpg" /></a></p>
<p>Fuente: <a target="_blank" title="Funny animals" href="http://www.flickr.com/photos/funny_animals/380169474/">Flickr Funny Animals </a></p>
[/CODE]

I'm trying to match both alt= and src=. This is the regular expression I'm using:

[CODE]
/\< *[img][^\>]*alt *= *[\"\']{0,1}([^>]*) *src *= *[\"\']{0,1}([^\"\'\ >]*)/
[/CODE]

It matches, but match 1 captures more than I want. The alt= capture comes back as

[CODE]
Gato con carrito de compra de supermercado invisible" id="image203"
[/CODE]

Now I need to drop the id="image203" part from the capture. Any suggestions are much appreciated.
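First, about your last expression: match 1 runs on because ([^>]*) is allowed to swallow quotes, so it keeps going until it reaches src=. Putting a capital 'U' after the closing regex delimiter makes every quantifier ungreedy, but that alone won't help here: the alt group still has to stretch all the way to src=, and the final src group turns lazy and captures nothing. Stopping the alt capture at the closing quote is the more direct fix:

[CODE]
/\< *img[^\>]*alt *= *[\"\']{0,1}([^\"\'>]*)[\"\']{0,1}[^>]*?src *= *[\"\']{0,1}([^\"\'\ >]*)/
[/CODE]

However, since your img tag attributes are not always in the same order, I would do this in a couple of steps instead of all at once. It is a little slower, but it will be more reliable and easier to maintain in the future. First I would match all the image tags, then loop through the results and match the different fields:

[CODE]
// Step 1: capture the attribute string of every <img> tag.
preg_match_all('#<img\s+([^>]*)>#i', $html, $matches_images);
$images = array();
foreach ($matches_images[1] as $id => $attributes) {
    // Step 2: pull each attribute out on its own, so their order does not matter.
    // (["\']) remembers the opening quote and \1 requires the same one to close.
    $src = preg_match('#\bsrc\s*=\s*(["\'])(.*?)\1#i', $attributes, $match_srcs) ? $match_srcs[2] : '';
    $alt = preg_match('#\balt\s*=\s*(["\'])(.*?)\1#i', $attributes, $match_alts) ? $match_alts[2] : '';
    $images[$id] = array('src' => $src, 'alt' => $alt);
}
[/CODE]

-the mole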
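For anyone who wants to try the two-step idea end to end, here is a minimal, self-contained sketch. The function name extract_images and the second test tag are made up for illustration; the first tag is the one from the question. It assumes quoted attribute values (unquoted ones come back as null) and is only a sketch for simple markup like the sample, not a general HTML parser.

```php
<?php
// Hypothetical helper (not from the thread): returns one
// ['src' => ..., 'alt' => ...] entry per <img> tag, in document order,
// regardless of the order the attributes appear in.
function extract_images(string $html): array
{
    $images = [];

    // Step 1: grab the raw attribute string of every <img ...> tag.
    preg_match_all('#<img\b([^>]*)>#i', $html, $tags);

    foreach ($tags[1] as $attributes) {
        $image = [];
        // Step 2: look up each attribute separately.
        // (?<![\w-]) keeps "src" from matching inside names like "data-src";
        // (["\']) captures the opening quote and \1 requires the same quote to close.
        foreach (['src', 'alt'] as $name) {
            $pattern = '#(?<![\w-])' . $name . '\s*=\s*(["\'])(.*?)\1#is';
            $found = preg_match($pattern, $attributes, $m);
            $image[$name] = $found ? $m[2] : null; // null when missing or unquoted
        }
        $images[] = $image;
    }

    return $images;
}

// The first tag is the one from the question; the second is an invented
// example with src before alt and single-quoted values.
$html = '<img alt="Gato con carrito de compra de supermercado invisible" id="image203" '
      . 'src="http://www.lecnc.com/blog/wp-content/uploads/2007/07/carrito_de_compra_invisible.jpg" />'
      . '<img src=\'second.png\' alt=\'Second "quoted" image\'>';

print_r(extract_images($html));
```

Because each attribute is matched on its own, the id="image203" between alt= and src= no longer leaks into either capture, and swapping the attribute order does not change the result.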