Hi, I need some quick help. I have a variable that holds the HTML of a given page. I want to remove every http: from the src attributes of images, stylesheets and JavaScript files. The difficulty is that there are loads of ways people could have written the markup. Here are some variants that should all work: 1) src="http://......" 2) src='http://......' 3) SRC = "http://......" So it should be case-insensitive, allow whitespace between src and = and also between = and the opening quote, and finally allow both " and '. Thanks for any help you can give.
If all you want to do is remove the http:// part and nothing else, all you have to do is $source = str_replace('http://', '', $source);
True, but the problem with that would be if someone has text that provides a link, e.g. <a href="http://diashfihfih.com">http://diashfihfih.com</a>. In this case I need to keep the http:// in the anchor text.
Thanks, I customised it a little and this is what I ended up with: $buffer = preg_replace('~src\s*=\s*["\']\s*http:(.*?)["\']~i', 'src="$1"', $buffer);
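A possible refinement, offered only as a sketch: the pattern above accepts a closing quote that differs from the opening one, rewrites every match to a lowercase src with double quotes, and would also fire inside attribute names such as data-src. The variant below uses a backreference so the closing quote must match the opening quote, and it puts the original attribute spelling back untouched. The function name strip_http_from_src and the example.com URLs are made up for illustration and do not come from the thread.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread).
// Removes only the "http:" scheme from src attribute values.
function strip_http_from_src(string $html): string
{
    // (?<![\w-])    do not match inside names such as data-src
    // (src\s*=\s*)  group 1: the attribute name and "=" exactly as written
    // (["\'])       group 2: the opening quote
    // \s*http:      optional whitespace plus the scheme, both dropped
    // (.*?)\2       group 3: the rest of the URL, up to the SAME quote
    $pattern = '~(?<![\w-])(src\s*=\s*)(["\'])\s*http:(.*?)\2~i';

    // preg_replace() returns null on error; fall back to the input then.
    return preg_replace($pattern, '$1$2$3$2', $html) ?? $html;
}

$buffer = '<img SRC = \'http://example.com/a.png\'>'
        . '<a href="http://example.com">http://example.com</a>';
$buffer = strip_http_from_src($buffer);
```

After the call, the image source reads //example.com/a.png with its original spelling and quotes, while the href and the anchor text keep their http://.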
I'm not entirely sure why you need to keep the http:// in the anchor text. If they click on it, it will lead to the page. If they copy the link, it'll work perfectly without the http (unless they copy it into, say, an FB chat). And as long as the link is visually different from the rest of the text, it will be obvious that it is a link. But, of course, I do admit that the preg_replace version works perfectly as well.
Simply because I am looking for a solution that does not alter the screen display in any way, only the src attributes.
I'd suggest expanding that: once you've pulled the scheme, check whether the URI points to the local domain of the page, and if it does, rip that out too. Since you're post-processing anyways, you might as well gut that pointless bloat as well.
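A rough sketch of that suggestion, under some assumptions: the page's own host name is passed in as $localHost (a hypothetical parameter), the function name is invented, and example.com stands in for the real domain. It handles src values that still carry http: as well as ones already reduced to //host/..., and it only touches URLs where a / follows the host directly, so a look-alike such as example.com.other.org is left alone. Root-relative paths resolve differently if the page uses a base element, so treat this as a starting point rather than a drop-in.

```php
<?php
// Hypothetical helper; the name and the $localHost parameter are assumptions.
// Turns src="http://HOST/path" or src="//HOST/path" into src="/path".
function strip_local_host_from_src(string $html, string $localHost): string
{
    // Escape the host so dots and hyphens are matched literally.
    $host = preg_quote($localHost, '~');

    // Group 1: attribute name and "=" as written, group 2: opening quote,
    // group 3: the path starting at "/", \2: the same quote closes the value.
    $pattern = '~(?<![\w-])(src\s*=\s*)(["\'])\s*(?:http:)?//'
             . $host . '(/.*?)\2~i';

    // preg_replace() returns null on error; fall back to the input then.
    return preg_replace($pattern, '$1$2$3$2', $html) ?? $html;
}

$buffer = '<img src="http://example.com/img/a.png">'
        . '<img src="http://cdn.example.net/b.png">';
$buffer = strip_local_host_from_src($buffer, 'example.com');
```

Here only the first image is rewritten, to /img/a.png; the second points at a different host and stays as it was.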