Hello. I am using this pattern to match the anchor text of links in a string:
$pattern = '#<a href\=\'([^\']*)\'>(.*?)</a>#';
However, if a link contains a single quote in its anchor text, it is not matched:
<a href='http://google.com'>Google's Website</a> <!--Note the ' after the word Google-->
Note that I cannot modify the input string; it is acquired from a remote page (e.g. I can't change href=' ' to href=" "), so I have to make this regular expression work. Any ideas?
Have you tried taking the ^\' out and writing it as just ^' instead? Or you could replace the quotes in the loaded content, like $content = str_replace("'", '"', $content); Just a thought.
Replacing is not an option because it will replace both the single quotes around the href attribute and the ones in the anchor text itself.
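To illustrate the objection above, here is a minimal sketch of what the blanket replacement does to the sample link from the first post. The variable names are chosen only for this example.

```php
<?php
// Sample input from the first post: single-quoted href,
// plus an apostrophe inside the anchor text.
$content = "<a href='http://google.com'>Google's Website</a>";

// Blanket replacement: every single quote becomes a double quote,
// including the apostrophe in "Google's".
$replaced = str_replace("'", '"', $content);

echo $replaced, PHP_EOL;
// <a href="http://google.com">Google"s Website</a>
```

The href ends up quoted the way the suggestion intended, but the anchor text is altered as well, which is the side effect described above.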
Again, I don't control the input string. It's a block of HTML code which I scrape from a remote page. It is formatted this way: single quotes around the href attribute and (occasionally) single quotes in the anchor text. I need a way to find only the anchors.
It seems the solution was very simple: adding regular expression modifiers to the end of the pattern.
$pattern = '#<a href\=\'([^\']*)\'>(.*?)</a>#im'; // "i" makes the match case-insensitive; "m" is multiline mode, which makes ^ and $ match at line boundaries
This regular expression cheat sheet was a huge help: http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/
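For anyone landing here later, a rough sketch of how the modifiers behave, assuming the scraped page mixes lower-case and upper-case tags (the second link and the variable names are made up for illustration). Since the pattern contains no ^ or $, the "m" modifier should not change the result; collecting multiple occurrences is the job of preg_match_all().

```php
<?php
// Hypothetical scraped input: the link from the first post plus a
// second, invented link written with upper-case tags.
$html = "<a href='http://google.com'>Google's Website</a>\n"
      . "<A HREF='http://example.com'>Example</A>";

// The pattern from the thread, without modifiers and with "i".
$plain = '#<a href\=\'([^\']*)\'>(.*?)</a>#';
$caseInsensitive = '#<a href\=\'([^\']*)\'>(.*?)</a>#i';

// preg_match_all() collects every occurrence. With PREG_SET_ORDER each
// entry holds [0] the full match, [1] the href, [2] the anchor text.
preg_match_all($plain, $html, $plainMatches, PREG_SET_ORDER);
preg_match_all($caseInsensitive, $html, $allMatches, PREG_SET_ORDER);

echo count($plainMatches), PHP_EOL; // 1 (only the lower-case link)
echo count($allMatches), PHP_EOL;   // 2 (both links)

foreach ($allMatches as $match) {
    echo $match[1], ' => ', $match[2], PHP_EOL;
}
```

In this sketch the unmodified pattern already captures "Google's Website", apostrophe included, so the "i" modifier looks like the part that makes a difference when tags differ in case. If an anchor spans several lines, the "s" modifier (dot matches newline) may be worth trying, but that is an assumption about the remote page rather than something shown in this thread.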