Any tips on why this displays only one link instead of all of them? $content = file_get_contents("http://www.Scorpiono.com"); preg_match("/<a href=(.*?)>(.*)<\/a>/",$content,$matches); echo $matches[0];
The regex seems busted, I need to make it stop after </a> - can anyone see any problem? PS: Thanks nfd2005, worked.
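For anyone landing here later: the fix that worked is not quoted in the thread, so the following is only a minimal sketch of the usual approach, not necessarily what nfd2005 suggested. It assumes the two problems are that preg_match() returns only the first match and that the greedy (.*) runs past the first </a>. The $content string is a made-up stand-in for the downloaded page so the example runs without a network call.

```php
<?php
// Hypothetical stand-in for file_get_contents("http://www.Scorpiono.com").
$content = '<p><a href="http://example.com/a">First</a> and '
         . '<a href="http://example.com/b">Second</a></p>';

// preg_match() stops after the first match; preg_match_all() collects every match.
// (.*?) is lazy, so each match ends at the nearest </a> rather than the last one.
$count = preg_match_all('/<a href=(.*?)>(.*?)<\/a>/i', $content, $matches);

// $matches[0] holds the complete <a ...>...</a> tags, one entry per link.
foreach ($matches[0] as $link) {
    echo $link, "\n";
}
```

Regexes are fragile against real-world HTML (attributes in a different order, single quotes, line breaks inside a tag), so treat this as a quick scraper rather than a robust parser.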
I'm trying to get every complete HTML <a href="etc">etc</a> tag that has a specific "etc" in the href. I thought I could do this using 2 arrays, but I'm stuck here, and the code is probably bad anyway. Got a solution of your own? Thank you, green repped for the previous help!
Eep, I'm not at my desktop PC (where the local webserver is right now to test the code), but it would be something like:

<?php
$content = file_get_contents("http://manele.radioinferno.org");
// Lazy .*? so each match stops at the nearest </a>
preg_match_all("/<a href=.*?>.*?<\/a>/", $content, $matches);
$size = sizeof($matches[0]);
echo $size;
for ($i = 0; $i < $size; $i++) {
    // Keep only the links whose tag mentions netdrive
    if (preg_match("/<a href=\"(.*)netdrive(.*)<\/a>/", $matches[0][$i], $ceva)) {
        echo $ceva[0];
    }
}

Okay, the issue is the (). Parentheses are capturing groups (not alternation): they "capture" (I don't know if this is the right word in programming terms, but you get the idea) the part of the match produced by the rules you put inside them. So you get the href="" part separated from the >TEXT HERE</a> part. If I'm correct, the URLs will appear in one array, the text in another, and the full matches in another. Cheers, sorry if this sounds confusing, I'll give it a shot when I get back to my PC.
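To make the point about () concrete, here is a small sketch of how preg_match_all() lays out its result when the pattern has two capturing groups. The HTML string and the pattern with ([^"]*) are assumptions made for illustration, not code from the posts above.

```php
<?php
// Hypothetical sample markup; no network call needed.
$html = '<a href="http://netdrive.ws/1.mp3">Song one</a> '
      . '<a href="http://example.com/">Other</a>';

// Group 1 captures the URL between the quotes, group 2 the link text.
preg_match_all('/<a href="([^"]*)">(.*?)<\/a>/i', $html, $matches);

// With the default ordering (PREG_PATTERN_ORDER):
//   $matches[0] = every full match (the complete tags)
//   $matches[1] = what group 1 captured in each match (the URLs)
//   $matches[2] = what group 2 captured in each match (the link text)
print_r($matches);
```

So the same index $i in $matches[0], $matches[1] and $matches[2] always refers to the same link.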
Yes, in theory that's what I was looking for, but the code is messy. I have to grab all the URLs that contain www.something.com from www.websitetograbfrom.com.
Parse error: syntax error, unexpected T_BOOLEAN_AND in D:\public_html\manele.me\crawl.php on line 9 if (preg_match("/netdrive.ws|dump.ro/i",$matches[2][$i]) > 0) && (!preg_match("/sex/i",$matches[0][$i])) { ------- What am I missing?
The if condition is closed too early by the parenthesis after > 0. Remove that one along with the opening parenthesis right in front of the second preg_match(). EDIT: Also, you don't need to check whether the returned value is greater than 0. PHP treats 0 as false and 1 as true. You also need to escape the dots in the domain names with a backslash. Otherwise they mean "any character".
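A quick sketch of why the dots need escaping. The lookalike hostname is invented purely to show the difference between the two patterns.

```php
<?php
$unescaped = '/netdrive.ws/i';   // "." matches any single character
$escaped   = '/netdrive\.ws/i';  // "\." matches a literal dot only

// Hypothetical URL with an "X" where the dot should be.
$lookalike = 'http://netdriveXws.example/file';

$a = preg_match($unescaped, $lookalike);               // matches: "." accepts the "X"
$b = preg_match($escaped, $lookalike);                 // no match: a real dot is required
$c = preg_match($escaped, 'http://netdrive.ws/file');  // matches

var_dump($a, $b, $c);

// preg_match() returns 1 or 0 here, so it can go straight into a condition.
if (preg_match($escaped, 'http://netdrive.ws/file')) {
    echo "matched\n";
}
```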
Check the parentheses. You have to close every parenthesis you open. Try to figure out which parenthesis belongs to which and close them all.
Try: if (preg_match("/netdrive\.ws|dump\.ro/i",$matches[2][$i]) && !preg_match("/sex/i",$matches[0][$i])) {
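Putting that condition into a complete loop might look like the sketch below. The sample links are invented, and the URL sits in $matches[1] here because this illustrative pattern has only two groups (in the original script it was $matches[2]).

```php
<?php
// Hypothetical page content standing in for the downloaded HTML.
$content = '<a href="http://netdrive.ws/a.mp3">Track A</a>'
         . '<a href="http://dump.ro/sex.mp3">Skip me</a>'
         . '<a href="http://example.com/b.mp3">Elsewhere</a>'
         . '<a href="http://dump.ro/c.mp3">Track C</a>';

preg_match_all('/<a href="([^"]*)">(.*?)<\/a>/i', $content, $matches);

$kept = [];
foreach ($matches[0] as $i => $tag) {
    // Keep the link if its URL is on one of the wanted hosts
    // and the whole tag does not contain the unwanted word.
    if (preg_match('/netdrive\.ws|dump\.ro/i', $matches[1][$i])
        && !preg_match('/sex/i', $tag)) {
        $kept[] = $tag;
    }
}

print_r($kept);
```

Using foreach over $matches[0] avoids the off-by-one risk of a hand-written counter.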
$matches[0][$i] = preg_replace("/\n+/s", "", $matches[0][$i]); I'm trying to remove all the blank spaces, but this regex doesn't seem to work. Any tips, please?
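One likely cause, offered as a guess since the input isn't shown: \n matches only newline characters, so spaces, tabs and carriage returns are left alone. The \s class covers all of them. The sketch below uses a made-up tag and shows both collapsing whitespace to a single space and stripping it entirely. Note that stripping everything also removes the space inside "<a href".

```php
<?php
// Hypothetical matched tag containing mixed whitespace.
$tag = "<a href=\"http://netdrive.ws/a.mp3\">\r\n\t  Track   A \n</a>";

// The original pattern: removes newlines only, tabs and spaces survive.
$newlinesOnly = preg_replace('/\n+/', '', $tag);

// Collapse every run of whitespace into one space (keeps the tag valid).
$collapsed = trim(preg_replace('/\s+/', ' ', $tag));

// Remove every whitespace character, including the one after "<a".
$stripped = preg_replace('/\s+/', '', $tag);

echo $collapsed, "\n", $stripped, "\n";
```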