file_get_contents and preg_match

Scorpiono Well-Known Member

#1

Any tips on why this displays only one link instead of all of them?

PHP:
$content = file_get_contents("http://www.Scorpiono.com");

preg_match("/<a href=(.*?)>(.*)<\/a>/",$content,$matches);
echo $matches[0];

Scorpiono, Jul 25, 2008 IP

nfd2005 Well-Known Member

#2

Use preg_match_all() instead.
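
A minimal sketch of the difference, using a hypothetical inline string in place of the downloaded page (the sample markup and variable names are assumptions, not from the original site):

```php
<?php
// Hypothetical sample markup standing in for the downloaded page.
$content = '<p><a href="/one">One</a> and <a href="/two">Two</a></p>';

// preg_match() stops after the first match and returns 1 or 0.
$found = preg_match('/<a href=(.*?)>(.*?)<\/a>/', $content, $first);

// preg_match_all() keeps scanning and returns the number of full matches.
$count = preg_match_all('/<a href=(.*?)>(.*?)<\/a>/', $content, $all);

echo $first[0], "\n";               // only the first link
echo implode("\n", $all[0]), "\n";  // every link, one per line
```

With preg_match_all(), the full matches are collected in $all[0], so looping over that array prints every link.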

nfd2005, Jul 25, 2008 IP

Scorpiono Well-Known Member

#3

The regex seems busted: I need the match to stop after </a>. Can anyone see the problem?

PS: Thanks nfd2005, worked.
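
For reference, a likely cause is the greedy (.*) before <\/a>: it runs to the last </a> on the line rather than the first. A small sketch under that assumption, with a hypothetical one-line sample:

```php
<?php
// Hypothetical one-line sample with two links.
$html = '<a href="/a">A</a> | <a href="/b">B</a>';

// Greedy (.*) runs to the LAST </a> on the line, so both links merge into one match.
preg_match_all('/<a href=(.*?)>(.*)<\/a>/', $html, $greedy);

// Lazy (.*?) stops at the FIRST </a>, giving one match per link.
preg_match_all('/<a href=(.*?)>(.*?)<\/a>/', $html, $lazy);

echo count($greedy[0]), "\n";
echo count($lazy[0]), "\n";
```

Because "." does not match a newline by default, the greedy version only merges links that share a line, which can make the problem look intermittent.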

Scorpiono, Jul 25, 2008 IP

nfd2005 Well-Known Member

#4

Are you trying to get the anchor text and the URL, or just the URL?

nfd2005, Jul 25, 2008 IP

Scorpiono Well-Known Member

#5

I'm trying to get every complete HTML <a href="etc">etc</a> tag that has a specific "etc" in the href.

I thought I could do this using two arrays, but I'm stuck here. It's probably bad code anyway.

Do you have a solution of your own? Thank you, green rep given for the previous help!

<?php

$content = file_get_contents("http://manele.radioinferno.org");

preg_match_all("/(<a href=(.*?>))(.*)<\/a>/",$content,$matches);
preg_match_all("/(<a href=\"(.*)netdrive(.*)<\/a>)/",$matches[0][$i],$ceva);
echo $ceva[0][0];
$size = sizeof($matches[0]);
echo $size;
//for ($i=1;$i<=$size;$i++) {
// preg_match_all("/(<a href=\"(.*)netdrive(.*)<\/a>)/",$matches[0][$i],$ceva);
// echo $ceva[0][$i];
//}
/*

Scorpiono, Jul 25, 2008 IP

Mozzart Peon

#6

Eep, I'm not at my desktop PC (where the local web server is) to test the code right now, but it would be something like:

PHP:
<?php

$content = file_get_contents("http://manele.radioinferno.org");

preg_match_all("/<a href=.*?>.*<\/a>/",$content,$matches);
preg_match_all("/(<a href=\"(.*)netdrive(.*)<\/a>)/",$matches[0][$i],$ceva);
echo $ceva[0][0];
$size = sizeof($matches[0]);
echo $size;
//for ($i=1;$i<=$size;$i++) {
// preg_match_all("/(<a href=\"(.*)netdrive(.*)<\/a>)/",$matches[0][$i],$ceva);
// echo $ceva[0][$i];
//}
/*
Okay, the issue is that parentheses () are capturing groups: they "encapsulate" (I don't know if this is the right word in programming terms, but you get the idea) the text matched by the rules you put inside them.

With the groups in place, you get the href="" part and the >TEXT HERE</a> part separately. If I'm correct, the URLs will appear in one array, the text in another, and the full match in another.

Cheers, and sorry if this sounds confusing. I'll give it a shot when I get back to my PC.
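
A short sketch of how the groups land in separate arrays, assuming preg_match_all() with its default ordering and a hypothetical sample string:

```php
<?php
// Hypothetical sample markup with two links.
$html = '<a href="/a">First</a> <a href="/b">Second</a>';

preg_match_all('/<a href="(.*?)">(.*?)<\/a>/', $html, $m);

// $m[0] holds the full matches,
// $m[1] holds the first group (the URLs),
// $m[2] holds the second group (the anchor text).
print_r($m);
```

Each pair of parentheses adds one more array, numbered in the order the opening parentheses appear in the pattern.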

Mozzart, Jul 25, 2008 IP

Scorpiono Well-Known Member

#7

Yes, in theory that's what I was looking for, but the code is messy.

I have to grab all the URLs that contain www.something.com from www.websitetograbfrom.com.
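
One possible way to do this, sketched with a hypothetical inline string instead of a live download (the sample links and variable names are assumptions): capture every link once, then keep the ones whose URL contains the wanted domain.

```php
<?php
// Hypothetical markup; in the thread this would come from file_get_contents().
$content = '<a href="http://www.something.com/x">X</a>'
         . '<a href="http://other.example/y">Y</a>'
         . '<a href="http://www.something.com/z">Z</a>';

$needle = 'www.something.com'; // placeholder domain from this post

preg_match_all('/<a href="(.*?)">(.*?)<\/a>/i', $content, $matches);

$wanted = [];
foreach ($matches[1] as $i => $url) {
    // stripos() is a plain, case-insensitive substring test, so the dots need no escaping.
    if (stripos($url, $needle) !== false) {
        $wanted[] = $matches[0][$i]; // keep the complete <a ...>...</a> tag
    }
}

echo implode("\n", $wanted), "\n";
```

This avoids running a second regex over each match, and the index $i ties each captured URL back to its full tag.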

Scorpiono, Jul 25, 2008 IP

Scorpiono Well-Known Member

#8

Fixed, re-coded, works! Thank you

Scorpiono, Jul 25, 2008 IP

Scorpiono Well-Known Member

#9

Parse error: syntax error, unexpected T_BOOLEAN_AND in D:\public_html\manele.me\crawl.php on line 9

if (preg_match("/netdrive.ws|dump.ro/i",$matches[2][$i]) > 0) && (!preg_match("/sex/i",$matches[0][$i])) {

-------
What am I missing?

Scorpiono, Jul 25, 2008 IP

nico_swd Prominent Member

#10

Remove the parenthesis right in front of the second preg_match().

EDIT:

Also, you don't need to check whether the returned value is greater than 0: PHP treats 0 as false and 1 as true. You also need to escape the dots in the domain names with a backslash; otherwise they mean "any character".
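
A small sketch of both points, using hypothetical URLs (the sample strings and variable names are assumptions for illustration):

```php
<?php
// Hypothetical URLs for illustration.
$good  = 'http://netdrive.ws/file.mp3';
$bogus = 'http://netdriveXws.example/file.mp3';

// Unescaped dot: "." matches any character, so the bogus host also matches.
$loose = preg_match('/netdrive.ws|dump.ro/i', $bogus);

// Escaped dots match only a literal ".".
$strict = preg_match('/netdrive\.ws|dump\.ro/i', $bogus);

// preg_match() returns 1 or 0 (or false on error), so it works directly as a condition.
// Note the single pair of parentheses wrapping the whole condition.
$keep = preg_match('/netdrive\.ws|dump\.ro/i', $good) && !preg_match('/sex/i', $good);

var_dump($loose, $strict, $keep);
```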

nico_swd, Jul 25, 2008 IP

Scorpiono Well-Known Member

#11

Oh, yeah true nico, ty

Scorpiono, Jul 26, 2008 IP

Scorpiono Well-Known Member

#12

Parse error: syntax error, unexpected T_BOOLEAN_AND in D:\public_html\manele.me\crawl.php on line 9

if (preg_match("/netdrive\.ws|dump\.ro/i",$matches[2][$i]) > 0) && !preg_match("/sex/i",$matches[0][$i]) {

Tip please?

Scorpiono, Jul 26, 2008 IP

nico_swd Prominent Member

#13

Check the parentheses.

You have to close every parenthesis you open. Try to figure out which closing parenthesis belongs to which opening one, and make sure they are all closed.

nico_swd, Jul 26, 2008 IP

Scorpiono Well-Known Member

#14

Honestly, I can't find them, and I'm feeling really dumb right now. Could you kindly bold it?

Scorpiono, Jul 26, 2008 IP

Scorpiono Well-Known Member

#15

Fixed, ( ) at the front and back of course.

Ty again

Scorpiono, Jul 26, 2008 IP

Mozzart Peon

#16

Try:

PHP:
if (preg_match("/netdrive\.ws|dump\.ro/i",$matches[2][$i]) && !preg_match("/sex/i",$matches[0][$i])) {

Mozzart, Jul 26, 2008 IP

Scorpiono Well-Known Member

#17

$matches[0][$i] = preg_replace("/\n+/s", "", $matches[0][$i]);

I'm trying to remove all the blank spaces, but this regex doesn't seem to work. Any tips, please?

Scorpiono, Jul 26, 2008 IP
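
A possible explanation, offered as a sketch rather than a confirmed fix: "/\n+/s" only matches line feeds, and the s modifier only changes what "." matches, so spaces, tabs, and carriage returns are left alone. The example below uses a hypothetical matched tag to compare that pattern with "\s+":

```php
<?php
// Hypothetical matched tag containing Windows line endings, a tab, and extra spaces.
$tag = "<a href=\"/a\">\r\n\t  Some   text\r\n</a>";

// "/\n+/" only strips line feeds; carriage returns, tabs, and spaces remain.
$onlyNewlines = preg_replace('/\n+/', '', $tag);

// "\s+" covers spaces, tabs, \r and \n. Collapsing each run to one space keeps words apart.
$collapsed = trim(preg_replace('/\s+/', ' ', $tag));

// Replacing with '' removes every whitespace character, including the space inside "<a href".
$stripped = preg_replace('/\s+/', '', $tag);

var_dump($onlyNewlines, $collapsed, $stripped);
```

Collapsing to a single space is usually the safer choice for HTML, since removing all whitespace also glues the tag name to its attributes.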
