extracting links from an array

scriptreseller Active Member

#1

Hi

I'm pulling my hair out over this. I've spent 4 days on something I thought would take a few hours, lol.

Right, I have a text file that contains the body of an email. The email has links in it, and I need to get those links out.

So far I have managed to put the email into an array using
$url= explode('http://',$content);

split also works

$url= split('http://',$content);
Code (markup):
Then I output it to the screen like this
$num = 0;
foreach($url as $ScreenURL)
{
    $num++;
    echo "<b>($num)</b> '", ($ScreenURL), "'\n<br /><br />\n\n";
}
Code (markup):
which does give me the URLs, but it also gives me a lot of other stuff too. I need to know how to get just the URLs.

I am just learning PHP and this is killing me.

If anyone can help me I would be very grateful, and so will my hair, lol.

Thanks
Daz

scriptreseller, Jan 5, 2009 IP

cont911 Peon

#2

Use something like this:
<?php
$content = "lksdjf lskdjfsdf sdf http://www.php.net/ klsjdf skldfj http://abc.com skdlfjs";
preg_match_all("/(http:\/\/[^ ]+)/i", $content, $matches);

var_dump($matches);

?>
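
A minimal sketch of reading the result instead of dumping it. With the default flags, preg_match_all() puts every full match in $matches[0]. The sample string is the one from this post; swapping the literal space for \s is an assumption, made so a URL followed by a tab or newline also ends cleanly.

```php
<?php
// Sample text from the post above.
$content = "lksdjf lskdjfsdf sdf http://www.php.net/ klsjdf skldfj http://abc.com skdlfjs";

// [^\s]+ stops at any whitespace (space, tab, newline), not only a space.
$count = preg_match_all('/http:\/\/[^\s]+/i', $content, $matches);

// $matches[0] holds every full match, in the order found.
foreach ($matches[0] as $url) {
    echo $url, PHP_EOL;
}
```

The return value ($count here) is the number of full matches, which is handy for checking whether anything was found at all.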

cont911, Jan 5, 2009 IP

scriptreseller Active Member

#3

Thanks, that really helped me out. I have changed it a bit, as it gave me a lump of data which I did not know what to do with, lol.

Here is how it looks now:

preg_match_all("/<a[\s]+[^>]*href\s*=\s*[\"\']?([^\'\" >]+)[\'\" >]/i", $content, $matches);
 


echo '<pre>'; // This is for correct handling of newlines
ob_start();
var_dump($matches);
$a=ob_get_contents();
ob_end_clean();
echo htmlspecialchars($a); // Escape every HTML special chars (especially > and < )
echo '</pre>';

Code (markup):

which gives me

array(2) {
  [0]=>
  array(4) {
    [0]=>
    string(40) "<a href="mailto:XXXXXXX@googlemail.com""
    [1]=>
    string(36) "<a href="mailto:XXXXXXX@XXXXXXX.co.uk""
    [2]=>
    string(35) "<a href="http://www.XXXXXXX.com""
    [3]=>
    string(31) "<a href="http://www.XXXXXXX.eu""
  }
  [1]=>
  array(4) {
    [0]=>
    string(30) "mailto:XXXXXXX@googlemail.com"
    [1]=>
    string(26) "mailto:XXXXXXX@XXXXX.co.uk"
    [2]=>
    string(25) "http://www.XXXXXXX.com"
    [3]=>
    string(21) "http://www.XXXXXXX.eu"
  }

Code (markup):

How do I get just the URLs out of that now, so I can put them in a database or a text file?

Thanks

scriptreseller, Jan 5, 2009 IP
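
One possible way to keep only the web links from a result shaped like the dump in post #3: the href values sit in $matches[1], so that sub-array can be filtered by its prefix. This is a sketch, not a definitive approach. The sample HTML and its addresses are made up, and using preg_grep() for the filter is an assumption, not something suggested in the thread.

```php
<?php
// Hypothetical sample standing in for the email body; the addresses are made up.
$content = '<a href="mailto:someone@example.com">Mail</a> '
         . '<a href="http://www.example.com">Site</a> '
         . '<a href="http://www.example.eu">EU site</a>';

// Same pattern as in the post: group 1 captures the href value.
preg_match_all("/<a[\s]+[^>]*href\s*=\s*[\"\']?([^\'\" >]+)[\'\" >]/i", $content, $matches);

// Keep only captures that start with http:// or https://, dropping mailto: links.
// preg_grep() preserves the original keys, so array_values() renumbers them from 0.
$urls = array_values(preg_grep('/^https?:\/\//i', $matches[1]));

print_r($urls);
```

Because $urls is a plain array of strings, no second regex over var_dump() output is needed.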

scriptreseller Active Member

#4

OK, this is probably not the best way to do it, and there is probably a better way (feel free to let me know if there is), but I have got it so I only have the URLs. I did it by duplicating the code and adding a new regex:


preg_match_all("/<a[\s]+[^>]*href\s*=\s*[\"\']?([^\'\" >]+)[\'\" >]/i", $html, $matches);
 

echo '<pre>'; // This is for correct handling of newlines
ob_start();
var_dump($matches);
$a=ob_get_contents();
ob_end_clean();
preg_match_all("/(http:\/\/[^ ]+)/i", $a, $matches1);
echo htmlspecialchars($a); // Escape every HTML special chars (especially > and < )
echo '</pre>';



echo '<pre>'; // This is for correct handling of newlines
ob_start();
var_dump($matches1);
$a1=ob_get_contents();
ob_end_clean();
echo htmlspecialchars($a1);// Escape every HTML special chars (especially > and < )
echo '</pre>';

Code (markup):

which now gives me

array(2) {
  [0]=>
  array(4) {
    [0]=>
    string(28) "http://www.url1.com""
"
    [1]=>
    string(24) "http://www.url2.eu""
"
    [2]=>
    string(27) "http://www.url1.com"
"
    [3]=>
    string(23) "http://www.url2.eu"
"
  }
  [1]=>
  array(4) {
    [0]=>
    string(28) "http://www.url1.com""
"
    [1]=>
    string(24) "http://www.url2.eu""
"
    [2]=>
    string(27) "http://www.url1.com"
"
    [3]=>
    string(23) "http://www.url2.eu"
"
  }
}

Code (markup):

How do I get the URL to put it in a database without the string(23) stuff, just the URL? lol

Thanks

scriptreseller, Jan 5, 2009 IP

Danltn Well-Known Member

#5

Just loop through the array instead of var_dump'ing it...
for($i = 0, $c = count($matches[1]); $i < $c; ++$i)
    echo $matches[1][$i] . PHP_EOL;
PHP:
Could use while/foreach, whatever you prefer.
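
For reference, a foreach version might look like the sketch below. The $matches array here is a hypothetical stand-in for what preg_match_all() returned earlier in the thread, and $lines is just an illustrative name for collecting the values.

```php
<?php
// Hypothetical stand-in for the $matches array produced by preg_match_all().
$matches = [
    ['<a href="http://www.example.com"', '<a href="http://www.example.eu"'],
    ['http://www.example.com', 'http://www.example.eu'],
];

// foreach walks the captured URLs directly; no counter or index is needed.
$lines = [];
foreach ($matches[1] as $url) {
    $lines[] = $url;        // collect the URL for later use
    echo $url . PHP_EOL;    // and print it, one per line
}
```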

Danltn, Jan 5, 2009 IP

scriptreseller Active Member

#6

Thanks, that works well. The only problem now is that when I try to write it to a text file, it only writes the last URL.

How can I get it to write them all?

Sorry about this, it's just all new to me, lol.

Thanks

scriptreseller, Jan 5, 2009 IP

Danltn Well-Known Member

#7

file_put_contents('file.txt', implode(PHP_EOL, $matches[1]));

PHP:
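
The code that wrote only the last URL is not shown in the thread, so this is a guess at the cause: a write inside the loop that replaces the file on every pass. If writing inside a loop is preferred over implode(), the FILE_APPEND flag is one option. A rough sketch, where $urls is a hypothetical stand-in for $matches[1] and a temporary file is used instead of file.txt:

```php
<?php
// Hypothetical URL list standing in for $matches[1].
$urls = ['http://www.example.com', 'http://www.example.eu'];

// A temporary file is used here so the sketch does not touch real data.
$file = tempnam(sys_get_temp_dir(), 'urls');

// Without flags, file_put_contents() rewrites the file on every call, so a
// loop would leave only the last URL. FILE_APPEND adds to the end instead.
foreach ($urls as $url) {
    file_put_contents($file, $url . PHP_EOL, FILE_APPEND);
}

// Read the file back as an array of lines to confirm every URL was written.
$saved = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
print_r($saved);

unlink($file); // clean up the temporary file
```

The single implode() call above does the same job in one write, which is usually simpler when all the URLs are already in an array.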

Danltn, Jan 5, 2009 IP
