Extract links from a text file using PHP

linkinpark2014 Peon

#1

Guys, I just want to know how to read the data in a text file and save it into an array with some filtering.

Here is an example:
I have a text file that contains links in this form:
<a href="http://blablabal.com">blah</a>
<a href="http://blablabal1.com">blah1</a>
<a href="http://blablabal2.com">blah2</a>
<a href="http://blablabal3.com">blah3</a>
I want to read only the links, without the surrounding "<a href=>blah</a>" markup,
and save them in an array. Any ideas?

linkinpark2014, Jul 15, 2008 IP

Danltn Well-Known Member

#2

A regex along the lines of...

/<a href="(?<url>.+?)">(?<name>.+?)<\/a>/

Think that's about right, definitely the general gist for you.
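For illustration, here is a minimal sketch of how a pattern like this could be used with preg_match_all. The $text variable and its sample contents are assumptions taken from the first post, and the quantifiers are lazy (.+?) so each match stays inside a single tag.

```php
<?php
// Sample input copied from the first post; $text is a hypothetical variable name.
$text = <<<'HTML'
<a href="http://blablabal.com">blah</a>
<a href="http://blablabal1.com">blah1</a>
HTML;

// Lazy quantifiers (.+?) stop at the first closing quote and the first </a>.
$pattern = '/<a href="(?<url>.+?)">(?<name>.+?)<\/a>/';
preg_match_all($pattern, $text, $matches);

// Named groups are available by name as well as by numeric index.
$urls = $matches['url'];
print_r($urls);
```

With the default flags, $matches['url'] is a flat array holding one URL per link found.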

Dan

Danltn, Jul 15, 2008 IP

myhart Peon

#3

You might check out this post which I made for a similar request. Hope it's close to what you are looking for!

http://forums.digitalpoint.com/showthread.php?p=8072708#post8072708

myhart, Jul 15, 2008 IP

linkinpark2014 Peon

#4

Hi guys, both approaches give me the same results.
I only want the links, without the <a href></a> tags:
<a href="http://blablabal.com">blah</a>
I want to get rid of the <a href=" and ">blah</a> parts and get only the link:
"http://blablabal.com"
I have tried every way I could think of and still haven't got a good result.

I tried to extract the links this way:
if(preg_match_all('/<a\s+.*?href=[\"\']?([^\"\' >]*)[\"\']?[^>]*>(Play)<\/a>/i',$result,$out, PREG_SET_ORDER))
I get only 3 array elements, which are:
1-<a href="http://blablabal.com">blah</a>
2-http://blablabal.com
3-blah
Now I'm getting
http://blablabal.com
which is in element no. 2, and it looks good so far.
The problem is that it only gives me the result for 1 link.
I want to get results for all the links inside that text file. Any ideas?
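One possible explanation for seeing only one link is the PREG_SET_ORDER flag: with it, each element of the result array is one complete match (full tag, URL, link text), so looking at the first element alone shows a single link. The sketch below compares the two orderings. The $text sample and the variable names are assumptions, and the link text group is loosened from (Play) to (.*?) so that it matches the sample data from the first post.

```php
<?php
// Hypothetical sample input, two links as in the first post.
$text = '<a href="http://blablabal.com">blah</a>' . "\n"
      . '<a href="http://blablabal1.com">blah1</a>';

// Same pattern as above, except (Play) is replaced by (.*?) for this sample.
$pattern = '/<a\s+.*?href=["\']?([^"\' >]*)["\']?[^>]*>(.*?)<\/a>/i';

// PREG_SET_ORDER: one sub-array per link, each holding [full match, url, text].
preg_match_all($pattern, $text, $sets, PREG_SET_ORDER);
$urlsFromSets = [];
foreach ($sets as $set) {
    $urlsFromSets[] = $set[1];
}

// Default PREG_PATTERN_ORDER: $groups[1] already holds every captured URL.
preg_match_all($pattern, $text, $groups);
$urlsFromGroups = $groups[1];

print_r($urlsFromSets);
print_r($urlsFromGroups);
```

Both arrays end up with the same URLs; the only difference is whether you loop over the matches yourself or read the whole capture group at once.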

linkinpark2014, Jul 16, 2008 IP

sastro Well-Known Member

#5

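<?php
// Read the whole file into a string.
$file = file_get_contents('http://localhost/tes/x.html');
// Capture everything between the quotes of each href attribute.
preg_match_all('/<a href="([^"]*)"/', $file, $a);

// $a[1] holds the captured URLs, one per match.
$count = count($a[1]);
echo "Number of Urls = " . $count . "\n";
for ($row = 0; $row < $count; $row++) {
    echo $a[1][$row] . "\n";
}

?>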

sastro, Jul 16, 2008 IP

linkinpark2014 Peon

#6

Okay, thanks all for your help.
I just changed the regex pattern and everything works great!

(preg_match_all('/<a\s+.*?href=[\"\']?([^\"\' >]*)[\"\']?[^>]*>(Play)<\/a>/i',$file,$a))

=================================================================

<?php
$file = file_get_contents('http://localhost/tes/x.html');
// Capture the href value of every link whose text is "Play" (case-insensitive).
preg_match_all('/<a\s+.*?href=[\"\']?([^\"\' >]*)[\"\']?[^>]*>(Play)<\/a>/i', $file, $a);

// $a[1] holds the captured URLs, one per match.
$count = count($a[1]);
echo "Number of Urls = " . $count . "\n";
for ($row = 0; $row < $count; $row++) {
    echo $a[1][$row] . "\n";
}

?>
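To show what this pattern accepts, here is a small sketch run against made-up input. The example.com URLs, the $sample variable and the extra attributes are all assumptions for illustration. The pattern tolerates double quotes, single quotes, or no quotes around the href value, and it only keeps links whose text is "Play" in any letter case.

```php
<?php
// Hypothetical input covering different quoting styles; the last link is not "Play".
$sample = <<<'HTML'
<a href="http://example.com/1">Play</a>
<a class='x' href='http://example.com/2'>play</a>
<a href=http://example.com/3 target=_blank>Play</a>
<a href="http://example.com/4">Stop</a>
HTML;

$pattern = '/<a\s+.*?href=["\']?([^"\' >]*)["\']?[^>]*>(Play)<\/a>/i';
preg_match_all($pattern, $sample, $found);

// $found[1] holds the URLs of the links whose text matched "Play".
$playUrls = $found[1];
print_r($playUrls);
```

If you need every link regardless of its text, the (Play) group would have to be replaced with something more general.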

linkinpark2014, Jul 16, 2008 IP
