Stripping Links with PHP?

Discussion in 'PHP' started by smashedpumpkins, Mar 27, 2010.

  1. #1
    I'm using a WordPress plugin to grab articles related to my website content. The plugin copies each article verbatim and posts it to my blog. However, with many articles it's copying links that I do not want. I'd like to strip all links but leave the rest of the formatting intact. I don't think it should be horribly hard to do, but I can't figure it out myself. If it's a simple line or two, can anyone help me out? I believe it should be added between lines 121 and 135. (These lines define the content that is displayed. The next 20 lines after this code are a second option that removes all formatting; however, I only want to remove links.) I really appreciate any help you can offer!

    
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div[@id='KonaBody']//p"); 
    
    		for ($i = 0;  $i < $paras->length; $i++ ) {  //$paras->length
    
    			$para = $paras->item($i);
    			$paragraph = $para->textContent;
    			
    			if ($paragraph != '') {
    					if (function_exists('ma_translate') && get_option('ma_trans_article') == 1) {$paragraph = ma_translate($paragraph);}
    			
    				$content .= $paragraph . ' ';
    				$content .= "<br/><br/>";
    			}
    		}
    
    PHP:
    For a better understanding, here's the entire file's code.
    
    <?php
    
    function ma_articlepost($keyword,$cat,$num,$which) {
       global $wpdb, $ma_dbtable;
       
    	// Debug
       	debug_log('- EZA');	 
    	$keyword2 = $keyword;	
    	$keyword = str_replace( " ","+",$keyword );	
    	$keyword = urlencode($keyword);
    	
    	  $blist[] = "Mozilla/5.0 (compatible; Konqueror/4.0; Microsoft Windows) KHTML/4.0.80 (like Gecko)";
          $blist[] = "Mozilla/5.0 (compatible; Konqueror/3.92; Microsoft Windows) KHTML/3.92.0 (like Gecko)";
          $blist[] = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; WOW64; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506; Media Center PC 5.0; .NET CLR 1.1.4322; Windows-Media-Player/10.00.00.3990; InfoPath.2";
          $blist[] = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; InfoPath.1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; Dealio Deskball 3.0)";
          $blist[] = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; NeosBrowser; .NET CLR 1.1.4322; .NET CLR 2.0.50727)";
          $ua = $blist[array_rand($blist)];	
    	
    	$source = get_option('ma_eza_source');
    	
       // SOOPERARTICLES	
       if($source == "sooperarticles") {   
       
    	$startat = $num;
    	
    	if ($startat == 0) {
    		$startpage = 1;
    		$sk = 1;
    	} else {
    		$xz = $startat / 15;
    		$startpage = ceil($xz);
    		$sk = $startat - ( $startpage -1 ) * 15;
    	}
    	$l = $startpage;
    	$sk = $sk -1;
     
    	$search_url = "http://www.sooperarticles.com/search/?t=titles&s=$keyword&p=$l";
    	// make the cURL request to $search_url
    	$ch = curl_init();
    	curl_setopt($ch, CURLOPT_USERAGENT, 'Firefox (WindowsXP) - Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6');
    	curl_setopt($ch, CURLOPT_URL,$search_url);
    	curl_setopt($ch, CURLOPT_FAILONERROR, true);
    	curl_setopt($ch, CURLOPT_AUTOREFERER, true);	
    	curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
    	curl_setopt($ch, CURLOPT_TIMEOUT, 45);
    	$html= curl_exec($ch);
    	if (!$html) {
    		echo "<br />cURL error number:" .curl_errno($ch);
    		echo "<br />cURL error:" . curl_error($ch);
    		exit;
    	}
    	curl_close($ch); 
    
    	// parse the html into a DOMDocument  
    
    		$dom = new DOMDocument();
    		@$dom->loadHTML($html);
    
    	// Grab Product Links  
    
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div/h3/a");
    
    		$para = $paras->item($sk);
    		if ($para === null) {
    			echo '<div class="updated"><p>No articles found!</p></div>';
    			return "nothing";
    		} else {		
    		$target_url = $para->getAttribute('href');
    
     	// make the cURL request to $search_url
    	$ch = curl_init();
    	curl_setopt($ch, CURLOPT_USERAGENT, $ua);
    	curl_setopt($ch, CURLOPT_URL,$target_url);
    	curl_setopt($ch, CURLOPT_FAILONERROR, true);
    	curl_setopt($ch, CURLOPT_AUTOREFERER, true);	
    	curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
    	curl_setopt($ch, CURLOPT_TIMEOUT, 45);
    	$html= curl_exec($ch);
    	if (!$html) {
    		echo "<br />cURL error number:" .curl_errno($ch);
    		echo "<br />cURL error:" . curl_error($ch);
    		exit;
    	}
    	curl_close($ch);
    	
    	// parse the html into a DOMDocument  
    
    		$dom = new DOMDocument();
    		@$dom->loadHTML($html);
    		
    
    	// Grab Article Title 
    
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div/h1");
    		
    		$para = $paras->item(0);
    		$title = $para->textContent;		
    		$title2 = $title;
    		
    		if (function_exists('ma_translate') && get_option('ma_trans_title') == 1 && get_option('ma_trans_article') == 1) {$title = ma_translate($title2);}			
    
    		
    	// Check X
    		
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div[@id='KonaBody']/div[@class='arightside']"); 
    		
    		$para = $paras->item(0);
    
    		if ($para !== null) {
    			return false;
    		}
    		
     	// Grab Article	
    	
    	if (get_option('ma_eza_grabmethod')=='old') {
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div[@id='KonaBody']//p"); 
    
    		for ($i = 0;  $i < $paras->length; $i++ ) {  //$paras->length
    
    			$para = $paras->item($i);
    			$paragraph = $para->textContent;
    			
    			if ($paragraph != '') {
    					if (function_exists('ma_translate') && get_option('ma_trans_article') == 1) {$paragraph = ma_translate($paragraph);}
    			
    				$content .= $paragraph . ' ';
    				$content .= "<br/><br/>";
    			}
    		}		
    	} elseif (get_option('ma_eza_grabmethod')=='new') {
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div[@id='KonaBody']"); 
    		$para = $paras->item(0);		
    		$string = $dom->saveXml($para);	
    		$tags = array('div','iframe','script');
    		$string = ma_strip_selected_tags($string, $tags);	
    		$string = str_replace("]]>", "", $string);
    		$string = str_replace("<![CDATA[", "", $string);
    		
    		if (function_exists('ma_translate') && get_option('ma_trans_article') == 1) {$string = ma_translate($string);}
    
    		$content .= $string . ' ';
    	}
    	
    	// Grab Ressource Box	
    
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div[@class='author-signature']");
    		$para = $paras->item(0);		
    		$ressourcetext = $dom->saveXml($para);	
    		if (function_exists('ma_translate') && get_option('ma_trans_articlebox') == 1) {$ressourcetext = ma_translate($ressourcetext);}
    		
    		if ($ressourcetext != '') {
    		$authorbox = "<div style=\"margin:5px;padding:5px;border:1px solid #c1c1c1;font-size: 10px;\">" . $ressourcetext . "</div>";	
    		}	
     
     	}
    	}  
       
       // ARTICLESBASE
       if($source == "articlesbase") {
       
       /* Select Proxy
    	$proxy ="";
    	$burl = get_bloginfo('url');;
    	$arr=@file("$burl/wp-content/plugins/WPRobot/modules/proxies.txt");
    	if($arr) {
    		$noprox = count($arr) - 1;
    		$rprox = rand(0,$noprox);
    		list($proxy,$proxytype,$proxyuser)=explode("|",$arr[$rprox]);
    	}
    	*/
    
    		$page = $num / 15;
    		$page = (string) $page; 
    		$page = explode(".", $page);	
    		$page=(int)$page[0];	
    		$page++;	
    	
    		if($page == 0) {$page = 1;}
    		$prep = floor($num / 15);
    		$numb = $num - $prep * 15;
    
    	
    	/*
    	$numb = $num;
    	$num = $num / 15;
    	$num = (string) $num; 
    	$num = explode(".", $num);
    	$page=(int)$num[0];	
    	$page++;				
    	$cnum=(int)$num[1]; 
    	$l = $page;
    	$sk = $cnum;*/
    	
    	$lang = get_option('ma_eza_lang');
    	if($lang == "en") {
    		$search_url = "http://www.articlesbase.com/find-articles.php?q=$keyword&page=$page";
    	} elseif($lang == "fr") {
    		$search_url = "http://fr.articlesbase.com/find-articles.php?q=$keyword&page=$page";	
    	} elseif($lang == "es") {
    		$search_url = "http://www.articuloz.com/find-articles.php?q=$keyword&page=$page";
    	} elseif($lang == "pg") {
    		$search_url = "http://www.artigonal.com/find-articles.php?q=$keyword&page=$page";
    	} elseif($lang == "ru") {
    		$search_url = "http://www.rusarticles.com/find-articles.php?q=$keyword&page=$page";
    	}
    
    	// make the cURL request to $search_url
    	$ch = curl_init();
    	curl_setopt($ch, CURLOPT_USERAGENT, 'Firefox (WindowsXP) - Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6');
    	curl_setopt($ch, CURLOPT_URL,$search_url);
    	curl_setopt($ch, CURLOPT_FAILONERROR, true);
    	curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    	/*
    Proxy
    	if($proxy != "") {
    	curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1); 
    	curl_setopt($ch, CURLOPT_PROXY, $proxy);
    	if($proxyuser) {curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxyuser);}
    	if($proxytype == "socks") {curl_setopt ($ch, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5);}
    	}
    */
    	curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
    	curl_setopt($ch, CURLOPT_TIMEOUT, 45);
    	$html= curl_exec($ch);
    	if (!$html) {
    		echo "<br />cURL error number:" .curl_errno($ch);
    		echo "<br />cURL error:" . curl_error($ch);
    	}
    	curl_close($ch); 	
    	//$html = file_get_contents($search_url);
    
    	// parse the html into a DOMDocument  
    
    		$dom = new DOMDocument();
    		@$dom->loadHTML($html);
    
    	// Grab Product Links  
    
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div//h3/a");
    
    		$para = $paras->item($numb);
    		if ($para === null) {
    			//echo '<div class="updated"><p>No articles found!</p></div>';
    			return "nothing";
    		} else {		
    	if($lang == "en") {
    		$target_url = $para->getAttribute('href'); // $target_url = "http://www.articlesbase.com" . $para->getAttribute('href');
    	} elseif($lang == "fr") {
    		$target_url = $para->getAttribute('href'); // $target_url = "http://fr.articlesbase.com" . $para->getAttribute('href');	
    	} elseif($lang == "es") {
    		$target_url = $para->getAttribute('href'); // $target_url = "http://www.articuloz.com" . $para->getAttribute('href');	
    	} elseif($lang == "pg") {
    		$target_url = $para->getAttribute('href'); // $target_url = "http://www.artigonal.com" . $para->getAttribute('href');	
    	} elseif($lang == "ru") {
    		$target_url = $para->getAttribute('href'); // $target_url = "http://www.rusarticles.com" . $para->getAttribute('href');	
    	}		
    
    	// make the cURL request to $search_url
    	$ch = curl_init();
    	curl_setopt($ch, CURLOPT_USERAGENT, $ua);
    	curl_setopt($ch, CURLOPT_URL,$target_url);
    	curl_setopt($ch, CURLOPT_FAILONERROR, true);
    	curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    	/*
    	 Proxy
    	if($proxy != "") {
    	curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1); 
    	curl_setopt($ch, CURLOPT_PROXY, $proxy);
    	if($proxyuser) {curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxyuser);}
    	if($proxytype == "socks") {curl_setopt ($ch, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5);}
    	}	
    	*/
    	curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
    	curl_setopt($ch, CURLOPT_TIMEOUT, 45);
    	$html= curl_exec($ch);
    	if (!$html) {
    		echo "<br />cURL error number:" .curl_errno($ch);
    		echo "<br />cURL error:" . curl_error($ch);
    		exit;
    	}
    	curl_close($ch);
    
    	// parse the html into a DOMDocument  
    
    		$dom = new DOMDocument();
    		@$dom->loadHTML($html);
    		
    
    	// Grab Article Title 
    
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div/h1");
    		
    		$para = $paras->item(0);
    		$title = $para->textContent;		
    		$title2 = $title;
    		
    		if (function_exists('ma_translate') && get_option('ma_trans_title') == 1 && get_option('ma_trans_article') == 1) {$title = ma_translate($title2);}			
    
    						
    	// Grab Article	
    	
    	if (get_option('ma_eza_grabmethod')=='old') {
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div[@class='article_cnt KonaBody']//p"); 
    
    		for ($i = 0;  $i < $paras->length; $i++ ) {  //$paras->length
    
    			$para = $paras->item($i);
    			$paragraph = $para->textContent;
    			
    			if ($paragraph != '') {
    					if (function_exists('ma_translate') && get_option('ma_trans_article') == 1) {$paragraph = ma_translate($paragraph);}
    			
    				$content .= $paragraph . ' ';
    				$content .= "<br/><br/>";
    			}
    		}		
    	} elseif (get_option('ma_eza_grabmethod')=='new') {
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div[@class='article_cnt KonaBody']"); 
    		$para = $paras->item(0);		
    		$string = $dom->saveXml($para);	
    
    		$string = strip_tags($string,'<p><strong><b><a><br>');
    		$string = str_replace('<div class="KonaBody">', "", $string);	
    		$string = str_replace("</div>", "", $string);			
    		if (function_exists('ma_translate') && get_option('ma_trans_article') == 1) {$string = ma_translate($string);}
    
    		$content .= $string . ' ';
    	}
    	
    	// Grab Ressource Box	
    
    		$xpath = new DOMXPath($dom);
    		$parax = $xpath->query("//div[@class='author_details']/p");
    		//$para = $paras->item(0);		
    		//$ressourcetext = $dom->saveXml($para);	
    		
    		for ($i = 0;  $i < $parax->length; $i++ ) {  //$paras->length
    			$parac = $parax->item($i);
    			$ressourcetext .= $dom->saveXml($parac);	
    		}			
    
    		if (function_exists('ma_translate') && get_option('ma_trans_articlebox') == 1) {$ressourcetext = ma_translate($ressourcetext);}
    		
    		if ($ressourcetext != '') {
    		$authorbox = "<div style=\"margin:5px;padding:5px;border:1px solid #c1c1c1;font-size: 10px;\">" . $ressourcetext . "</div>";	
    		}	
    
    }		
    
    }
    		$textc = $content;
    		//$textc=substr_replace($textc, "...<!--more-->", 100, 0);
    		
    		//$textc = htmlspecialchars($textc, ENT_QUOTES);
    		if($lang == "es") {
    			//$textc = utf8_decode($textc);	
    		}
    		$authorbox = utf8_decode($authorbox);
    		$title = utf8_decode($title);
    		$content = get_option( 'ma_eza_template'); //'{thumbnail}{description}{link}');;	
    						// Clickbank		
    						$pos = strpos($content, "{clickbank}");		
    						if ($pos === false) {
    						} else {
    							$cbad = ma_getclickbank($keyword,"no");
    							if($cbad[4] != "") {
    								$content = str_replace("{clickbank}", $cbad[4], $content);							
    							} else {	
    								$content = str_replace("{clickbank}", "", $content);								
    							}							
    						}				
    						// Youtube			
    						$pos = strpos($content, "{video}");		
    						if ($pos === false) {
    						} else {
    							$vid = ma_getvideo($keyword2,1,0);
    							if($vid[8] != "") {
    								$content = str_replace("{video}", $vid[8], $content);							
    							} else {	
    								$content = str_replace("{video}", "", $content);								
    							}							
    						}		
    						// Flickr
    						preg_match('#\{image(.*)\}#iU', $content, $matches);
    						if ($matches[0] == false) {
    						} else {
    							if($matches[1] != false ) {$imgkeyword = substr($matches[1], 1);} else {$imgkeyword = $keyword;}
    					
    							$img = ma_getimage($imgkeyword,1,0);
    							if($img[4] != "" && $img[4] != "i") {
    								$image = '<img style="float:left;margin: 0 20px 10px 0;" src="'.$img[4].'" width="'.get_option("ma_fl_twidth").'" />';
    								$content = str_replace("{date}", $img[1], $content);
    								$content = str_replace("{owner}", $img[2] , $content);
    								$content = str_replace("{largeimage}", $img[6], $content);
    								$content = str_replace($matches[0], $image, $content);
    								$fllink = 'http://www.flickr.com/photos/'.$img[7].'/'.$img[8];
    								$content = str_replace("{imageurl}", $fllink, $content);
    							} else {
    								$content = str_replace("{date}", "", $content);
    								$content = str_replace("{owner}", "", $content);
    								$content = str_replace("{largeimage}","", $content);								
    								$content = str_replace($matches[0], "", $content);
    								$content = str_replace("{imageurl}", "", $content);								
    							}
    						}	
    						// eBay
    						preg_match('#\{auction(.*)\}#iU', $content, $matches);
    						if ($matches[0] == false) {
    						} else {
    							if($matches[1] != false ) {$aucnum = substr($matches[1], 1);} else {$aucnum = 1;}
    							$content = str_replace($matches[0], '[eba kw="'.$keyword2.'" num="'.$aucnum.'" ebcat=""]', $content);						
    						}	
    		$content = str_replace("{article}", $textc, $content);			
    		$content = str_replace("{authorbox}", $authorbox, $content);	
    		$content = str_replace("{keyword}", $keyword2, $content);	
    		$content = str_replace("{url}", $target_url, $content);	
    		
    		$insert = ma_insertpost($content,$title,$cat);			
    		if ($insert == false) {return false;} else {return true;} //ma_post($which);		
    
    }
    
    function ma_eza_options() {
    ?>
    		<table width="100%" cellspacing="2" cellpadding="5" class="editform"> 
    			<tr valign="top"> 
    				<td width="30%" scope="row">Article Source:</td> 
    				<td>
    				<select name="ma_eza_source" id="ma_eza_source">
    					<option value="articlesbase" <?php if (get_option('ma_eza_source')=='articlesbase') {echo 'selected';} ?>>Articlesbase.com</option>
    					<option value="sooperarticles" <?php if (get_option('ma_eza_source')=='sooperarticles') {echo 'selected';} ?>>Sooperarticles.com</option>
    				</select>
    				</td> 
    			</tr>			
    			<tr valign="top"> 
    				<td width="30%" scope="row">Article Formatting Method:</td> 
    				<td>
    				<select name="ma_eza_grabmethod" id="ma_eza_grabmethod">
    					<option value="new" <?php if (get_option('ma_eza_grabmethod')=='new') {echo 'selected';} ?>>Leave Formatting Intact</option>
    					<option value="old" <?php if (get_option('ma_eza_grabmethod')=='old') {echo 'selected';} ?>>Replace Formatting</option>
    				</select> <a href="http://wprobot.net/documentation/#34"><b>?</b></a>
    				</td> 
    			</tr>	
    			<tr valign="top"> 
    				<td width="30%" scope="row">Article Language:</td> 
    				<td>
    				<select name="ma_eza_lang" id="ma_eza_lang">
    					<option value="en" <?php if(get_option('ma_eza_lang')=="en"){_e('selected');}?>>English</option>
    					<option value="fr" <?php if(get_option('ma_eza_lang')=="fr"){_e('selected');}?>>French</option>
    					<option value="es" <?php if(get_option('ma_eza_lang')=="es"){_e('selected');}?>>Spanish</option>
    					<option value="pg" <?php if(get_option('ma_eza_lang')=="pg"){_e('selected');}?>>Portuguese</option>
    					<option value="ru" <?php if(get_option('ma_eza_lang')=="ru"){_e('selected');}?>>Russian</option>
    				</select>
    			</td> 
    			</tr>				
    			<tr valign="top"> 
    				<td width="30%" scope="row">Post Template:</td> 
    				<td>			
    			<textarea name="ma_eza_template" rows="2" cols="30"><?php echo get_option('ma_eza_template');?></textarea>	
    				</td> 
    			</tr>					
    		</table>
    <?php
    }
    ?>
    
    PHP:
     
    smashedpumpkins, Mar 27, 2010 IP
  2. Alex Roxon

    #2
    I guess you could try preg_replace. Something like:

    $Content = preg_replace( '/\<a href\=[\"\\\'].*?\<\/a\>/i', '', $Content);
    PHP:
     
    Alex Roxon, Mar 27, 2010 IP
  3. hugsbunny

    #3
    I think you can use strip_tags and pass it the list of tags you want to keep, leaving <a> out of that list. I don't remember the exact syntax, but I think it will work.
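    A minimal sketch of that idea, assuming a made-up sample string rather than real plugin output: strip_tags() takes the tags to keep as its second argument, so leaving <a> out of that list removes the link tags while the link text stays.

```php
<?php
// Hypothetical sample input; not taken from the plugin's real output.
$html = '<p>Read <a href="http://example.com">this <b>guide</b></a> first.</p>';

// The second argument lists the tags to KEEP. <a> is not in the list,
// so the link tags are removed while their inner text survives.
$clean = strip_tags($html, '<p><strong><b><br>');

echo $clean; // <p>Read this <b>guide</b> first.</p>
```

    The articlesbase branch of the posted file already calls strip_tags($string,'<p><strong><b><a><br>'), so dropping <a> from that list may be all that branch needs.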
     
    hugsbunny, Mar 27, 2010 IP
  4. JAY6390

    #4
    $content = preg_replace('%</?a\b[^>]*>%', '', $content);
    PHP:
    I haven't read your code, but that is how you do it. Just replace $content with your content variable.
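    For illustration, a small sketch of what that pattern does, using a made-up sample string (an assumption, not taken from the plugin): it deletes the opening and closing <a> tags and leaves the link text and other formatting alone. The \b keeps it from touching tags that merely start with "a", such as <abbr>.

```php
<?php
// Hypothetical sample input for illustration only.
$content = 'Visit <a rel="nofollow" href="http://example.com"><strong>our site</strong></a>'
         . ' for <abbr title="frequently asked questions">FAQ</abbr>.';

// Remove <a ...> and </a> tags only; everything between them is kept.
$content = preg_replace('%</?a\b[^>]*>%', '', $content);

echo $content;
// Visit <strong>our site</strong> for <abbr title="frequently asked questions">FAQ</abbr>.
```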
     
    JAY6390, Mar 27, 2010 IP
  5. smashedpumpkins

    #5
    Great, I really appreciate it! I'll try it out and report back. Thanks
     
    smashedpumpkins, Mar 27, 2010 IP
  6. danx10

    #6
    $content = preg_replace('%</?a\b[^>]*>%', '', $content);
    PHP:
     
    danx10, Mar 27, 2010 IP
  7. smashedpumpkins

    #7
    I went ahead and tried all three suggested snippets, but none worked. I tried them on both the content variable and the paragraph variable, and the links always remained intact. Looking at my example, can you see why this might not work? I wish I could be of more help. I really appreciate the help you're all offering.
     
    smashedpumpkins, Mar 28, 2010 IP
  8. smashedpumpkins

    #8
    They're taken from an open and free article database.

    I've tried using the following with no success.
    $content = preg_replace('%</?a\b[^>]*>%', '', $content);
    PHP:
    I noticed the script has a similar option for other features. I've tried to copy it over, but I can't figure it out. I've used the below code with both paragraph and content, but the links still exist! Any ideas based on this code?

    
    $paragraph = ma_strip_selected_tags($paragraph, array('a','iframe','script'));
    PHP:
    
    function ma_strip_selected_tags($text, $tags = array()) {
        $args = func_get_args();
        $text = array_shift($args);
        $tags = func_num_args() > 2 ? array_diff($args,array($text))  : (array)$tags;
        foreach ($tags as $tag){
            while(preg_match('/<'.$tag.'(|\W[^>]*)>(.*)<\/'. $tag .'>/iusU', $text, $found)){
                $text = str_replace($found[0],$found[2],$text);
            }
        }
        return preg_replace('/(<('.join('|',$tags).')(|\W.*)\/>)/iusU', '', $text);
    }
    PHP:
     
    smashedpumpkins, Mar 28, 2010 IP
  9. danx10

    #9
    Try:

    $content = preg_replace('~<(a.*)href=(?:"|\')(.*?)(?:"|\')(.*)</a>~i', '', $content);
    PHP:
     
    Last edited: Mar 28, 2010
    danx10, Mar 28, 2010 IP
  10. JAY6390

    #10
    The regex I gave works perfectly well, so the content you're passing to it must not be what you expect.
     
    JAY6390, Mar 28, 2010 IP
  11. smashedpumpkins

    #11
    As you can see here, I used the preg_replace code (I've tried every example you've all given me). The $paragraph variable holds all of the content. I don't understand why it's not working either. It's becoming a real pain. I appreciate the help.
    
    		$xpath = new DOMXPath($dom);
    		$paras = $xpath->query("//div[@id='KonaBody']//p"); 
    
    		for ($i = 0;  $i < $paras->length; $i++ ) {  //$paras->length
    
    			$para = $paras->item($i);
    			$paragraph = $para->textContent;
    			$paragraph = preg_replace('~<(a.*)href=(?:"|\')(.*?)(?:"|\')(.*)</a>~i', '', $paragraph);
    			
    			if ($paragraph != '') {
    					if (function_exists('ma_translate') && get_option('ma_trans_article') == 1) {$paragraph = ma_translate($paragraph);}
    				$content .= $paragraph . ' ';
    				$content .= "<br/><br/>";
    			}
    		}
    
    PHP:
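    One likely reason the regex has nothing to match in this spot: textContent returns plain text, so the <a> tags are already gone from $paragraph before preg_replace runs. The links that end up in posts probably come from the other branch of the file, which keeps markup via saveXml. A hedged sketch, using made-up sample markup, that shows this and one alternative that unwraps <a> elements in the DOM instead of using a regex:

```php
<?php
// Hypothetical markup standing in for the scraped article page.
$html = '<div id="KonaBody"><p>See <a href="http://example.com">our <b>site</b></a> now.</p></div>';

$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$para  = $xpath->query("//div[@id='KonaBody']//p")->item(0);

// textContent is plain text: the <a> and <b> tags are already stripped here.
$plain = $para->textContent; // "See our site now."

// To keep formatting but drop links, unwrap every <a> inside the paragraph.
$links = iterator_to_array($xpath->query('.//a', $para)); // snapshot before editing
foreach ($links as $a) {
    while ($a->firstChild) {
        // Move each child of the link in front of the link itself.
        $a->parentNode->insertBefore($a->firstChild, $a);
    }
    $a->parentNode->removeChild($a); // drop the now-empty <a>
}

$markup = trim($dom->saveHTML($para)); // <p>See our <b>site</b> now.</p>
echo $markup;
```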
     
    smashedpumpkins, Mar 28, 2010 IP
  12. danx10

    #12
    Reply with the HTML source that contains the links you want replaced. You can get the source with:

    highlight_string($content);
    PHP:
    and then copying the output.
     
    danx10, Mar 29, 2010 IP
  13. smashedpumpkins

    #13
    I've finally had some success! I've probably spent over 20 hours on this issue. I found two different spots where I have to place the code, because there's a duplicate location for another article source. Anyhow, it's working! I did find one article that snuck by with some links. I checked out the HTML code, and here it is. I'm using the PHP code below from danx10. Would I need different code to get rid of a link like this?

    Here's the HTML code for the link.
    <a onclick="javascript:pageTracker._trackPageview('/outgoing/article_exit_link');" rel="nofollow" href="http://www.datingonlinesingles.blogspot.com"><strong>click here to join the best free dating site.</strong>.</a>
    
    Code (markup):
    Here's the PHP code I'm using.
    $content = preg_replace('~<(a.*)href=(?:"|\')(.*?)(?:"|\')(.*)</a>~i', '', $content);
    PHP:

    Oh, and can you possibly help with one other thing? I've found that many articles have links that aren't active, so instead of clicking a link the user has to copy and paste it. How can I use preg_replace to search for, let's say, www?

    EDIT: After looking at the code I'm assuming you made it more detailed by adding in href? I'm going to try your previous example as well to see if it'll clear them all. EDIT: Didn't work. Same issue as above.
    $content = preg_replace('%</?a\b[^>]*>%', '', $content);
    PHP:
     
    Last edited: Mar 29, 2010
    smashedpumpkins, Mar 29, 2010 IP
  14. danx10

    #14
    This would work:

    $content = preg_replace('~<(a.*)href=(?:"|\')(.*?)(?:"|\')(.*)</a>~i', '', $content);
    PHP:
    But it could easily be shortened :) to:

    $content = preg_replace('~<(a.*)href=(.*?)</a>~i', '', $content);
    PHP:
    Proof that it works:

    <?php
    
    //input
    $content = <<<eof
    Test text containing urls..
    <a onclick="javascript:pageTracker._trackPageview('/outgoing/article_exit_link');" rel="nofollow" href="http://www.datingonlinesingles.blogspot.com"><strong>click here to join the best free dating site.</strong>.</a>
    
    <a href="http://digitalpoint.com">Test Url</a>
    eof;
    
    
    $content = preg_replace('~<(a.*)href=(.*?)</a>~i', '', $content);
    
    //output - Test text containing urls..
    echo $content;
    
    ?>
    PHP:

    Could you elaborate? Give me an example of an inactive link and what you'd want it to look like (removed, replaced, or formatted in a specific way?).
     
    danx10, Mar 29, 2010 IP
  15. smashedpumpkins

    #15
    It works! I can now remove all links! I found a better spot to place it and it worked. WOW finally! Now I have 2 questions.

    I'd like to remove links without A HREF tags. How can I go about removing these links? For example, the link below does not use A HREF tags. Therefore it won't be removed from the article.
    http://www.google.com
    Code (markup):
    This link has A HREF tags and will be removed.
    [url]http://www.google.com[/url]
    Code (markup):
    Second, is there a way to remove the link tags but keep the words? For example, instead of the link below it would just say Google.
    Google
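    For the first question, one possible approach is a second pass that removes bare URLs from the text. This is only a sketch under assumptions: the sample sentence is made up, the pattern treats anything starting with http://, https:// or www. as a URL, and it should run after the <a> tags have been stripped, otherwise it would also eat the URLs inside href attributes.

```php
<?php
// Hypothetical sample text containing links that are not wrapped in <a> tags.
$content = 'Search at http://www.google.com or www.example.org/page today.';

// Remove anything that looks like a bare URL (http://, https:// or www.).
$content = preg_replace('~\b(?:https?://|www\.)[^\s<>"\']+~i', '', $content);

// Collapse the double spaces left behind by the removed URLs.
$content = preg_replace('~ {2,}~', ' ', $content);

echo $content; // Search at or today.
```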
     
    smashedpumpkins, Mar 29, 2010 IP
  16. K.Meier

    #16
    That would be the explode function you are looking for. Look at the examples and see if you can come up with something. Shouldn't be that hard.
     
    K.Meier, Mar 29, 2010 IP
  17. danx10

    #17
    @smashedpumpkins

    You can use the following regex:

    $content = preg_replace('~<(a.*)href=(.*?)>(.*)</a>~i', '$3', $content);
    PHP:
    This should remove all a href links; if they contain link text, the link text will remain but the link will be removed. Furthermore, non a href links won't be touched (and therefore will not be affected, so you don't need a regex for that).

    Example:
    <?php
    
    //input
    $content = <<<eof
    <!--non a href tag example...-->
    
    http://www.google.com
    
    <!--a href tag containing link text example...-->
    
    <a href="http://www.google.com" target="_blank">Google</a>
    eof;
    
    
    $content = preg_replace('~<(a.*)href=(.*?)>(.*)</a>~i', '$3', $content);
    
    //output - http://www.google.com Google
    echo $content;
    
    ?>
    PHP:
     
    danx10, Mar 30, 2010 IP
  18. stOK

    #18
    danx10, please stop suggesting crap. Better to spend some time doing homework and studying why your code works on only a few examples.

    JAY6390 already gave an almost perfect solution, except that the regexp should be case-insensitive.

    
    $content = preg_replace('%</?a\b[^>]*>%i', '', $content);
    
    Code (markup):
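    To make the difference concrete, here is a small comparison on a made-up input (an assumption for illustration) with two links on one line, one of them written in upper case. The greedy pattern matches from the first <a to the last </a> on the line and swallows the text in between, while the tag-only pattern removes just the tags.

```php
<?php
// Hypothetical input: two links on the same line, the second in upper case.
$content = 'See <a href="http://a.example">one</a> and <A HREF="http://b.example">two</A> here.';

// Greedy pattern suggested earlier in the thread: (a.*) runs to the LAST href=
// on the line, so the first link and the text between the links are lost.
$greedy = preg_replace('~<(a.*)href=(.*?)>(.*)</a>~i', '$3', $content);

// Tag-only pattern with the i flag: strips each <a ...> and </a>, nothing else.
$tagOnly = preg_replace('%</?a\b[^>]*>%i', '', $content);

echo $greedy, "\n";  // See two here.
echo $tagOnly, "\n"; // See one and two here.
```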
     
    stOK, Mar 30, 2010 IP
  19. JAY6390

    #19
    In fairness, the regex shouldn't need to be case-insensitive; any valid HTML would have lowercase <a> tags (which is why I didn't put the insensitive flag on it to begin with).
     
    JAY6390, Mar 30, 2010 IP
  20. stOK

    #20
    Wrong. You might mean XHTML. HTML element names are case-insensitive.
     
    stOK, Mar 30, 2010 IP