How to crawl a website?

Discussion in 'PHP' started by krbsn, Dec 12, 2008.

  1. #1
    Hi

    I have some questions because I don't know PHP and MySQL well. I'm really new.
    Please help me.


    I have read every thread on DigitalPoint about this subject, but I still can't get it to work.

    <?php
    $host="localhost";
    $uname="_user";
    $pass="user12";
    $database="db123";
    $connection=mysql_connect($host,$uname,$pass)
    	or die ("Could not connect to MySQL: ".mysql_error());
    $result=mysql_select_db($database)
    	or die ("Could not connect to the database");
    
    
    // Set the address here
    $adres="http://www.xxx.com/"; 
    $cikti[0]="http://www.xxx.com/631540/some_page/"; 
    
    do { 
    // Get links
    $cikti=@file_get_contents($adres.($cikti[0]));   
    $cikti=explode ('         <a href="', $cikti);   
    $cikti=explode ('"', $cikti[1]);   
    
    // Get Topic
    $cikti2=@file_get_contents($adres.($cikti[0]));  
    $cikti2=explode ('class="itiraftitle">', $cikti2);   
    $cikti2=explode ('</span>', $cikti2[1]); 
    
    // Get content (story)
    $cikti3=@file_get_contents($adres.($cikti[0]));  
    $cikti3=explode (' lblText">', $cikti3);   
    $cikti3=explode ('</span></div>', $cikti3[1]); 
    
    
    
    // insert all 
    mysql_query("Insert Into hikayeler (topic,story,active,ekleyen,tarih) values ('$cikti2[0]','$cikti3[0]','1','0',now())"); 
    } 
    // Continue with the next one
    while($cikti[0]!==""); 
    
    
    ?>
    PHP:


    This is my crawler, but it doesn't work.

    Here are the details:

    My table: hikayeler
    Columns: id, topic, story


    I only want to fill these:

    id = auto increment
    topic = the text between class="itiraftitle"> and </span></a> (e.g. Test Topic)
    story = the text between lblText"> and </span> (e.g. Test Story)



    How can I do this? Or is there another way?


    Thanks for any comments.
     
    krbsn, Dec 12, 2008
  2. javaongsan

    #2
    You need to extract the links from the content first; you can't just explode an entire HTML page.
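
    A minimal sketch of pulling the links out first, assuming PHP's built-in DOM extension is available. The helper name extract_links and the sample HTML are made up for illustration; they are not from the original script.

```php
<?php
// Hypothetical helper: return every distinct href found in an HTML string.
function extract_links(string $html): array
{
    // DOMDocument::loadHTML() rejects an empty string, so bail out early.
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so collect parser warnings
    // internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href !== '') {
            $links[] = $href;
        }
    }
    // Drop duplicates and reindex from 0.
    return array_values(array_unique($links));
}

// Sample markup (an assumption, only loosely modelled on the URL in the question).
$sample = '<div><a href="/631540/some_page/">one</a> <a href="/631541/other/">two</a></div>';
print_r(extract_links($sample));
```

    Each returned link can then be fetched one at a time, instead of exploding the whole page on a fixed run of spaces before the a tag.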
     
    javaongsan, Dec 12, 2008
  3. izwanmad

    #3
    Crawl a website? You mean get the content of another website? Try this:

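    // Open the remote page as a stream (requires allow_url_fopen to be enabled).
    $fp = fopen("http://www.google.com/", "r");

    if (!$fp)
    {
        echo "Connection Failed.";
        exit;
    }

    // Read the response line by line into one string.
    $strlist = "";
    while (!feof($fp))
    {
        $line = fgets($fp);
        $strlist .= $line;
    }

    fclose($fp);

    echo $strlist;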
    PHP:
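
    Once the page is in a string, the topic and story from the first post could be cut out roughly like this. It is only a sketch: text_between is a hypothetical helper, and $page is a made-up sample built from the class="itiraftitle"> and lblText"> markers mentioned in the question, not the real site's markup.

```php
<?php
// Hypothetical helper: return the trimmed text between $start and $end,
// or null when either marker is missing.
function text_between(string $html, string $start, string $end): ?string
{
    $from = strpos($html, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);

    // Look for the end marker only after the start marker.
    $to = strpos($html, $end, $from);
    if ($to === false) {
        return null;
    }
    return trim(substr($html, $from, $to - $from));
}

// Made-up sample page (an assumption about the markup).
$page = '<a><span class="itiraftitle"> Test Topic </span></a>'
      . '<div><span id="lblText"> Test Story </span></div>';

$topic = text_between($page, 'class="itiraftitle">', '</span>');
$story = text_between($page, 'lblText">', '</span>');

var_dump($topic, $story);
```

    With the sample above, $topic is "Test Topic" and $story is "Test Story". Checking for null avoids inserting empty rows when a page has no match, and the values should be escaped (or bound through a prepared statement) before they go into the INSERT, because a story containing an apostrophe would otherwise break the query.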
     
    izwanmad, Dec 12, 2008