Find a phrase on a given webpage

Discussion in 'PHP' started by papa_face, Dec 22, 2006.

Thread Status:
Not open for further replies.
  1. #1
    Hello,
    I was wondering if someone could help me. I want to find something in particular on a webpage. For example, I would like to enter a URL and have PHP search for a given phrase on the page, and then return the text that follows it as well.
    For example (again):
    I would like to find the price of an item listed at Amazon.com. I would enter the URL for that item via a form. PHP would then return the price of the item and the text I was searching for.
    Search for the phrase "Price:"
    http://www.amazon.com/gp/product/B000BK35M4/ref=amb_link_3938752_1/103-7676544-5510231
    Would return to me: Price: $1,645.00.

    Is this possible? Sorry for the bad explanation lol.
     
    papa_face, Dec 22, 2006 IP
  2. krakjoe

    krakjoe Well-Known Member

    Messages:
    1,795
    Likes Received:
    141
    Best Answers:
    0
    Trophy Points:
    135
    #2
    Yeah, it would be, but if you know the URL and you know the data you're looking for, why wouldn't you just browse to the page?
     
    krakjoe, Dec 22, 2006 IP
  3. papa_face

    papa_face Notable Member

    Messages:
    2,237
    Likes Received:
    67
    Best Answers:
    1
    Trophy Points:
    285
    #3
    I want to display the price (or whatever I choose to search for) on a page on my website, for a lot of URLs.
     
    papa_face, Dec 22, 2006 IP
  4. drewbe121212

    drewbe121212 Well-Known Member

    Messages:
    733
    Likes Received:
    20
    Best Answers:
    0
    Trophy Points:
    125
    #4
    Well, for starters, if you really want data from amazon.com, I would just recommend their API. You can get all the information about a product, a category, or anything else they have released. Really cool thing.

    If it is any webpage....
    
    $webpage = file_get_contents("http://www.webpageurl");
    
    // What are we looking for?
    
    $look_for = "Search For This";
    
    if (strpos($webpage,$look_for) !== false) // strpos returns 0 for a match at the very start, so compare against false
    {
        echo "Found it!";
    }
    else
    {
        echo "Not Found.";
    }
    
    PHP:
    If you wanted to retrieve a dynamic value, you would have to look at the page's HTML source and find the markup that surrounds what you are looking for.

    For instance, if you wanted the price of $99.99, you would look in the HTML and might find something like this around it:
    <span class="price">$99.99</span>

    That is a little more advanced and will require regular expressions and preg_match. If you want Amazon.com's data, just use the API. Much easier and more efficient, IMO.
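
A minimal sketch of that preg_match idea, assuming the page really does wrap the price in `<span class="price">`. The helper name `find_price` and the sample markup are made up for illustration; a real page would come from file_get_contents().

```php
<?php
// Hypothetical helper: returns the text inside <span class="price">...</span>,
// or null when that markup is not present.
function find_price($html)
{
    // (.*?) captures as little as possible, so the match stops at the first </span>.
    if (preg_match('/<span class="price">(.*?)<\/span>/s', $html, $matches)) {
        return trim($matches[1]);
    }
    return null;
}

// Sample markup standing in for a downloaded page.
$sample = '<p>Our price: <span class="price">$99.99</span></p>';
echo find_price($sample); // $99.99
```

If the site changes its markup, the pattern stops matching and the helper returns null, so it is worth checking for that before displaying anything.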
     
    drewbe121212, Dec 22, 2006 IP
  5. drewbe121212

    #5
    drewbe121212, Dec 22, 2006 IP
  6. papa_face

    #6
    Hello,
    Thank you for your help so far. I think I am going to need preg_match, as it won't be Amazon that I will be searching.

    regards
     
    papa_face, Dec 23, 2006 IP
  7. phree_radical

    phree_radical Peon

    Messages:
    563
    Likes Received:
    18
    Best Answers:
    0
    Trophy Points:
    0
    #7
    I agree with drewbe's example, it puts you on the right track. The function
    strpos()
    PHP:
    returns a number that indicates the position within a string (haystack) where the search term (needle) was found. Some examples can be found at the link above. Here's an example from the context of this conversation so far:

    Let's say you study the html source for the "target" webpage and you find that the price always looks something like this:
    Price: <span>$2.50</span>
    HTML:
    Then if you are working with drew's script and change the strpos call to
    $offset = strpos($webpage,'Price: <span>');
    PHP:
    The text that you searched for is 13 characters in length, so if strpos returns 1842 into $offset, you will know that the price is at the offset 1855.

    You can use the function
    substr()
    PHP:
    to grab text from a desired offset in the string once you know where to look for it. Making sense? Click on the function names in the php code to consult the manual, it will rock you!
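
Here is a rough sketch of those two steps together. The sample string stands in for the downloaded page, and the length of 5 is only an assumption that fits this particular price.

```php
<?php
// Sample markup standing in for the downloaded page.
$webpage = '<html><body>Price: <span>$2.50</span></body></html>';

$needle = 'Price: <span>';
$offset = strpos($webpage, $needle);

if ($offset === false) {
    $snippet = null; // the marker text is not on the page
} else {
    // Skip past the marker itself (13 characters), then take the next 5.
    $start = $offset + strlen($needle);
    $snippet = substr($webpage, $start, 5);
}

echo $snippet; // $2.50
```

Using strlen() on the search text saves counting the 13 characters by hand.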
     
    phree_radical, Dec 23, 2006 IP
  8. papa_face

    #8
    Hello,
    Thanks for your help.
    <?php
    
    $webpage = file_get_contents("http://www.amazon.com/gp/product/B000BK35M4/ref=amb_link_3938752_1/103-7676544-5510231");
    
    $find = "Price:";
    
    $offset = strpos($webpage,$find);
    
    echo substr($webpage,$offset,40); ?>
    Code (markup):
    That is what I have, but it only returns "Price:", what am I doing wrong?

    regards

    edit: the above code now works :D thanks for your help!
     
    papa_face, Dec 23, 2006 IP
  9. papa_face

    #9
    Is it possible to search the content displayed on the page, rather than the source code? The reason I ask is that the price obviously changes depending on the page. So if a price has fewer characters than the length I grab, I also get the HTML tags that come after it in the code.
     
    papa_face, Dec 23, 2006 IP
  10. phree_radical

    #10
    The mistake is passing $find, which only contains "Price:" still, to substr(). You'll want to use the $offset of $find to fetch a substring from out of the $webpage. Something like this:
    
    <?php
    $webpage = file_get_contents("http://www.amazon.com/gp/product/B000BK35M4/ref=amb_link_3938752_1/103-7676544-5510231");
    
    $find = "Price:";
    
    $offset = strpos($webpage,$find);
    
    $buffer = substr($webpage,$offset,40); // fetch 40 characters starting with "Price:"...
    ?>
    
    PHP:
    Most likely the actual price will be of an unpredictable length... it could be 2.00 (4 characters), or 20.00 (5 characters)... You'll probably want to grab the price along with a little extra on the end using the first call to substr(), then use strpos to search within that for the text that comes after the numbers. You'll then have an offset to where the numbers begin, and an offset to where they end. You can use the difference between the two as the length for a second and final call to substr().
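
A sketch of the two-offset idea, with one simplification: instead of grabbing a 40-character buffer first, it searches for the end marker starting from the first offset. The helper name `text_between` and the sample string are invented for the example.

```php
<?php
// Hypothetical helper: returns the text between $before and $after,
// or null if either marker is missing.
function text_between($haystack, $before, $after)
{
    $start = strpos($haystack, $before);
    if ($start === false) {
        return null;
    }
    $start += strlen($before); // first character after the opening marker

    $end = strpos($haystack, $after, $start); // only search past the start
    if ($end === false) {
        return null;
    }
    // The difference between the two offsets is the length to extract.
    return substr($haystack, $start, $end - $start);
}

$sample = 'Price: <span>$20.00</span> In stock';
echo text_between($sample, 'Price: <span>', '</span>'); // $20.00
```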

    This is my approach as a former C programmer; I'm willing to bet there's a more efficient method. But I wouldn't really hesitate to do it that way.

    Have a look, and if you get frustrated, I'll try to help finish off the example.
     
    phree_radical, Dec 23, 2006 IP
  11. papa_face

    #11
    I've looked into ereg_replace:
    ereg_replace("[[:alpha:]]", "", $here);
    Code (markup):
    How can I get it to remove all characters except numbers?
    The currency symbol doesn't matter to me.
     
    papa_face, Dec 23, 2006 IP
  12. Barti1987

    Barti1987 Well-Known Member

    Messages:
    2,703
    Likes Received:
    115
    Best Answers:
    0
    Trophy Points:
    185
    #12
    
    
    $url = 'http://www.amazon.com/gp/product/B000BK35M4/ref=amb_link_3938752_1/103-7676544-5510231';
    
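    $content = ''; // start with an empty buffer before appending to it
    $file = fopen($url,'r');
    while($cont = fread($file,1024657)){
        $content .= $cont;
    }
    fclose($file); // close the handle opened above
    preg_match('/<b>Price:<\/b> <b class="price">\$(.*)<\/b>/Us',$content,$returns);
    // $returns[1] holds the captured price, without the dollar sign
    //print_r($returns);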
    
    PHP:
    Untested, should work.
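
For a price like 1,645.00 the captured text still has the thousands separator in it. A rough sketch of stripping everything except digits and the decimal point, using preg_replace() rather than ereg_replace() (which was removed in PHP 7). The helper name `to_number` is made up, and it assumes a US-style price with "." as the decimal point.

```php
<?php
// Hypothetical helper: keeps only digits and the decimal point,
// then converts what is left to a float. Returns null if nothing is left.
function to_number($text)
{
    $clean = preg_replace('/[^0-9.]/', '', $text);
    return $clean === '' ? null : (float) $clean;
}

var_dump(to_number('Price: $1,645.00')); // float(1645)
```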

    Peace,
     
    Barti1987, Dec 23, 2006 IP
  13. papa_face

    #13
    Thanks, that works perfectly for Amazon. But I don't just want it for Amazon. I want to search for something on a page and have it display a number. For instance, I would like it to search through my profile on here for the total number of posts I have, then echo that number out to the browser.
     
    papa_face, Dec 24, 2006 IP
  14. Barti1987

    #14
    That wouldn't be possible unless you know the page and the position (place) where the number is.

    Take this page for example, how many numbers are here? I would say 10 numbers min.

    Peace,
     
    Barti1987, Dec 24, 2006 IP
  15. papa_face

    #15
    papa_face, Dec 24, 2006 IP
  16. Barti1987

    #16
    Yes you can.

    But your question was how to do it for ANY page. For a VB profile page, you would use this regular expression:

    
    '/Total Posts: <strong>(.*)<\/strong>/Us'
    
    PHP:
    As you can see, it's pretty easy; practice it a couple of times and you'll be good.
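
A quick sketch of that pattern in use. The sample markup is an assumption about what the profile page contains; the real page would be fetched first, for example with file_get_contents().

```php
<?php
// Sample markup standing in for the fetched profile page.
$content = '<td>Total Posts: <strong>2,237</strong></td>';

if (preg_match('/Total Posts: <strong>(.*)<\/strong>/Us', $content, $returns)) {
    // $returns[0] is the whole match, $returns[1] is the captured number.
    $posts = $returns[1];
} else {
    $posts = null; // the label was not found on the page
}

echo $posts; // 2,237
```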

    Peace,
     
    Barti1987, Dec 24, 2006 IP