PHP Programmer Needed - Site Parsing

Discussion in 'PHP' started by highlands, Jul 5, 2008.

  1. #1
    Hi,

    I am looking for someone that can code a script that can do the following:

    Search Google for a keyword and then save all the URLs that the search returns into a MySQL database.

    For example, if I searched for the term "toys", all the URLs returned for that search would be stored. The script would have to let me specify how many pages of search results to go through.

    I would like the following information saved into the database:

    URL
    Date Added


    Is this something that could be done?
    Is there an easier way of doing this?

    If you could do this let me know and we can work something out :)

    EDIT:

    After looking about a bit, something like this would be what I am looking for, except not displaying the results, just dumping them into a database or even a flat text file:

    http://goohackle.com/scripts/google_parser.php
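
A minimal sketch of the "dump the URLs into a database" half of the request above, assuming the HTML of a results page has already been fetched by some other means. It is written in Python for brevity rather than PHP, and it uses SQLite from the standard library as a stand-in for MySQL. The table name `results`, its column names, and the helper names `extract_urls` and `save_urls` are all hypothetical. A real results page would likely need extra filtering to drop navigation and redirect links.

```python
import sqlite3
from datetime import datetime, timezone
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects absolute http(s) links found in <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip relative links and anchors without an href.
        if href and href.startswith(("http://", "https://")):
            self.urls.append(href)


def extract_urls(html):
    """Return the absolute links in `html`, de-duplicated, in page order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return list(dict.fromkeys(parser.urls))


def save_urls(conn, urls):
    """Store each URL with the date it was added; repeated URLs are ignored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "url TEXT PRIMARY KEY, date_added TEXT NOT NULL)"
    )
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO results (url, date_added) VALUES (?, ?)",
            [(url, now) for url in urls],
        )


if __name__ == "__main__":
    # Assumed sample markup, not a real search results page.
    sample = '<a href="https://example.com/toys">Toys</a> <a href="/local">x</a>'
    conn = sqlite3.connect(":memory:")
    save_urls(conn, extract_urls(sample))
    print(conn.execute("SELECT url, date_added FROM results").fetchall())
```

For the flat-text-file variant, `save_urls` could instead append one `url<TAB>date` line per result to a file.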
     
    highlands, Jul 5, 2008 IP
  2. Danltn

    Danltn Well-Known Member

    Messages:
    679
    Likes Received:
    36
    Best Answers:
    0
    Trophy Points:
    120
    #2
    I could do this but you need to keep in mind that Google will limit you to x amount of searches a day.

    As a believer in whitehatness, I will not be providing a solution (proxy or otherwise) that goes around this.

    Dan
     
    Danltn, Jul 5, 2008 IP
  3. highlands

    highlands Peon

    Messages:
    22
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    I will only be doing a handful of searches a day, no need for a proxy :)
     
    highlands, Jul 5, 2008 IP
  4. highlands

    highlands Peon

    #4
    Have edited original post
     
    highlands, Jul 5, 2008 IP
  5. anwaraa

    anwaraa Active Member

    Messages:
    167
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    90
    Digital Goods:
    2
    #5
    I also think it is not a good idea to be messing around with Google.
     
    anwaraa, Jul 6, 2008 IP
  6. Danltn

    Danltn Well-Known Member

    #6
    I've already done this for him.

    And anwaraa, please extend your post and post why.

    Dan
     
    Danltn, Jul 6, 2008 IP
  7. anwaraa

    anwaraa Active Member

    #7
    Google's webmaster guidelines prohibit automated scripts/software, and a site could get banned for this.
     
    anwaraa, Jul 11, 2008 IP
  8. BDazzler

    BDazzler Peon

    Messages:
    215
    Likes Received:
    5
    Best Answers:
    0
    Trophy Points:
    0
    #8
    They USED to have a SOAP API to do this with, but they dropped it without warning. :(

    Alexa API (which charges a small fee per search) will give you the same thing, but obviously without Google's algorithm.

    I wish I had the time; I'd say PM me for a quote, but I'm swamped right now. But if you don't need Google's precision, the Alexa search is pretty good. My Amazon Web Services fee is something like 1 cent per month.
     
    BDazzler, Jul 11, 2008 IP
  9. zerofill

    zerofill Member

    Messages:
    34
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    45
    #9
    Blah, I scrape them all the time and use their API... What? You think Google doesn't scrape?
     
    zerofill, Jul 11, 2008 IP
  10. BDazzler

    BDazzler Peon

    #10
    Of course they do, that's WHAT they do. They scrape and analyze, etc. They just take measures to keep us from doing the same thing ... and of course, they're the billionaires, and I'm still happy I got my car paid off.

    BUT, they will block scrapers if they detect them, and that makes scraping Google unreliable. So, I pay the penny to Amazon. It's easier and more reliable.
     
    BDazzler, Jul 11, 2008 IP