PHP Based Site Scanner?

Discussion in 'PHP' started by Doctor, Jul 12, 2007.

  1. #1
    Hi There,
    I'm looking for a script that can automatically search a number of (pre-defined) websites for a given term (say, a name pulled from a database).
    Are there any such scripts available on the interweb? (It doesn't even have to do everything, as it could be used only as a starting point.)

    Or, if none exists already, is there any advice you could give me on making one of my own?

    Thanks

    Doctor
     
    Doctor, Jul 12, 2007 IP
  2. ecentricNick

    ecentricNick Peon

    Messages:
    351
    Likes Received:
    13
    Best Answers:
    0
    Trophy Points:
    0
    #2
    ecentricNick, Jul 12, 2007 IP
  3. Doctor

    Doctor Peon

    Messages:
    3
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Would I be able to set up a script that uses Google to scan dozens of sites for hundreds of terms at regular intervals?

    (Obviously, the timing could be handled with a cron job.)

    Regardless, thanks for your speedy reply.

    Doctor
     
    Doctor, Jul 12, 2007 IP
  4. 4GU.RU

    4GU.RU Member

    Messages:
    60
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    41
    #4
    site:.com casino
     
    4GU.RU, Jul 12, 2007 IP
  5. ecentricNick

    #5
    Well, the advantage of going via a search engine is that you don't have to crawl all the sites yourself, which:

    a) Is going to make the process much easier
    b) Isn't going to leave a trail in the site's server logs that you're crawling their site

    But, on the downside:
    a) I suspect Google will block you unless you limit the frequency of your requests
    b) You'll only know of the term if it's on a page Google (or whichever engine you use) indexes

    So, it's a little swings and roundabouts.

    But yes, in theory, you can do that.

    You can drive it from a cron job, and then parse the results Google (or whatever) returns to double-check that your required term actually appears.
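
    The "double-check" step could be sketched roughly like this. It is only an illustration, not a finished scanner: the names pageContainsTerm and scanPages, the $fetch callable and the one-second default delay are all assumptions made up for the example. The fetcher is passed in so the logic can be tried without touching the network.

```php
<?php
declare(strict_types=1);

/**
 * Return true if $term appears (case-insensitively) in the text of $html.
 * Tags are stripped first, so matches inside tag names or attribute
 * values are ignored. Note that strip_tags() keeps the contents of
 * <script> and <style> elements.
 */
function pageContainsTerm(string $html, string $term): bool
{
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    return $term !== '' && stripos($text, $term) !== false;
}

/**
 * Check every URL for every term.
 * $fetch is a hypothetical callable taking a URL and returning the page
 * body as a string (anything else is treated as a failed fetch).
 * Returns [url => [terms found on that page]].
 */
function scanPages(array $urls, array $terms, callable $fetch, int $delaySeconds = 1): array
{
    $hits = [];
    $first = true;
    foreach ($urls as $url) {
        if (!$first && $delaySeconds > 0) {
            sleep($delaySeconds); // pause between requests to avoid hammering anyone
        }
        $first = false;

        $html = $fetch($url);
        if (!is_string($html)) {
            continue; // skip pages that could not be fetched
        }
        foreach ($terms as $term) {
            if (pageContainsTerm($html, $term)) {
                $hits[$url][] = $term;
            }
        }
    }
    return $hits;
}
```

    In real use the fetcher could simply be 'file_get_contents' (or a cURL wrapper), with the URL and term lists loaded from your database and the script itself started by cron.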
     
    ecentricNick, Jul 12, 2007 IP
  6. Doctor

    #6
    Well, the plan I had was to search the "latest news" pages for news about football teams, businesses and the like, and also to count how often they appear.

    But as you said, I'd be limited to pages actually indexed by the search engine.
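
    For the frequency part, something along these lines might do as a starting point. It is a rough sketch only: countTerms is a made-up name, and it uses plain substring matching, so a term also counts when it appears inside a longer word.

```php
<?php
declare(strict_types=1);

/**
 * Count case-insensitive occurrences of each term in the text of $html.
 * Returns [term => count], in the same order as $terms.
 */
function countTerms(string $html, array $terms): array
{
    $text = strtolower(strip_tags($html));
    $counts = [];
    foreach ($terms as $term) {
        $needle = strtolower($term);
        // substr_count() rejects an empty needle, so treat that as zero hits
        $counts[$term] = $needle === '' ? 0 : substr_count($text, $needle);
    }
    return $counts;
}
```

    The counts could then be written to a table keyed by date and term, so the numbers can be compared between runs.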

    Thanks for your help.
     
    Doctor, Jul 12, 2007 IP
  7. ecentricNick

    #7
    Again, I'd suggest that sounds like something you could get without much coding by using:

    Google Alerts
    The Yahoo Pipes thingy
    An RSS aggregator

    I'm not trying to be deliberately negative... just sounds like re-inventing the wheel to me?
     
    ecentricNick, Jul 12, 2007 IP
  8. Dakuipje

    Dakuipje Peon

    Messages:
    931
    Likes Received:
    15
    Best Answers:
    0
    Trophy Points:
    0
    #8
    Seems like you are just looking for RSS feeds.
    Most news sites have RSS feeds.
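
    If a site does offer a feed, checking it for a term is fairly short with PHP's SimpleXML extension. A minimal sketch, assuming an RSS 2.0 feed (channel/item layout); findInFeed is a hypothetical name, and Atom feeds would need different element names:

```php
<?php
declare(strict_types=1);

/**
 * Return the items of an RSS 2.0 feed whose title or description
 * mentions $term (case-insensitive). $xml is the raw feed body.
 * Each match is ['title' => ..., 'link' => ...].
 */
function findInFeed(string $xml, string $term): array
{
    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $feed = simplexml_load_string($xml);
    if ($feed === false || $term === '' || !isset($feed->channel->item)) {
        return []; // unparseable feed, nothing to look for, or no items
    }

    $matches = [];
    foreach ($feed->channel->item as $item) {
        $haystack = (string) $item->title . ' ' . strip_tags((string) $item->description);
        if (stripos($haystack, $term) !== false) {
            $matches[] = [
                'title' => (string) $item->title,
                'link'  => (string) $item->link,
            ];
        }
    }
    return $matches;
}
```

    The feed body itself could be fetched with file_get_contents or cURL before being passed in.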
     
    Dakuipje, Jul 13, 2007 IP