What programs/scripts/languages are best for a web scraper?

Discussion in 'Programming' started by Melloweitsj, Jul 1, 2014.

  1. #1
    I run a betting website in a specific niche, and I am looking into adding a page where you can get the latest odds from different bookmakers (about 5 of them) and compare the odds between them.

    Now, I have next to no programming knowledge (though I do know HTML and CSS well) and no idea how one would go about creating such a program/script for my website, which is why I am looking to outsource it.

    But since I have no idea what goes into making such a feature either, I feel like I could easily go wrong when hiring a freelancer. I was hoping someone with experience in this kind of work could share how they would go about creating such a script: which languages, databases and programs to use, how long it would take, and how much it would cost.

    Let's say, for argument's sake, that I would like to scrape the odds for the cycling races at Bet365, Unibet, BetVictor, TitanBet and PinnacleSports. The program should check for updated odds every 30-60 minutes and update my odds page whenever a scrape completes.
     
    Melloweitsj, Jul 1, 2014
  2. PoPSiCLe

    PoPSiCLe Illustrious Member

    Messages:
    4,623
    Likes Received:
    725
    Best Answers:
    152
    Trophy Points:
    470
    #2
    I don't know the sites, but if they provide any kind of API for accessing their content, that would be the preferred way to go. If not, any server-side programming language can do this; PHP, for instance, can fetch the pages via cURL, run from a cron job on the server. That would probably not be too hard to do, though it depends a little on whether you want to store the information for future reference or just update the current odds. It also depends a bit on how serious you are, and how you feel about acquiring the information: is it legally available for you to download and use as you see fit?
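
    To make the fetch-on-a-schedule idea concrete, here is a minimal sketch in Python rather than PHP (a swapped-in language; the approach is the same). The names `fetch` and `poll`, the 30-minute default and the `rounds` parameter are assumptions made for illustration, not anything cURL or the bookmakers prescribe.

```python
import time
import urllib.request
from typing import Callable, Dict, Optional


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw body of one URL (the job cURL would do in PHP)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def poll(
    urls: Dict[str, str],
    handle: Callable[[str, bytes], None],
    fetcher: Callable[[str], bytes] = fetch,
    interval_seconds: float = 1800,  # assumed default: 30 minutes
    rounds: Optional[int] = None,    # None means "keep going forever"
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Fetch every URL, pass each body to `handle`, wait, then repeat."""
    completed = 0
    while rounds is None or completed < rounds:
        for name, url in urls.items():
            try:
                handle(name, fetcher(url))
            except OSError as error:
                # URLError and socket timeouts are OSError subclasses, so one
                # unreachable bookmaker does not stop the others.
                print(f"skipping {name}: {error}")
        completed += 1
        if rounds is None or completed < rounds:
            sleep(interval_seconds)
    return completed
```

    When the script is started by cron, calling `poll(..., rounds=1)` does a single pass and exits, leaving the timing to cron. Running it as a long-lived process with `rounds=None` is the alternative.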
     
    PoPSiCLe, Jul 2, 2014
  3. Melloweitsj

    Melloweitsj Well-Known Member

    Messages:
    123
    Likes Received:
    7
    Best Answers:
    0
    Trophy Points:
    108
    #3
    Thanks for the reply!

    Yes, most of them offer these to affiliates, so I can get access to this information. Pinnacle Sports has theirs open to everyone (access here: http://xml.pinnaclesports.com/pinnacleFeed.aspx). Is it easier to do it using APIs? What languages and programs would be used to create it then?

    Also, no need to store the information. Just scrape the odds at that time and paste the information into the odds comparison table on my site. It would be much like how www.oddsportal.com works, except that I would cover only one sport and a few bookmakers, though maybe with a few more bets added to compare.
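
    As a rough illustration of the comparison-table step, the sketch below (in Python) turns already-collected odds into an HTML table and bolds the best price per row. The function name `comparison_table`, the input shape (bookmaker name mapped to selection and decimal price) and the sample figures are all assumptions for the example, not real odds or anyone's actual format.

```python
from html import escape
from typing import Dict


def comparison_table(odds_by_bookmaker: Dict[str, Dict[str, float]]) -> str:
    """Render one row per selection and one column per bookmaker."""
    bookmakers = sorted(odds_by_bookmaker)
    # Every selection offered by at least one bookmaker gets a row.
    selections = sorted({s for odds in odds_by_bookmaker.values() for s in odds})

    header = "".join(f"<th>{escape(b)}</th>" for b in bookmakers)
    rows = [f"<tr><th>Selection</th>{header}</tr>"]

    for selection in selections:
        prices = [odds_by_bookmaker[b].get(selection) for b in bookmakers]
        best = max(p for p in prices if p is not None)
        cells = []
        for price in prices:
            if price is None:
                cells.append("<td>-</td>")  # bookmaker does not offer this bet
            elif price == best:
                cells.append(f"<td><strong>{price:.2f}</strong></td>")
            else:
                cells.append(f"<td>{price:.2f}</td>")
        rows.append(f"<tr><td>{escape(selection)}</td>{''.join(cells)}</tr>")

    return "<table>\n" + "\n".join(rows) + "\n</table>"


# Made-up odds, purely to show the input shape.
print(comparison_table({
    "Unibet": {"Rider A": 2.5, "Rider B": 4.0},
    "Bet365": {"Rider A": 2.6},
}))
```

    Since nothing is stored between runs, the generated table can simply overwrite the previous one each time the odds are refreshed.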
     
    Melloweitsj, Jul 2, 2014
  4. PoPSiCLe

    PoPSiCLe Illustrious Member

    #4
    I see. If all you need is to show this in a comparison table, then simply pulling the data and storing it in an XML file or similar would probably work. APIs will usually let you get the information you need simply by polling a URL (or a specific connection), i.e. there is no need to really code much: just pull the information they provide and parse it. If you pull the information via cURL, for instance, you might need to code your own parser to ensure you get the correct data, as the total data will probably be more than you need.
    So yes, APIs are the way to go (or simple feed links, like the one you provided above).
    It can still be built in almost any programming language you can think of; it just needs a way to fetch and parse the information the sites provide.
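
    The "pull it, then parse out only what you need" step might look roughly like this in Python, using the standard library's XML parser. The feed layout here (`event` and `outcome` elements with `sport`, `name`, `selection` and `price` attributes) is invented for the example and is not the real Pinnacle Sports format, so the element names would have to be adapted to whatever each bookmaker actually sends.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# Hypothetical feed: it contains more than we need, as real feeds usually do.
SAMPLE_FEED = """
<feed>
  <event sport="Cycling" name="Tour de France">
    <outcome selection="Rider A" price="2.50"/>
    <outcome selection="Rider B" price="4.00"/>
  </event>
  <event sport="Tennis" name="Some Open">
    <outcome selection="Player X" price="1.80"/>
  </event>
</feed>
""".strip()


def parse_odds(xml_text: str, sport: str = "Cycling") -> Dict[Tuple[str, str], float]:
    """Return {(event name, selection): decimal price} for one sport only."""
    root = ET.fromstring(xml_text)
    odds = {}
    for event in root.iter("event"):
        if event.get("sport") != sport:
            continue  # skip everything that is not the sport we care about
        for outcome in event.iter("outcome"):
            key = (event.get("name"), outcome.get("selection"))
            odds[key] = float(outcome.get("price"))
    return odds


print(parse_odds(SAMPLE_FEED))
```

    The same filtering idea applies in any language with an XML library; only the element and attribute names change per feed.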
     
    PoPSiCLe, Jul 2, 2014
  5. ThePHPMaster

    ThePHPMaster Well-Known Member

    Messages:
    737
    Likes Received:
    52
    Best Answers:
    33
    Trophy Points:
    150
    #5
    If you are looking for the fastest option, then a compiled language is the way to go, IMO.
     
    ThePHPMaster, Jul 5, 2014