RSS and MySQL: Does this make any sense?

LongHaul Peon

#1

I have an affiliate program in which a company offers an RSS feed of their products.

I want to do some stuff with a searchable web site, so this is what I'm thinking:

1. Get the RSS feed and convert (parse?) it into a MySQL database.
2. Do the search with PHP and stuff.

I'm OK with PHP and MySQL; it's the RSS (and step 1) that is new to me. Does the above make sense? I haven't actually seen the RSS file yet, so I don't even know what it looks like! Is it just text?

Thanks.

LongHaul, Dec 25, 2006 IP

oziman Active Member

#2

Makes sense.

I think the feed comes as XML, so you'll have to parse the XML and load it into the DB.

PHP has a built-in XML parser.
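
For what it's worth, here is a minimal sketch of that parsing step using SimpleXML, which ships with PHP 5 and later. The sample feed and the parseRssItems name are made up for illustration; a real feed would be downloaded from the affiliate's URL and may carry different fields.

```php
<?php
// Hypothetical sample feed; a real one would be downloaded from the vendor.
$rss = <<<'XML'
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example products</title>
    <item>
      <title>Blue Widget</title>
      <link>http://example.com/widgets/blue</link>
      <description>A blue widget.</description>
    </item>
    <item>
      <title>Red Widget</title>
      <link>http://example.com/widgets/red</link>
      <description>A red widget.</description>
    </item>
  </channel>
</rss>
XML;

// Turn an RSS 2.0 string into a plain array of rows, ready for a database insert.
function parseRssItems($xmlString)
{
    $feed = simplexml_load_string($xmlString);
    if ($feed === false) {
        return array(); // not well-formed XML
    }
    $rows = array();
    foreach ($feed->channel->item as $item) {
        $rows[] = array(
            'title'       => (string) $item->title,
            'link'        => (string) $item->link,
            'description' => (string) $item->description,
        );
    }
    return $rows;
}

$items = parseRssItems($rss);
foreach ($items as $row) {
    echo $row['title'], ' => ', $row['link'], "\n";
}
```

Each row is a plain array of strings, so it can go straight into an INSERT statement.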

oziman, Dec 25, 2006 IP

Senpai IT Active Member

#3

RSS is a kind of XML document.
Have you ever seen XML before?

Senpai IT, Dec 25, 2006 IP

Monty Peon

#4

You can query an XML file with XQuery and XPath.
You can think of XQuery as being for XML what SQL is for a database.

Maybe, depending on what you want to do, you don't need to fill up a DB from your XML to extract data.
http://www.w3.org/XML/Query/
http://www.zend.com/php5/articles/php5-xmlphp.php
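
A rough sketch of the no-database route, assuming the feed is small enough to hold in memory. PHP's SimpleXML exposes XPath 1.0 through its xpath() method (XQuery itself is not built in, as far as I know). The sample feed and its price element are invented for the example.

```php
<?php
// Hypothetical feed with an invented <price> element.
$xml = <<<'XML'
<rss version="2.0">
  <channel>
    <item><title>Blue Widget</title><price>9.50</price></item>
    <item><title>Red Widget</title><price>24.00</price></item>
    <item><title>Green Gadget</title><price>12.25</price></item>
  </channel>
</rss>
XML;

$feed = simplexml_load_string($xml);

// Roughly: SELECT title FROM item WHERE title LIKE '%Widget%' AND price < 20
$matches = $feed->xpath('//item[contains(title, "Widget") and price < 20]/title');

$titles = array();
foreach ($matches as $node) {
    $titles[] = (string) $node;
}
print_r($titles);
```

Only "Blue Widget" satisfies both conditions here. This works for one-off lookups, but a searchable site over a large catalogue is probably still better served by a database with indexes.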

Monty, Dec 25, 2006 IP

LongHaul Peon

#5

Thanks for the tips.

I've run into a bit of a snag though: the RSS feeds offered by this particular website are themed, and cover only a small part of the advertiser's stock! They do me no good. (They're like top 50 sellers and things like that.) I want to have a search box on my own site, as well as be able to manipulate/style the product data to my liking. I've emailed the company about it; hopefully I'll get some reply.

But that brings me to another problem I've wondered about: is it possible to write some PHP or JavaScript that reads info from the pages of some site and stores it somehow?

I know it's possible, but is it way difficult? If it's ok with the website owner, could I somehow download information off web pages into a database?

Thanks for the tips guys!

LongHaul, Dec 27, 2006 IP

oziman Active Member

#6

Anything is possible.

What you're talking about is scraping.

I would suggest looking at the PHP functions preg_match and preg_replace, and searching Google or O'Reilly for regular expression matching.
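
To give an idea, here is a small sketch using preg_match_all, the variant that returns every match on a page rather than just the first. The HTML snippet and its class names are invented; a real pattern has to be written against the actual markup of the site being scraped.

```php
<?php
// Hypothetical product-listing markup; a real page would come from
// file_get_contents() or cURL.
$html = <<<'HTML'
<div class="product"><h2>Blue Widget</h2><span class="price">$9.50</span></div>
<div class="product"><h2>Red Widget</h2><span class="price">$24.00</span></div>
HTML;

// Capture group 1 is the product name, group 2 the numeric price.
$pattern = '#<h2>(.*?)</h2>\s*<span class="price">\$([0-9.]+)</span>#s';

// PREG_SET_ORDER gives one array per matched product.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$products = array();
foreach ($matches as $m) {
    $products[] = array(
        'name'  => html_entity_decode($m[1]),
        'price' => (float) $m[2],
    );
}
print_r($products);
```

The non-greedy (.*?) stops each capture at the first closing tag, so one product's name cannot swallow the rest of the page.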

As an aside, for parsing RSS feeds in PHP, I came across a class called MagPie. Makes it sooo much simpler.

Good Luck!

oziman, Dec 28, 2006 IP

v0rtex Peon

#7

As luck would have it I'm just writing a site like this at the moment, and stumbled across your post when Googling.

I'd highly recommend using the free Magpie library for downloading and parsing RSS. Find it on Google ("magpie rss"). No point reinventing the wheel.

You set up a script which pulls down the RSS feed and sticks it into MySQL, then create a cron job which executes the script at a set interval (e.g. hourly). Then your search engine on the front-end can do whatever it likes with the data.
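
A bare-bones sketch of such an import script, using SimpleXML and PDO rather than Magpie so it stays self-contained. Assumptions: the products table, its columns, and the sample feed are invented, and the sketch connects to an in-memory SQLite database so it runs without a server. For MySQL only the DSN (plus user and password) would change.

```php
<?php
// Assumption: SQLite in memory as a stand-in. A MySQL DSN would look like
// 'mysql:host=localhost;dbname=shop', passed with a user name and password.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE products (
    link  VARCHAR(255) PRIMARY KEY,
    title VARCHAR(255) NOT NULL
)');

// Store every <item> of an RSS 2.0 string; returns the number of items written.
function importFeed(PDO $db, $xmlString)
{
    $feed = simplexml_load_string($xmlString);
    if ($feed === false) {
        return 0; // not well-formed XML
    }
    // REPLACE overwrites the row with the same link, so reruns do not duplicate.
    $stmt = $db->prepare('REPLACE INTO products (link, title) VALUES (?, ?)');
    $count = 0;
    foreach ($feed->channel->item as $item) {
        $stmt->execute(array((string) $item->link, (string) $item->title));
        $count++;
    }
    return $count;
}

// Stand-in for the downloaded feed, e.g. file_get_contents($feedUrl).
$sample = '<rss version="2.0"><channel>'
    . '<item><title>Blue Widget</title><link>http://example.com/blue</link></item>'
    . '<item><title>Red Widget</title><link>http://example.com/red</link></item>'
    . '</channel></rss>';

importFeed($db, $sample);
importFeed($db, $sample); // a second cron run with the same feed

$total = (int) $db->query('SELECT COUNT(*) FROM products')->fetchColumn();
echo "rows stored: $total\n";
```

Keying the table on the link means the hourly run refreshes existing rows instead of piling up duplicates. The matching crontab entry would be something like "0 * * * * php /path/to/import.php", where the path is a placeholder.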

Web scraping (extracting data from HTML) is a whole different kettle of fish. It can be done using the cURL library and a lot of regular expression wizardry, but if the retailer changes the layout of their site you'll have to do the work all over again. It's quite unreliable, but it can be done.

v0rtex, Jan 1, 2007 IP

5starAffiliates Well-Known Member

#8

You may possibly be wasting lots of energy going in the wrong direction.
If they offer a product RSS feed then I bet they have a regular product datafeed they offer as CSV or XML. Wouldn't this be much easier?

KEY - if you are going to convert the data you need to be sure the affiliate link still tracks and the cookie is passed. Affiliate datafeeds already have your affiliate link embedded in each link. I've seen very few RSS feeds that have the affiliate links embedded.

5starAffiliates, Jan 2, 2007 IP

LongHaul Peon

#9

In the end, I just downloaded the tab-delimited file (163MB!) and wrote a simple PHP page to read it and store the info in the MySQL database. It was easy and fast and there were no problems! I did download MagPie but it seemed complicated, and my task was fairly simple anyway: read data from a text file, store data in a database. I guess that's what MagPie does, but I was put off by the large number of files and junk in the MagPie folder.
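
Roughly, that kind of import can look like the sketch below. It is an illustration, not the exact script: the three-column layout (sku, name, price) and the table are invented, and it uses an in-memory SQLite database through PDO so it runs anywhere. Reading one line at a time with fgets keeps memory use flat however big the file is.

```php
<?php
// Hypothetical three-column datafeed; the real file will have other columns.
$path = tempnam(sys_get_temp_dir(), 'feed');
file_put_contents(
    $path,
    "sku\tname\tprice\nA1\tBlue Widget\t9.50\nB2\tRed Widget\t24.00\n"
);

// Assumption: SQLite in memory as a stand-in for the MySQL connection.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE products (
    sku   VARCHAR(32) PRIMARY KEY,
    name  VARCHAR(255),
    price DECIMAL(10,2)
)');

$stmt = $db->prepare('INSERT INTO products (sku, name, price) VALUES (?, ?, ?)');

$fh = fopen($path, 'r');
fgets($fh);               // skip the header row
$db->beginTransaction();  // one transaction keeps a large import fast
$imported = 0;
while (($line = fgets($fh)) !== false) {
    $fields = explode("\t", rtrim($line, "\r\n"));
    if (count($fields) !== 3) {
        continue;         // skip malformed lines
    }
    $stmt->execute($fields);
    $imported++;
}
$db->commit();
fclose($fh);
unlink($path);

echo "imported $imported rows\n";
```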

I found the screen-scraping to be fairly easy too, though you have to be careful not to scrape too many pages in a short time; I got my IP blocked from the site for a couple of hours. I scrape info and store it in the database. If the vendor changes the page layout in the future, I can just adjust the script, and the old info will still be good.
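
One simple way to avoid that kind of block is to pause between requests. A minimal sketch, assuming a fixed delay is enough: the fetchPolitely name, the URLs, and the delay value are all made up, and the fetcher is passed in as a callback so the example runs offline with a fake one.

```php
<?php
// Fetch each URL in turn, pausing between requests so the site is not hammered.
// $fetch is any callable that takes a URL and returns the page body.
function fetchPolitely(array $urls, $fetch, $delaySeconds = 2.0)
{
    $pages = array();
    $last = count($urls) - 1;
    foreach (array_values($urls) as $i => $url) {
        $pages[$url] = call_user_func($fetch, $url);
        if ($i < $last) {
            usleep((int) ($delaySeconds * 1000000)); // no pause after the final page
        }
    }
    return $pages;
}

// Fake fetcher for the demo; swap in 'file_get_contents' or a cURL wrapper.
$fake = function ($url) {
    return "<html>page for $url</html>";
};

$start = microtime(true);
$pages = fetchPolitely(
    array('http://example.com/p/1', 'http://example.com/p/2'),
    $fake,
    0.2
);
$elapsed = microtime(true) - $start;

printf("%d pages in %.1f s\n", count($pages), $elapsed);
```

What delay is acceptable depends on the site, so it is worth asking the owner.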

Thanks for the tips everyone, especially oziman for the preg_match_all stuff!

LongHaul, Jan 2, 2007 IP
