What's the best way to parse product (XML) feeds?

Discussion in 'PHP' started by Domainholder, Jun 5, 2009.

  1. #1
    Hi there,

    We would like to import product feeds (XML) from various affiliate networks into our database on a daily basis. What’s the best way to do this?

    We were thinking of downloading the feeds (some are around 50 MB) onto our server each day with a cron job. Then we can parse them and store the data in a MySQL database.

    Well, parsing these feeds isn’t going smoothly. Can you guys recommend a good parser (free or paid) that is capable of parsing product feeds? (There are plenty of parsers for RSS news feeds, but we are looking for an XML product feed parser.)

    PS: I just posted this message a few minutes ago and somehow it disappeared or got deleted.
     
    Domainholder, Jun 5, 2009 IP
  2. HorseGalleria

    HorseGalleria Peon

    #2
    HorseGalleria, Jun 5, 2009 IP
  3. jestep

    jestep Prominent Member

    #3
    How complicated are the actual feeds?

    I would definitely get them locally on the server, and then work with them from there. How difficult this is going to be is really dependent on the complexity of the feed.

    If they're fairly simple and standard, something like:

    <product>
    <title />
    <url />
    <price />
    </product>

    you could use SimpleXML to parse them easily.
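
    For a feed shaped like the product structure above, a minimal SimpleXML sketch could look like the following. The `<products>` wrapper element, the sample values, and the function name `parse_products_simplexml` are assumptions made for illustration; a real affiliate feed will have its own element names.

    ```php
    <?php
    // Hypothetical sample feed; the <products> root and the values are made up.
    $xml = '<products>'
         . '<product><title>Saddle</title><url>http://example.com/saddle</url><price>199.00</price></product>'
         . '<product><title>Bridle</title><url>http://example.com/bridle</url><price>49.50</price></product>'
         . '</products>';

    // Load the whole document with SimpleXML and return one array per <product>.
    function parse_products_simplexml(string $xml): array
    {
        // Collect parse errors internally instead of emitting PHP warnings.
        libxml_use_internal_errors(true);
        $feed = simplexml_load_string($xml);
        if ($feed === false) {
            throw new RuntimeException('Feed is not well-formed XML');
        }

        $rows = [];
        foreach ($feed->product as $product) {
            $rows[] = [
                'title' => (string) $product->title,
                'url'   => (string) $product->url,
                'price' => (float) $product->price,
            ];
        }
        return $rows;
    }

    $rows = parse_products_simplexml($xml);
    ```

    Note that this loads the whole document into memory at once, so it suits small feeds better than large ones.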

    One problem you may run into, unless you can figure out how to read the file line by line, is that you will need at least as much RAM as the size of the file in order to parse it. Also, if you set any variables while the script is looping, RAM usage will go up considerably. I've gotten PHP to parse files of a gigabyte or more, but even at 50 MB it may take some testing and tuning to get them to process cleanly.
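
    One way to avoid holding the whole file in RAM is a streaming parser. The sketch below uses PHP's XMLReader, which is not something named in this thread but reads the file node by node, and hands each `<product>` to SimpleXML one at a time. The element name `product`, the temporary sample file, and the function name `stream_products` are assumptions for illustration.

    ```php
    <?php
    // Stream a feed from disk and call $handle once per <product>.
    // Only one product is held in memory at any time.
    function stream_products(string $path, callable $handle): int
    {
        $reader = new XMLReader();
        if (!$reader->open($path)) {
            throw new RuntimeException('Cannot open feed: ' . $path);
        }

        // Advance to the first <product> element.
        while ($reader->read() && $reader->name !== 'product') {
            // keep reading
        }

        $count = 0;
        while ($reader->name === 'product') {
            // Turn just this one product into a SimpleXML object.
            $product = new SimpleXMLElement($reader->readOuterXml());
            $handle([
                'title' => (string) $product->title,
                'url'   => (string) $product->url,
                'price' => (float) $product->price,
            ]);
            $count++;
            // Jump to the next sibling <product>, skipping the current subtree.
            $reader->next('product');
        }

        $reader->close();
        return $count;
    }

    // Hypothetical sample file standing in for a downloaded feed.
    $path = tempnam(sys_get_temp_dir(), 'feed');
    file_put_contents(
        $path,
        '<products>'
        . '<product><title>Saddle</title><url>http://example.com/saddle</url><price>199.00</price></product>'
        . '<product><title>Bridle</title><url>http://example.com/bridle</url><price>49.50</price></product>'
        . '</products>'
    );

    $titles = [];
    $count = stream_products($path, function (array $row) use (&$titles) {
        // In a real import this is where the row would be written to the database.
        $titles[] = $row['title'];
    });
    unlink($path);
    ```

    Doing the database write inside the callback, rather than collecting every row in an array first, keeps memory use roughly flat regardless of feed size.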
     
    jestep, Jun 5, 2009 IP
  4. Domainholder

    Domainholder Peon

    #4
    Thanks for your feedback guys. I appreciate it :)
     
    Domainholder, Jun 8, 2009 IP
  5. Worldwidirectory

    Worldwidirectory Peon

    #5
    If you are not interested in the whole feed, but only in certain tags, preg_match_all can be faster.
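
    As a rough sketch of that idea, assuming a feed with plain `<price>` elements (no attributes, CDATA sections, or nested tags, any of which would break a pattern this simple):

    ```php
    <?php
    // Hypothetical sample feed; the values are made up.
    $xml = '<products>'
         . '<product><title>Saddle</title><price>199.00</price></product>'
         . '<product><title>Bridle</title><price>49.50</price></product>'
         . '</products>';

    // Capture the text between every <price> and </price> pair.
    // The "s" modifier lets "." match newlines; ".*?" keeps each match short.
    preg_match_all('~<price>(.*?)</price>~s', $xml, $matches);

    // $matches[1] holds the captured text for each match, in document order.
    $prices = array_map('floatval', $matches[1]);
    ```

    A regular expression does not check that the XML is well-formed, so this only makes sense when the feed layout is known and stable.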
     
    Worldwidirectory, Jun 8, 2009 IP
  6. givemeknol

    givemeknol Active Member

    #6
    Using SimpleXML?
     
    givemeknol, Jun 8, 2009 IP
  7. kamm

    kamm Active Member

    #7
    kamm, Jun 11, 2009 IP