Reading XML files

Discussion in 'PHP' started by grutland, Apr 28, 2010.

  1. #1
    Hi,

    I'm currently running a site that depends largely on XML files, and it requires a lot of mapping.
    In other words, I have to match the XML fields to the correct fields in my database.

    The problem I am having is that some of the XML files are fairly big.
    Is there a way to load only the first item of an XML file, or to skip the data and read just the structure of the file?

    I'm currently using SimpleXML to load the file, but I'm open to other classes if they are better.

    Cheers
     
    grutland, Apr 28, 2010 IP
  2. bytes

    bytes Peon

    #2
    bytes, Apr 28, 2010 IP
  3. nabil_kadimi

    nabil_kadimi Well-Known Member

    #3
    SimpleXML is not practical for big files because it is a tree-based parser: it loads the whole file into memory before processing it.

    XMLReader takes a different approach (it's a stream-based parser that reads one node at a time), so you should get the job done easily with XMLReader.

    See http://www.ibm.com/developerworks/xml/library/x-xmlphp2.html for many examples.
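
    As a rough illustration of the streaming approach, here is a minimal sketch that reads only as far as the first item and then stops. The function name firstItemFields and the element name passed as $itemName are hypothetical; adjust them to whatever the real feed uses.

```php
<?php
// Hypothetical helper: returns the child field names of the first
// <$itemName> element without parsing the rest of the file.
function firstItemFields(string $path, string $itemName): array
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    $fields = [];
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $itemName) {
            // Only this single element is turned into a SimpleXML tree.
            $item = simplexml_load_string($reader->readOuterXml());
            if ($item !== false) {
                foreach ($item->children() as $child) {
                    $fields[] = $child->getName();
                }
            }
            break; // stop here; the remainder of the file is never read
        }
    }

    $reader->close();
    return $fields;
}
```

    Because the loop breaks at the first match, the cost depends on how far into the file that item sits, not on the total file size.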
     
    nabil_kadimi, Apr 28, 2010 IP
  4. grutland

    grutland Active Member

    #4
    Well, one of the XML files I am loading is 400-odd MB, and I have no control over it.
    All I want to do is load its structure, but that takes ages.

    I have some automated scripts that load the file fully and use all the data, but this part is meant to run in a browser, so it needs to be fairly quick.
    Trying to load a 400 MB file crashes the browser.

    Any suggestions?
     
    grutland, Apr 28, 2010 IP
  5. nabil_kadimi

    nabil_kadimi Well-Known Member

    #5
    Normally, the structure is defined in the DTD.
    The service or site that provides the XML file should provide the DTD too, IMHO.

    For public display of the data, I suggest the following (quite common) approach:
    * Parse the file(s) on the server and split them into separate DB rows (one DB row per indivisible item)
    * When a visitor requests data, retrieve it from the DB, which is fast thanks to indexes, optimization, etc.
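
    The server-side parsing step above could be sketched roughly like this, assuming each item is a flat element whose children map to DB columns. The function name streamItems and the element name are hypothetical, and the actual insert is left to the caller.

```php
<?php
// Hypothetical helper: yields each <$itemName> element as an associative
// array (child name => text), one at a time, so memory use stays small.
function streamItems(string $path, string $itemName): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    try {
        // Walk forward to the first matching element.
        $found = false;
        while ($reader->read()) {
            if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $itemName) {
                $found = true;
                break;
            }
        }

        while ($found) {
            if ($reader->nodeType === XMLReader::ELEMENT) {
                $item = simplexml_load_string($reader->readOuterXml());
                if ($item !== false) {
                    $row = [];
                    foreach ($item->children() as $child) {
                        $row[$child->getName()] = (string) $child;
                    }
                    yield $row;
                }
            }
            // Jump to the next element with this name, skipping the subtree.
            $found = $reader->next($itemName);
        }
    } finally {
        $reader->close();
    }
}
```

    Each yielded row can then be bound to a prepared INSERT statement (for example with PDO) inside a foreach loop, ideally from a cron job or CLI script rather than during a page request.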
     
    nabil_kadimi, Apr 28, 2010 IP
  6. grutland

    grutland Active Member

    #6
    That was an option I considered in the past, but with SimpleXML loading the entire file it didn't seem very practical.
    I've just been playing with XMLReader and it seems a lot faster because I can "break" out of the read loop at any given point, so I could potentially do this now.

    There is no DTD though as far as I am aware.
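
    With no DTD available, one way to get the structure is to sample it: read a limited number of elements, record each distinct element path, then break. A minimal sketch, where the function name sampleStructure and the default limit of 1000 elements are assumptions:

```php
<?php
// Hypothetical helper: returns the distinct element paths found among the
// first $maxElements elements, then stops reading.
function sampleStructure(string $path, int $maxElements = 1000): array
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    $stack = []; // names of the elements that are currently open
    $paths = []; // distinct paths seen so far, stored as array keys
    $seen  = 0;

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            $stack[] = $reader->name;
            $paths['/' . implode('/', $stack)] = true;
            if ($reader->isEmptyElement) {
                array_pop($stack); // self-closing tags have no end element
            }
            if (++$seen >= $maxElements) {
                break; // enough of a sample; skip the rest of the file
            }
        } elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
            array_pop($stack);
        }
    }

    $reader->close();
    return array_keys($paths);
}
```

    The result is only as complete as the sample: fields that first appear later in the file will be missed, so the limit is a trade-off between speed and coverage.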
     
    grutland, Apr 28, 2010 IP