All the examples I've seen so far use PHP's XML parser. The basic idea is to load the XML file into memory, build one huge string out of it, and hand that to the parser. $data = implode("", file($filename)); means you have to read the entire file into memory, so you have to change the php.ini settings to get the extra memory. For files larger than 16 MiB, processing takes a long time, so you also have to change php.ini to increase the allowed execution time. Is there any better solution? For instance, an XML parser that will not load the entire file into memory?
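To make the pattern concrete, here's a minimal sketch of what those examples boil down to (the handler functions and the file name are placeholders, not any particular library's example):

```php
<?php
// Sketch of the memory-hungry pattern: the whole file is read
// into one string before a single byte is parsed.
function startElement($parser, $name, $attrs) { /* handle open tag */ }
function endElement($parser, $name)           { /* handle close tag */ }

$parser = xml_parser_create();
xml_set_element_handler($parser, 'startElement', 'endElement');

$data = implode("", file('big.xml')); // entire file in memory here
xml_parse($parser, $data, true);
xml_parser_free($parser);
```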
I can't think of a better way to do it, but you don't have to edit the .ini file. You can add set_time_limit(0); to your script and it will never time out.
You are right, it will not time out, but it may use too many resources on my shared hosting account. Another way to solve the problem is to write an XML parser that does not read the entire file into memory (I already wrote one). I was curious to know whether there is another way, or maybe a better XML parser. I just wrote mine, but I always think: what if someone else did it better?
cURL will load the whole file into memory too. The only way of avoiding that is using fopen(), fgets(), and fclose(). The only problem I see with this is: what if a single tag spans multiple lines? E.g.:

```xml
<tag>some data here
some more data</tag>
```

The only way of parsing those would be to load the whole file into memory; otherwise the read might stop in the middle of the tag and it wouldn't parse correctly. Also, $data = implode("", file($filename)); is a method from a long time ago, before file_get_contents() existed. Nowadays I suggest not doing this, especially for 16 MB files. file() reads the whole file into memory and then splits it by newlines. How many lines could a 16 MB file have? A lot... and then implode() joins that giant array, which is a lot of stress for the server too. Just use file_get_contents(), which does the same as both functions together, only a lot faster.
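To illustrate the difference (the file name is a placeholder):

```php
<?php
// The old pattern: file() splits 16 MB into a huge array of lines,
// then implode() glues it back together -- two copies of the data.
$data = implode("", file('big.xml'));

// The modern equivalent: one call, one copy, much faster.
$data = file_get_contents('big.xml');
```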
In my case I'm lucky: I know for a fact that the XML file I'll parse has every closing tag on the same line as its opening tag. A general solution could be:

- step 1: replace every newline with something unique (like lkj2o793345l3)
- step 2: read 8 KB at a time, or whatever is enough to cover an entire <tag>...</tag>
- step 3: replace lkj2o793345l3 back with newlines, where necessary

I believe a line-based read like fgets() stops at a newline, so the method above tries to prevent a read from ending in the middle of a tag (see the sketch below). What do you think?
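A rough sketch of what I mean, skipping the token pass and just buffering chunks until a complete element shows up. The element name 'item' and the helper extract_element() are made up for illustration, and this assumes bare tags of that name that don't nest:

```php
<?php
// Pull one complete <item>...</item> out of the buffer, or return
// null if the element isn't complete yet. Assumes bare, non-nested
// tags of this name.
function extract_element(&$buf, $name) {
    $open  = "<$name>";
    $close = "</$name>";
    $start = strpos($buf, $open);
    $end   = strpos($buf, $close);
    if ($start === false || $end === false) {
        return null; // keep buffering
    }
    $elem = substr($buf, $start, $end + strlen($close) - $start);
    $buf  = substr($buf, $end + strlen($close));
    return $elem;
}

$fp  = fopen('big.xml', 'rb'); // placeholder path
$buf = '';
while (!feof($fp)) {
    $buf .= fread($fp, 8192); // fixed-size chunks, newlines irrelevant
    while (($elem = extract_element($buf, 'item')) !== null) {
        echo $elem, "\n"; // process one complete element
    }
}
fclose($fp);
```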
You can use fgets(); it'll read one line from the given resource. Then you could use a regular expression to grab whatever you need. Something like:

```php
// capture the tag name in group 1 so the backreference \1
// matches the correct closing tag; group 2 is the content
if (preg_match('~<(\w+)[^>]*>(.*?)</\1>~s', $line, $match)) {
    echo $match[2];
}
```
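In context, a complete sketch would look something like this (the path is a placeholder, and it assumes each element opens and closes on one line, as ForumJoiner said his file does):

```php
<?php
// Sketch: scan a big file line by line; memory use stays at one line.
$fp = fopen('big.xml', 'rb'); // placeholder path
while (($line = fgets($fp)) !== false) {
    if (preg_match('~<(\w+)[^>]*>(.*?)</\1>~s', $line, $match)) {
        echo $match[2], "\n"; // element content
    }
}
fclose($fp);
```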
Sorry to bump into your thread, ForumJoiner, but I have been trying to understand parsing XML with PHP. Can you provide some help on this: which tools do you use to parse XML with PHP? I would really appreciate it. The topics I'm trying to learn are parsing XML with PHP, XSLT, PEAR, and SOAP (this is for Amazon Web Services; I'm trying to learn which tools are best for the job). Please reply here or PM me. Any help would be greatly appreciated. Regards, Andy.
Actually, it's parsing XML using PHP. More details here: http://www.php.net/manual/en/ref.xml.php That page will give you the functions and some examples. This thread is about optimizing: the examples above load the entire file into memory before parsing. I wanted a faster, less memory-hungry option, and I'm still looking for it.
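One lead I'm looking into: xml_parse() takes a third is_final argument, so the expat-based parser can be fed chunks from fread() instead of one giant string. A rough sketch (the handlers and path are placeholders):

```php
<?php
// Sketch: feed xml_parse() 8 KiB chunks; expat keeps only its own
// parse state in memory, not the whole document.
function startElement($parser, $name, $attrs) { /* placeholder */ }
function endElement($parser, $name)           { /* placeholder */ }

$parser = xml_parser_create();
xml_set_element_handler($parser, 'startElement', 'endElement');

$fp = fopen('big.xml', 'rb'); // placeholder path
while (!feof($fp)) {
    $chunk = fread($fp, 8192);
    if (!xml_parse($parser, $chunk, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)));
    }
}
fclose($fp);
xml_parser_free($parser);
```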