Trying to get my head round efficient ways of storing product data from an advertiser in my local database, in PHP/MySQL. Say my advertiser has a LOT of products, like 100,000 or more, and I want them in my local database so I'm not calling the API all the time and so I can do other sorting and searching on it. Any thoughts on a good model for this? I figure a daily cron-job that will pull the results in from the API, but I am thinking that's going to be a good hunderd API calls if I go that route (max 1000 products per page of results I believe). So if I order a Product Export in XML which I can pull down daily and import, seems like a better bet. But that again seems like a massive overhead - getting a 3-4Mb gzipped file via http, unzipped, that can be 150 Mb! Processing all that from a file on the local server, and populating my database with it, even once a day, sounds like a lot of server time which my host could slap me for. First time I have really tried to do something with this much data in PHP/mySQL, so wonder if anyone has any pointers? P.S. On shared hosting for the time being...
Hello I would run some local tests to see what the load is like first before crushing the shared hosting server. I think the parsing will be much faster than you think. Also, check to see if the you and the shared host is running mysql-5.1 or 6, there is a great SQL command for this.. 'LOAD XML LOCAL INFILE' I wrote a couple of blog posts that could be useful for you. FTP’ing files from CJ’s FTP server to your host and Parsing XML CJ Product Feeds HTH hanji
Thanks for that, your blog posts look to be very much along the lines of what I am looking for! I also found an article about LOAD DATA, which I didn't know about before. Looks like a solution to me - now just to write it
You said you are on shared hosting. Most shared hosting servers have a maximum script execution time set to scripts. (most probably 30 seconds). You need to take this into consideration and make sure you set a limit for your script to execute. Else the script would break and the work would be incomplete.
P.S. In your second article about FTPing there is a dead link to the first article, just to let you know. Like the blog BTW, very useful, have subscribed to your feed.