Import big xml to oscommerce or mysql database

Posted by shadow_of__soul | Posted in Uncategorized | Posted on 02-10-2009

0

Hi,

recently i have needed to do a script that parse a big xml file and insert it to a oscommerce database. i wanted to write this post why can be useful if you are doing something like this and maybe with other CMS system or database schema :)

the script no only need to insert the products, also download the images, check if the product already ,xist and update the product with the info of the xml. i don’t going to describe the trivial process of check for the product, make the queries etc.. if you are seeing this post you are run with some of the 2 essential problem:

1)out of memory error

2)session timeout, php execution time limit or 500 internal error

so, here are the soltuion for your problems :)

1) out of memory error

this it’s happen why surely you are usng the xml_parse() to parse the xml. this function always going to give you a out of memory error why need to load all the document in the memory to use it.

the solution it’s simple, use XMLReader this going to allow to load all the xml file and let you review each element of the xml. the documentation it’s very clear of how to use it and with some of testing, in a few hours you should have working your parse function :) . the only “problem” can be it’s that’s only available from php version biggers that 5.1.2, i don’t think so but maybe you can run with some problem in some host :)

2)session timeout, php execution time limit or 500 internal error

well, this error can be fixed but going to be you more effort and ugly code but i didn’t thinked any other posible solution for this :s

if you have a php execution limit error, simply add: set_time_limit(0); this this going to let you run the script forever. of course this isn’t safe, why if you have a bug or something, can be running forever and consumming so much memory than you going to have your shosting account suspended, so be sure to test everything before you run it live :)

after you set the execution time to 0, if you still have a timeout error or a 500 internal error it’s surely happening by the mysql queries. the xml parse it’s very fast with XMLReader but the real time take the mysql queries and if you are using oscommerce and need to check too much thing going to be a nightmare :P

the only solution i come up for this, was to re-execute the script till i procesed all the xml. basically i have parse the xml into a multidimensional array like this:

$rows[$rownumber][elements]

so, i process all the xml (in this time i processed a 3MB xml) to a array, send it to a function that update the DB and processing it in 200 rows parts. after the part finished, i keep the number of rows processed and printed:

echo “processed…. $numberofrows”;

echo ‘<meta http-equiv=”refresh” content=”4;url=xmlimport.php?f=’.$filename2.’&cont=’.$continue.’&start=’.$finished.’” />’;

i used a meta and not a header() why with header it still crashed with a 500 internal error.

so, when the script load again in the function that parse the xml, i set:

if($_GET['cont']==true){

for($i=0;$i<$_GET['start'];$i++){

$xml->read();

}

}

with that, i have moved the pointer of the xmlReader object to the last inserted row, and then start to parse the rest. it continue till the rows to parse are none and the script stop :)

i hope this help someone to resolve this problem :)

Regards,

Shadow.

[Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]

Write a comment