My current project involves working with multiple large XML files.
The XML files are stored on my server and come from a fairly trustworthy source. I also have data coming in from outside sources that are reasonably trustworthy, but as always security is a concern: I would like to avoid vulnerabilities while not using too many resources/processes or too much memory. The overall application already runs through function after function before the page is finally processed.
My current plan of action:
Use variables that have already been set earlier in the application to search these XML files and display the relevant results. Based on the post linked below, I have decided to use XMLReader for a faster, less memory-hungry process (rough sketch just below the link).
- Store all the returned results for each record in a table, i.e.:
<record identifier> <record> <xml file 1 details> <xml file 2 details> <xml file 3 details> <etc.>
- The application will initially search my DB and then update it as needed.
http://blog.liip.ch/archive/2004/05/10/processing_large_xml_documents_with_php.html
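
For what it's worth, this is roughly how I picture the XMLReader pass. It's only a sketch: the file path and the <record>/id element and attribute names are placeholders for whatever the real feeds actually use.

```php
<?php
// Rough sketch of the XMLReader pass: stream the file instead of loading it
// all into memory at once. Element/attribute names here are placeholders.
$reader = new XMLReader();
$reader->open('/path/to/feed1.xml');

while ($reader->read()) {
    // Only act on opening <record> elements; everything else is skipped.
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'record') {
        $id = $reader->getAttribute('id');

        // Pull just this one record out as SimpleXML, so only a single
        // record is ever held in memory at a time.
        $record = simplexml_load_string($reader->readOuterXml());

        // ...compare $id against the variables that were already set and,
        // when it is relevant, pass $record on to the DB insert/update step...
    }
}

$reader->close();
```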
I have been toying with the idea of just creating a new table in MySQL for each XML file.
My roadblocks are:
- Multiple XML files need to be updated daily (at least)
The files so far haven't been any bigger than 8 MB, but still, 500k to 900k records' worth of checking whether each record exists, checking whether it needs updating, and then updating the records that do, would put a serious load on my server if I had to do it up to 10 times a day, wouldn't it? (There's a rough upsert sketch after this list.)
- Not to mention that each XML file probably contains at least 400k records that would never be queried in my application.
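
On the exists/needs-updating question, what I'm currently picturing is one table keyed on the record identifier, updated with a single INSERT ... ON DUPLICATE KEY UPDATE per record, so "does it exist?" and "does it need updating?" never have to be separate queries. This is only a sketch: the table, column, and connection details are all made up.

```php
<?php
// Sketch only: table, columns, and credentials below are invented.
//
// CREATE TABLE records (
//     record_id   VARCHAR(64) PRIMARY KEY,
//     source_file VARCHAR(64) NOT NULL,
//     details     TEXT        NOT NULL
// );

$pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->prepare(
    'INSERT INTO records (record_id, source_file, details)
     VALUES (:id, :source, :details)
     ON DUPLICATE KEY UPDATE details = VALUES(details)'
);

// Example values standing in for what the XMLReader loop would supply.
$recordId      = 'ABC123';
$recordDetails = '<record id="ABC123">...</record>';

// One execute per record; MySQL reports 0 affected rows when the stored
// values already match, so unchanged records cost very little.
$stmt->execute([
    ':id'      => $recordId,
    ':source'  => 'feed1.xml',
    ':details' => $recordDetails,
]);
```

That way the daily refresh would be one prepared statement executed in a loop per file, rather than a separate SELECT-then-UPDATE for every record, but I'd still like to know whether that holds up at 500k to 900k records several times a day.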
Any thoughts before I get too far involved? Please remember I am a noob "weekend warrior" type, so I need reasons to better understand what I am doing.