Parsing Text with PHP or JavaScript

Hi,

I have a long text document that I want to break into records and add to a MySQL table. The document is not CSV, just plain text. The structure is something like this:

Words in capital letters form the primary heading for each topic in the document. Each topic is preceded by a carriage return and line feed (char 13 and char 10). In some cases, the topic begins with a space character. Other capitalized words appear throughout the text, and they may also be preceded by a carriage return and line feed. I can identify these in the code and exclude them as topic breaks: they have names such as SUMMARY, REFERENCES, etc., so the code could exempt them as topics.

I want to create MySQL records consisting of two fields: the Topic (char 200) and the text, which includes everything up to the next Topic.

Here is a basic layout, where C means a capital letter, t = text, P = a carriage return/line feed, s = space, and S = an occurrence of the word SUMMARY. Text can continue for many lines, and a carriage return/line feed separates paragraphs.

CCCCCCC, CCCCC, ttttt
tttttttt
tttttttt
ttttttt P

ttttttt
ttttttt
SUMMARY
ttttttt
ttttttt P

sCCCCCCC, CCCCC, ttttt

I would much appreciate some example code or other instruction on how to parse this text.

Thanks greatly,

Lyle

Hi Lyle,

It would definitely be better to do this on the back end with PHP. One strategy would be to split the string into an array.
You can iterate through the initial array to make a multidimensional array. Then loop through the array to insert the data into the database. I can’t be more specific without seeing the data.
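
To make the split-and-iterate strategy concrete, here is a minimal sketch under some assumptions. It is written in JavaScript (the other language named in the thread title, runnable under Node.js); the same steps should map onto PHP's preg_split() and preg_match(). The function name parseTopics, the EXEMPT list, and the HEADING pattern are all hypothetical. In particular, the sketch assumes a topic line starts (after an optional space) with a run of words of two or more capital letters, and that the rest of that line belongs to the text field. Real data may need a different rule.

```javascript
// Capitalized headings that must stay inside the body text rather than
// start a new record. SUMMARY and REFERENCES come from the question;
// extend this list as needed.
const EXEMPT = new Set(["SUMMARY", "REFERENCES"]);

// Assumed heading shape: optional leading space, then words of two or
// more capital letters separated by spaces and/or commas.
const HEADING = /^ ?([A-Z]{2,}(?:[ ,]+[A-Z]{2,})*)/;

function parseTopics(text) {
  const records = [];
  let current = null;

  // Step 1: split the document into an array of lines (CRLF or LF).
  for (const line of text.split(/\r?\n/)) {
    const match = HEADING.exec(line);

    if (match && !EXEMPT.has(match[1])) {
      // Step 2: a new topic starts here. Keep at most 200 characters
      // to fit the char(200) Topic field.
      current = {
        topic: match[1].slice(0, 200),
        // The remainder of the heading line is the start of the text.
        text: line.slice(match[0].length).replace(/^[ ,]+/, ""),
      };
      records.push(current);
    } else if (current) {
      // Any other line (including SUMMARY) belongs to the current topic.
      current.text += "\n" + line;
    }
    // Lines before the first topic are ignored.
  }

  // Step 3: tidy up surrounding whitespace in each text field.
  for (const record of records) {
    record.text = record.text.trim();
  }
  return records;
}
```

Each element of the returned array has a topic and a text property, so a final loop can insert them into the table one row at a time (with a prepared statement, to keep quotes in the text from breaking the query). Note that any ordinary line beginning with an all-capitals word, such as an acronym, would be treated as a topic by this rule unless it is added to the exempt list.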

Michael

Hi Michael,

Thank you, and thanks also for the link.

Lyle

You're welcome. I hope this helps.

E

After you have parsed the string into individual array elements you may find the ctype_upper() function useful.

http://www.php.net/manual/en/function.ctype-upper.php
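
For anyone doing this step in JavaScript instead, a rough analogue of that check might look like the sketch below. The names isAllUpper and leadingCapsWords are hypothetical, and the pattern only covers the letters A-Z, which is what ctype_upper() accepts in the default C locale. Like ctype_upper(), it rejects an empty string.

```javascript
// Rough analogue of PHP's ctype_upper(): true only when the string is
// non-empty and consists solely of the letters A-Z.
function isAllUpper(word) {
  return /^[A-Z]+$/.test(word);
}

// Hypothetical helper: collect the words at the start of a line for as
// long as they are all-uppercase. An empty result means the line does
// not begin with a capitalized heading.
function leadingCapsWords(line) {
  const caps = [];
  for (const word of line.trim().split(/[ ,]+/)) {
    if (!isAllUpper(word)) break;
    caps.push(word);
  }
  return caps;
}
```

Calling leadingCapsWords on each array element gives the candidate topic words; an element whose result is just SUMMARY or REFERENCES can then be skipped as a topic break.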