PHP: get text from a Word document (.doc)

Happy new year Guys!

I am looking to grab text from a Word document on the fly. I have seen that this is possible using the DOM package, but unfortunately the software will run on a Linux server, where this package will not be available.

Is there any other way to read Word (.doc) files into a string?

Thanks.

I think you mean COM instead of DOM?

Take a look at this thread, it has some interesting information: http://stackoverflow.com/questions/188452/reading-writing-a-ms-word-file-in-php

Yes, sorry, I meant COM :)

I have seen most of the solutions talked about on there…

  1. Antiword - problem is this has to be installed server side and could prove to be a problem if we decide to distribute this software further down the line.

  2. phpLiveDocx - the problem with this solution is that Zend Framework has to be installed server side, and it also seems to be a very big download (around 140 MB).
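
For reference, the Antiword route from option 1 can be sketched in a few lines of PHP. This is only a rough sketch: it assumes the antiword binary is installed and on the server's PATH, that shell_exec() has not been disabled in php.ini, and that the server has a Unix-style shell. The function name docToText is made up for illustration.

```php
<?php
// Sketch: pull plain text out of a legacy .doc file by shelling out to antiword.
// Assumptions: antiword is installed, shell_exec() is enabled, Unix-style shell.
// docToText() is a hypothetical helper name, not part of any library.
function docToText(string $path, string $antiword = 'antiword'): ?string
{
    // Bail out early if the file is missing or unreadable.
    if (!is_file($path) || !is_readable($path)) {
        return null;
    }

    // Many shared hosts disable shell_exec(); treat that as "cannot convert".
    if (!function_exists('shell_exec')) {
        return null;
    }

    // escapeshellarg() protects against shell metacharacters in the file name.
    $command = escapeshellcmd($antiword) . ' ' . escapeshellarg($path) . ' 2>/dev/null';
    $output = shell_exec($command);

    // shell_exec() gives null or false on failure or when nothing was printed.
    if (!is_string($output) || trim($output) === '') {
        return null;
    }

    return $output;
}
```

Calling docToText() with the path to an uploaded file would then return the document text as a string, or null if the conversion failed. It does not get around the distribution problem, since every server still needs antiword installed.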

It seems there is no easy solution at this time, which is very strange, as I thought reading from Word documents would have been quite a common task.

I agree. Although I guess it’s because MS kept the Word specs closed for such a long time.

Hey, maybe someone should write a decent open source MS Word parser :)