Convert Word document (.doc) into plain text

Hi Guys!

I have been looking for a solution that will let us upload Word documents, extract the text from them, and store it in a database. I tried an application called LiveDocx, but unfortunately it only worked sporadically.

I have also tried PHPWordLib, but it does not support .docx files, so we cannot use it.

I need something very reliable that works 100% of the time.

I have since invested in a Windows server to try to make use of the COM library. However, this has also proven unsuccessful: when I try to open a Word document from a PHP script, I get the following error:


Fatal error: Uncaught exception 'com_exception' with message '<b>Source:</b> Microsoft Word<br/><b>Description:</b> This command is not available because no document is open.' in C:\inetpub\wwwroot\my-test.com\convert.php:16 Stack trace: #0 C:\inetpub\wwwroot\my-test.com\convert.php(16): unknown() #1 {main} thrown in C:\inetpub\wwwroot\my-test.com\convert.php on line 16

Has anyone got a reliable solution that I can use?

This is done on pretty much every recruitment website on the internet, yet there is no documentation about it.

I’ve used antiword and AbiWord from the command line on servers in the past, but I’m not sure whether they can handle .docx.
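
On the .docx side: a .docx file is a ZIP archive whose main body is stored as XML in word/document.xml, so the text can be pulled out without Word, COM, or a Windows server. Here is a minimal sketch of that idea in Python rather than PHP, so treat it as an illustration of the approach and not a drop-in fix. The function name docx_to_text is made up for this example. It reads only the main document body (headers, footers, and footnotes live in other parts of the archive and are ignored), and it does nothing for old binary .doc files, which would still need a tool such as antiword.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source):
    """Return the plain text of a .docx file, one line per paragraph.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper written for this example.)
    """
    # A .docx is a ZIP archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):          # every paragraph
        parts = []
        for node in para.iter():
            if node.tag == W_NS + "t" and node.text:
                parts.append(node.text)          # a run of visible text
            elif node.tag == W_NS + "tab":
                parts.append("\t")               # explicit tab
            elif node.tag in (W_NS + "br", W_NS + "cr"):
                parts.append("\n")               # explicit line break
        paragraphs.append("".join(parts))
    return "\n".join(paragraphs)
```

Usage would be along the lines of text = docx_to_text("uploaded-cv.docx"), where the file name is just a placeholder, after which the string can go into the database. The same approach should port to PHP using ZipArchive plus an XML parser.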