  1. #1
    SitePoint Addict
    Join Date
    Jun 2008
    Posts
    279

    Extract MS Word Info, Store in Database

    Hiya,

    I love the power of PHP, but I'm unsure if this is possible. If it is, I am completely baffled as to how it would be done.

    My Question:

    My aim is to extract certain bits of information from an uploaded MS Word file, then store all gathered information in a MySQL database.

    I have made a start.
    I have completed the upload stage, but I am clueless about how to extract the information and store it.

    Upload MS Word File:

    PHP Code:
    <?php
    if ($_FILES["file"]["type"] == "application/msword")
    {
      if ($_FILES["file"]["error"] > 0)
      {
        echo "Return Code: " . $_FILES["file"]["error"] . "<br />";
      }
      else
      {
        if (file_exists("upload/" . $_FILES["file"]["name"]))
        {
          echo $_FILES["file"]["name"] . " already exists. ";
        }
        else
        {
          move_uploaded_file($_FILES["file"]["tmp_name"],
            "upload/" . $_FILES["file"]["name"]);
          echo "File Uploaded Successfully!";
        }
      }
    }
    else
    {
      echo "Unsupported File Type." . '<br />';
      echo "Please Upload MS Word (.doc) Files Only!" . '<br />';
    }
    ?>
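
The `type` value in `$_FILES` is sent by the browser and is not verified by PHP, and the upload script above uses the client's file name as-is. A minimal sketch of a stricter check follows. The helper name `safeWordUploadName` is hypothetical, and it assumes PHP 7.1 or later and that only `.doc` files should be accepted:

```php
<?php
// Hypothetical helper: returns a file name that is safe to append to
// "upload/", or null if the upload should be rejected.
function safeWordUploadName(array $file): ?string
{
    // Reject uploads that PHP itself flagged as failed or missing.
    if (($file['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return null;
    }

    // Drop any directory part the client may have sent with the name.
    $name = basename($file['name'] ?? '');

    // Accept only the .doc extension (case-insensitive).
    $extension = strtolower(pathinfo($name, PATHINFO_EXTENSION));
    if ($name === '' || $extension !== 'doc') {
        return null;
    }

    return $name;
}

// Usage inside the upload script (sketch):
// $name = safeWordUploadName($_FILES["file"]);
// if ($name !== null) {
//     move_uploaded_file($_FILES["file"]["tmp_name"], "upload/" . $name);
// }
```

This only checks the name and the error code. It does not inspect the file contents, so it is a complement to the MIME check, not proof that the file is a real Word document.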

  2. #2
    SitePoint Zealot pavanpuligandla's Avatar
    Join Date
    Sep 2008
    Location
    hyderabad
    Posts
    179
    Hi,
    You can use a COM object to start the MS Word application:
    Code:
    Code PHP:
    $word = new COM("Word.Application") or die("MSwordl could not be started");
    $data = $word-> docs -> Open("path of your MS word file.");
    Then decide on your database tables. I mean, this way we can only export the entire MS Word content into a single field, as a single row in a MySQL table.

    Then write the INSERT query
    and quit the application.
    Code:
    PHP Code:
    $word -> Quit(); 
    Try this..
    Many Regards.

  3. #3
    SitePoint Addict
    Join Date
    Jun 2008
    Posts
    279
    Thanks for your reply pavanpuligandla!
    I have executed the following code, and get an error message, why is this?:

    PHP Code:
    <?php
    $word = new COM("Word.Application") or die("MSwordl could not be started");
    $data = $word-> docs -> Open("letter.doc");
    echo $data;

    $word -> Quit();
    ?>
    Error Message:

    Fatal error: Uncaught exception 'com_exception' with message 'Unable to lookup `docs': Unknown name. ' in C:\Apache\htdocs\word.php:3 Stack trace: #0 C:\Apache\htdocs\word.php(3): unknown() #1 {main} thrown in C:\Apache\htdocs\word.php on line 3

  4. #4
    SitePoint Zealot pavanpuligandla's Avatar
    Join Date
    Sep 2008
    Location
    hyderabad
    Posts
    179
    Thanks for your reply pavanpuligandla!
    I have executed the following code, and get an error message, why is this?:
    Hi,
    I only gave the general idea of how to proceed using PHP COM.
    Here's the actual code I'm using with PHP COM to extract content from an MS Word file. Let me know whether it works or not.

    Code:
    PHP Code:
    <?php $word = new COM("word.application") or die ("Could not initialise MS Word object."); $word->Documents->Open(realpath("Sample.doc")); // Extract content. $content = (string) $word->ActiveDocument->Content; echo $content; $word->ActiveDocument->Close(false); $word->Quit(); $word = null; unset($word); ?>
    Many Regards.

  5. #5
    SitePoint Zealot pavanpuligandla's Avatar
    Join Date
    Sep 2008
    Location
    hyderabad
    Posts
    179
    I'm sorry, the code above is not readable. Here it is:

    Code:
    <?php
    $word = new COM("word.application") or die ("Could not initialise MS Word object.");
    $word->Documents->Open(realpath("Sample.doc"));
    
    // Extract content.
    $content = (string) $word->ActiveDocument->Content;
    
    echo $content;
    
    $word->ActiveDocument->Close(false);
    
    $word->Quit();
    $word = null;
    unset($word);
    ?>
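
The thread stops at `echo $content;` and never shows the storage step. A minimal sketch of the INSERT, using PDO with a prepared statement, could look like the following. The table `documents`, its columns, the helper name `storeDocumentText`, and the connection details are all assumptions made for illustration and do not come from the posts above:

```php
<?php
// Hypothetical helper: stores the extracted text as a single row.
// Assumes a table along these lines already exists:
//   CREATE TABLE documents (
//     id INT AUTO_INCREMENT PRIMARY KEY,
//     filename VARCHAR(255),
//     content LONGTEXT
//   );
function storeDocumentText(PDO $pdo, string $filename, string $content): int
{
    // Placeholders keep the document text out of the SQL string itself.
    $stmt = $pdo->prepare(
        'INSERT INTO documents (filename, content) VALUES (:filename, :content)'
    );
    $stmt->execute([':filename' => $filename, ':content' => $content]);

    // Return the id of the row that was just inserted.
    return (int) $pdo->lastInsertId();
}

// Usage (placeholder DSN and credentials):
// $pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', 'user', 'pass',
//     [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// $id = storeDocumentText($pdo, 'Sample.doc', $content);
```

This stores the whole document in one field, as described in post #2. Pulling out "certain bits" of information would need extra parsing of `$content` before the insert.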

  6. #6
    SitePoint Addict
    Join Date
    Jun 2008
    Posts
    279
    It works!
    Is it possible to display the text with the formatting and line breaks used inside the MS Word file?

    e.g. bold text, etc.?

  7. #7
    SitePoint Zealot pavanpuligandla's Avatar
    Join Date
    Sep 2008
    Location
    hyderabad
    Posts
    179
    Is it possible to display the text with the formatting and line breaks used inside the MS Word file?

    e.g. bold text, etc.?
    Yes,
    but you need to include JavaScript to render the Word content.
    That is what I'm using at the moment to render my Excel worksheets graphically.

  8. #8
    SitePoint Addict
    Join Date
    Jun 2008
    Posts
    279
    I don't understand!

    So you need to use JavaScript, in order to format the content extracted from MS Word?
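
One possible alternative that needs no JavaScript, offered here only as a sketch: Word can save a document as HTML through the same COM interface, which keeps bold text and paragraph breaks as markup. The helper name `saveWordDocumentAsHtml` and the output path are hypothetical. The value 8 is `wdFormatHTML` from Word's `WdSaveFormat` enumeration. It assumes a Windows server with Word installed and PHP's COM extension enabled:

```php
<?php
// wdFormatHTML from Word's WdSaveFormat enumeration.
const WD_FORMAT_HTML = 8;

// Hypothetical helper: opens a document in a running Word instance
// and saves a copy of it as HTML.
function saveWordDocumentAsHtml($word, string $docPath, string $htmlPath): void
{
    $document = $word->Documents->Open($docPath);
    try {
        // Word writes the HTML file itself, so formatting is preserved.
        $document->SaveAs($htmlPath, WD_FORMAT_HTML);
    } finally {
        // Close without prompting to save changes to the original.
        $document->Close(false);
    }
}

// Usage (sketch, Windows only):
// $word = new COM("word.application") or die("Could not initialise MS Word object.");
// $htmlPath = __DIR__ . '\\Sample.html';
// saveWordDocumentAsHtml($word, realpath("Sample.doc"), $htmlPath);
// $word->Quit();
// echo file_get_contents($htmlPath);
```

The HTML Word produces tends to be verbose, so it may need cleaning before it is stored or displayed.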

