  1. #1
    SitePoint Zealot
    Join Date
    Jul 2005
    Location
    Osoyoos BC Canada
    Posts
    178

    word.doc to .html

    I was recently sent several dozen medical research Word .doc files to convert to .html.

    Since I don't use any Microsoft programs, I opened the files in OpenOffice, copied and pasted them into Dreamweaver, cleaned up the extra tags etc. and uploaded them.

    The customer emailed "all the references are missing".... What references? I thought.

    Looking at the files in Word, sure enough there are bracketed references such as (34) throughout the pages and a list of numbered references at the end. But in both OpenOffice and WordPerfect, the brackets, numbers and lists of references are missing from the page... And there is no indication that they are missing!

    So beware!
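
    One way to catch this kind of silent loss is to compare the bracketed numbers in a plain-text export made from Word itself against the converted page. The following is only a minimal sketch: it assumes the Word export keeps the "(34)" style markers as literal text, the function names are made up for illustration, and the pattern will also match other bracketed numbers such as years.

```python
import re

# Matches bracketed numeric citations such as (34) or (12, 15).
CITATION = re.compile(r"\((\d+(?:\s*,\s*\d+)*)\)")

def citation_numbers(text):
    """Return the set of citation numbers found in bracketed form."""
    numbers = set()
    for match in CITATION.finditer(text):
        for part in match.group(1).split(","):
            numbers.add(int(part.strip()))
    return numbers

def missing_citations(source_text, converted_text):
    """Citation numbers present in the source but absent after conversion."""
    return sorted(citation_numbers(source_text) - citation_numbers(converted_text))

# Hypothetical sample text, not taken from the real documents.
source = "Aspirin reduces risk (34). Later trials disagreed (35, 36)."
converted = "Aspirin reduces risk . Later trials disagreed ."
print(missing_citations(source, converted))  # [34, 35, 36]
```

    An empty list means every bracketed number in the source also turned up in the converted text; anything else is worth a look before uploading.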

  2. #2
    . shoooo... silver trophy logic_earth's Avatar
    Join Date
    Oct 2005
    Location
    CA
    Posts
    9,013
    Did you know Dreamweaver can convert Word documents all on its own? At least the latest version does.
    Logic without the fatal effects.
    All code snippets are licensed under WTFPL.


  3. #3
    SitePoint Evangelist Ed Seedhouse's Avatar
    Join Date
    Aug 2006
    Location
    Victoria, B.C. Canada
    Posts
    592
    Quote Originally Posted by logic_earth View Post
    Did you know Dreamweaver can convert Word documents all on its own? At least the latest version does.
    The version I have can also do this, but it creates lousy, bloated code. Admittedly it isn't nearly as bad as what Word produces on its own, but it is still very bad. Its worst habit is that it slavishly accepts Word's fonts and puts them into inline CSS. Ugh.

    I find that if I select a region of the file and copy and paste it over into design view, the resultant code isn't all that bad and is fairly easily fixed up.
    Ed Seedhouse

  4. #4
    SitePoint Zealot
    Join Date
    Jul 2005
    Location
    Osoyoos BC Canada
    Posts
    178
    As Ed observes, DW doesn't do a particularly good job of it, and the inline styles are a pain to remove! (Although Ctrl+F certainly speeds things up.)

    I've tried a whole pile of 'tools' that claim to create .html from Word docs, but have yet to find one that produces a modern separation of style sheet and clean HTML.
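
    For the repetitive part of that find/replace cleanup, a short script can stand in for a long series of manual searches. This is a rough sketch rather than a real converter: the function name is hypothetical, regular expressions are a fragile way to edit HTML, and it assumes you want every inline style, every Mso class, and every font/span wrapper gone.

```python
import re

def strip_word_markup(html):
    """Remove common Word paste residue from an HTML fragment."""
    # Drop Office namespace tags such as <o:p> and </o:p>, keeping any inner text.
    html = re.sub(r"</?o:[^>]*>", "", html, flags=re.IGNORECASE)
    # Drop inline style attributes (double- or single-quoted values).
    html = re.sub(r'\s+style\s*=\s*("[^"]*"|\'[^\']*\')', "", html, flags=re.IGNORECASE)
    # Drop Word's class names, e.g. class="MsoNormal".
    html = re.sub(r'\s+class\s*=\s*"?Mso[^"\s>]*"?', "", html, flags=re.IGNORECASE)
    # Unwrap <font> and <span> tags, keeping the text they contain.
    html = re.sub(r"</?(?:font|span)\b[^>]*>", "", html, flags=re.IGNORECASE)
    return html

# Hypothetical sample of pasted Word markup.
sample = ('<p class="MsoNormal" style="font-family:Arial">'
          '<span style="color:red">Dose</span><o:p></o:p></p>')
print(strip_word_markup(sample))  # <p>Dose</p>
```

    The styling that gets stripped this way would then live in the site's own style sheet, which is the separation being asked for here.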

  5. #5
    Brevity is greatly overrated brandaggio's Avatar
    Join Date
    Dec 2005
    Posts
    1,424
    Quote Originally Posted by colinmcc View Post
    I was recently sent several dozen medical research Word .doc files to convert to .html.
    As you have found, you can't use Word to adequately create HTML, certainly not with proper separation of CSS, clean markup and the same look (conversion tool or no conversion tool).

    Word is great for what it is, word processing, so stick to using it to produce your content/copy. Then copy and paste the copy from your Word doc into your favorite plain text editor, such as Notepad (or save it as plain text), so it is stripped down to plain text. You can then use it in your Dreamweaver template/layout page, free of the burden of all its MSO tags (or just stick to hand coding it in the text editor you are using).

  6. #6
    SitePoint Addict StuckRUs's Avatar
    Join Date
    Jul 2006
    Location
    UK
    Posts
    286
    The other option is to convert the lot to PDF files and offer them as downloadable links. There is a plug-in for Word that does the conversion for you. It's not ideal, and if you're after accessibility it's not great, as Acrobat will report some errors in its accessibility checks, but it's cleaner than Word and less hassle than page rebuilds.

    If you have the time, though, it's best to start from scratch with the content as mentioned above.
    SMILE! everyone will wonder what you're up to.
    Site - under construction - again

  7. #7
    SitePoint Zealot
    Join Date
    Jul 2005
    Location
    Osoyoos BC Canada
    Posts
    178
    My customer is in Australia and uses Word on a Mac, so he sends the pages as Word docs. Over the last year I've actually created a web site of nearly 200 docs for him, all converted into semantically correct, separated CSS/HTML using the 'cut/paste into DW in design mode, switch DW to code mode, find/replace etc.' route... I was getting pretty good at that 'til he started sending pages with references, which I never saw, knew about or converted, since OpenOffice fails to display them.

  8. #8
    Brevity is greatly overrated brandaggio's Avatar
    Join Date
    Dec 2005
    Posts
    1,424
    Quote Originally Posted by colinmcc View Post
    My customer is in Australia and uses Word on a Mac, so he sends the pages as Word docs. Over the last year I've actually created a web site of nearly 200 docs for him, all converted into semantically correct, separated CSS/HTML using the 'cut/paste into DW in design mode, switch DW to code mode, find/replace etc.' route... I was getting pretty good at that 'til he started sending pages with references, which I never saw, knew about or converted, since OpenOffice fails to display them.
    I have no idea what a reference in Word is - I am guessing it is an internal document link.

    Perhaps you and your client could both use AbiWord to share content. It is quite a good, mature app with most of the functionality of Word while being free of many of its troubles.

    Another app which is decent at producing pages is Composer. Perhaps your client could use it instead of Word as its GUI is fairly similar and easy to use.

  9. #9
    Non-Member deathshadow's Avatar
    Join Date
    Jul 2006
    Location
    Dublin, NH
    Posts
    901
    Quote Originally Posted by John Wozniak View Post
    Perhaps your client could use it instead of Word as its GUI is fairly similar and easy to use.
    Trying to tell a BUSINESS client that they cannot use a program they likely spent a grand on is just going to make them look elsewhere for talent.

    Word creates more work for those of us working on websites - you pretty much have to deal with it... especially in light of the other word processors like AbiWord or OOo STILL being tinkertoys by comparison.

  10. #10
    Brevity is greatly overrated brandaggio's Avatar
    Join Date
    Dec 2005
    Posts
    1,424
    Quote Originally Posted by deathshadow View Post
    Trying to tell a BUSINESS client that they cannot use a program they likely spent a grand on is just going to make them look elsewhere for talent.

    Word creates more work for those of us working on websites - you pretty much have to deal with it... especially in light of the other word processors like AbiWord or OOo STILL being tinkertoys by comparison.
    I don't see converting Word docs as a problem if the client is paying for it in any way, shape or form (see my earlier post in the thread). I was suggesting using apples to apples, whatever the apples may be.

    The OP will have to gauge his client relationship, and of course should not suggest software if it will compromise things. But if both of them using the same app, whether that is Word or whatever it may be, helps simplify things, that might help.

    AbiWord is not a Tinkertoy - boy I miss those and Legos and Lincoln logs precisely because they are not word processors thank goodness.

  11. #11
    SitePoint Addict
    Join Date
    Oct 2006
    Posts
    292
    For converting Word documents I use Word Cleaner, which can batch convert and clean up the HTML output of Word documents. I then do a manual search and replace in HomeSite for a final clean, formatting and inserting the template layout in the right place.

    Used this process to convert over 2000 Word docs and it worked a treat.
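
    The last step of a workflow like this, dropping each cleaned fragment into the site template, can also be batched. The sketch below is not how Word Cleaner or HomeSite work; it simply assumes a folder of already-cleaned .html fragments, and the template, the placeholder names and the function name are all invented for illustration.

```python
from pathlib import Path
from string import Template

# Hypothetical page template; $title and $body are placeholder names chosen here.
PAGE = Template(
    "<html><head><title>$title</title></head><body>\n$body\n</body></html>\n"
)

def wrap_fragments(src_dir, out_dir):
    """Wrap every .html fragment in src_dir with PAGE; return the file names written."""
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for fragment in sorted(src.glob("*.html")):
        body = fragment.read_text(encoding="utf-8").strip()
        # The file name (without extension) doubles as the page title.
        page = PAGE.substitute(title=fragment.stem, body=body)
        target = out / fragment.name
        target.write_text(page, encoding="utf-8")
        written.append(target.name)
    return written
```

    Calling wrap_fragments("cleaned", "site") would write one templated page per fragment, leaving the originals untouched so the batch can be rerun after a template change.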

    Si