I need to convert 40 or so PDFs (or docs) to HTML. Doing it by hand is not an option. There are plenty of converters out there. I’ve tested about 8 so far and they all spit out horrible, massively bloated (and strange, to tell you the truth) HTML. Does anyone know of any converters that actually spit out somewhat clean code?

I don’t have any suggestions for a converter that spits out clean code. In fact, I would expect all of them to be flawed, just like any WYSIWYG editor.

You might try picking the best one you’ve tested so far and then running its output through HTML Tidy. I have used it before on several pages that I did not want to hand code.
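With 40 or so files, the Tidy pass could be scripted rather than run one file at a time. Here is a rough Python sketch, assuming the `tidy` command-line tool is installed and on your PATH and that the converter wrote its `.html` files into a single folder. The function names and the folder layout are made up for illustration, and the option set is just one reasonable starting point, not a recommended configuration.

```python
import subprocess
from pathlib import Path


def tidy_command(html_path):
    """Build the HTML Tidy command line for one file (illustrative helper)."""
    return [
        "tidy",
        "-quiet",       # suppress non-essential output
        "-modify",      # rewrite the input file in place
        "-clean",       # replace presentational tags such as <font> with CSS
        "-indent",      # indent element content
        "-wrap", "0",   # disable line wrapping
        str(html_path),
    ]


def tidy_folder(folder):
    """Run Tidy on every .html file in `folder`.

    Returns the list of files for which Tidy reported errors.
    """
    failed = []
    for html_path in sorted(Path(folder).glob("*.html")):
        result = subprocess.run(tidy_command(html_path))
        # Tidy exits with 0 on success, 1 if there were warnings, 2 on errors.
        if result.returncode >= 2:
            failed.append(html_path)
    return failed
```

Calling `tidy_folder("converted")` (a hypothetical folder name) would clean each file in place, so it is worth working on a copy of the converter output until you are happy with the options.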

Hey Ray! Thanks, bud. For days and days I was searching and testing hundreds of converters, looking for the one with the cleanest code. Then I moved on to simply finding one whose output looked identical to my PDFs instead. I figured pdf2htmlEX was going to be the ticket, but I really don’t have it in me to compile and build that homebrew thing. Then I stumbled across the Zamzar file converter. It uses pdf2htmlEX as its converter engine. It renders the HTML identically to the PDF, but its code is far from pretty. I don’t care anymore, though. The visual is more important than what’s behind it in this case.

