Convert PDF to HTML

How to convert PDF to HTML?

As I mentioned in the comment above, it is definitely possible to convert PDF to HTML using the tool Able2Extract7, which can be downloaded from here.

I have been using this tool for almost 2 years now and I am pretty happy with it. It lets you convert PDF to Word, Excel, PowerPoint, Publisher, HTML, OpenOffice, etc.

Important note: this tool is not freeware.

HTH

Converting from PDF to HTML

Writing a program to do it is definitely not trivial. If you can't find a .NET library to do this (I couldn't, at least not a free one), I would just download PDFToHtml and invoke it programmatically to get the HTML.
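To make "invoke it programmatically" concrete, here is a minimal sketch. It assumes the converter is Poppler's pdftohtml command-line tool and that it is installed and on the PATH; the function names build_command and convert are made up for illustration. It is written in Python only to keep it short, so treat it as an outline of the approach rather than a drop-in .NET solution.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, out_base, exe="pdftohtml"):
    """Build the argument list for one pdftohtml call.

    -noframes asks for a single HTML file instead of a frameset,
    -i skips image extraction. Both are pdftohtml options; drop
    them if you want the tool's default output.
    """
    return [exe, "-noframes", "-i", str(pdf_path), str(out_base)]


def convert(pdf_path, out_dir="."):
    """Run the converter and return the expected path of the HTML file."""
    # Assumption: the executable is called "pdftohtml" and is on PATH.
    exe = shutil.which("pdftohtml")
    if exe is None:
        raise FileNotFoundError("pdftohtml was not found on PATH")

    pdf_path = Path(pdf_path)
    out_base = Path(out_dir) / pdf_path.stem

    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_command(pdf_path, out_base, exe), check=True)

    # With -noframes the tool is expected to write <out_base>.html.
    return Path(f"{out_base}.html")
```

Calling convert("report.pdf") should leave report.html in the current directory, which you can then read back as a string. In .NET the same idea applies: start the executable as a child process with the same arguments (for example via System.Diagnostics.Process), wait for it to exit, and read the generated file.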

If you have the time to spare and/or PDFToHtml does not produce acceptable output for you, you could use iText to write the program yourself. It's a very mature, free PDF library. I've used it in the past to manipulate PDFs (merge, create, etc.).

UPDATE

As noted in the comment by Quandary, the PDFSharp library offers a more relaxed license (MIT) compared to the commercial or AGPL licenses offered by iText. Keep this in mind when choosing your library. I have not used the PDFSharp library myself, and I don't know how the two compare in terms of functionality.


