Converting HTML files to PDF

I need to automatically generate a PDF file from an existing (X)HTML document. The input files (reports) use a rather simple, table-based layout, so support for really fancy JavaScript/CSS stuff is probably not needed.

As I am used to working in Java, a solution that can easily be used in a Java project is preferable. It only needs to work on Windows systems, though.

One way to do it that is feasible, but does not produce good-quality output (at least out of the box), is to use CSS2XSLFO and Apache FOP to create the PDF files. The problem I encountered was that while CSS attributes are converted nicely, the table layout is pretty messed up, with text flowing out of the table cells.

I also took a quick look at JRex, a Java API for using the Gecko rendering engine.

Is there maybe a way to grab the rendered page from the Internet Explorer rendering engine and send it to a PDF printer tool automatically? I have no experience with OLE programming on Windows, so I have no clue what's possible and what is not.

Do you have an idea?

EDIT: The Flying Saucer/iText approach looks very promising. I will try to go with that.

Thanks for all the answers.


The Flying Saucer XHTML renderer project has support for outputting XHTML to PDF. Have a look at an example here.


Did you try wkhtmltopdf?

It's a simple command-line utility built on WebKit, the open source rendering engine. Both are free.

We've put together a small tutorial here.
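
Since the question asks for something usable from a Java project, here is a minimal sketch of calling the tool as an external process. It assumes wkhtmltopdf is installed and reachable by the executable name or path you pass in, and that it is invoked in its basic form with an input file followed by an output file. The class and method names are hypothetical and not part of any library.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper class; not part of wkhtmltopdf or any library.
final class WkHtmlToPdf {

    /** Builds the command line: <executable> <input.html> <output.pdf>. */
    static List<String> buildCommand(String executable, Path html, Path pdf) {
        return List.of(executable, html.toString(), pdf.toString());
    }

    /** Runs the converter and waits for it; throws if it exits with a non-zero code. */
    static void convert(String executable, Path html, Path pdf)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(executable, html, pdf))
                .inheritIO()   // forward the tool's console output to ours
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("wkhtmltopdf exited with code " + exitCode);
        }
    }
}
```

A call such as WkHtmlToPdf.convert("wkhtmltopdf", Paths.get("report.html"), Paths.get("report.pdf")) would then produce the PDF next to the report. On Windows, the first argument may need to be the full path to wkhtmltopdf.exe if the install directory is not on the PATH.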

EDIT (2017):

If I were to build something today, I wouldn't go that route anymore. I would use http://pdfkit.org/ instead, probably stripping it of all its Node.js dependencies so that it runs in the browser.


Check out iText; it is a pure Java PDF toolkit which has support for reading data from HTML. I used it recently in a project when I needed to pull content from our CMS and export as PDF files, and it was all rather straightforward. The support for CSS and style tags is pretty limited, but it does render tables without any problems (I never managed to set column width though).

Creating a PDF from HTML goes something like this:

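// Imports assume iText 2.x (com.lowagie.*); in iText 5 the packages start with com.itextpdf instead.
import java.io.StringReader;
import com.lowagie.text.Document;
import com.lowagie.text.PageSize;
import com.lowagie.text.html.simpleparser.HTMLWorker;
import com.lowagie.text.pdf.PdfWriter;
Document doc = new Document(PageSize.A4);
PdfWriter.getInstance(doc, out);     // out: any OutputStream, e.g. a FileOutputStream
doc.open();
HTMLWorker hw = new HTMLWorker(doc);
hw.parse(new StringReader(html));    // html: the markup to convert, as a String
doc.close();                         // closing the document finishes the PDF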