Convert HTML to PDF in .NET

I want to generate a PDF by passing HTML contents to a function. I have made use of iTextSharp for this but it does not perform well when it encounters tables and the layout just gets messy.

Is there a better way?


Try wkhtmltopdf. It is the best tool I have found so far.

For .NET, you may use this small library to easily invoke the wkhtmltopdf command-line utility.
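If you would rather not depend on a wrapper library, the command-line utility can also be driven directly. The following is only a minimal sketch, not the library mentioned above: it assumes a wkhtmltopdf executable is reachable on the PATH, and the class name `WkHtmlToPdfRunner` and the `exePath` parameter are made up for illustration. It relies on wkhtmltopdf accepting `-` for both input and output, so the HTML goes in through stdin and the PDF comes back through stdout.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

public static class WkHtmlToPdfRunner
{
    // Builds the process settings. "exePath" is a hypothetical parameter;
    // point it at wherever wkhtmltopdf is installed on your machine.
    public static ProcessStartInfo BuildStartInfo(string exePath)
    {
        return new ProcessStartInfo
        {
            FileName = exePath,
            // "--quiet" suppresses progress output; "- -" means stdin in, stdout out.
            Arguments = "--quiet - -",
            UseShellExecute = false,
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };
    }

    public static byte[] Convert(string html, string exePath = "wkhtmltopdf")
    {
        using (Process process = Process.Start(BuildStartInfo(exePath)))
        {
            // Send the HTML as UTF-8 bytes, then close stdin so the tool starts rendering.
            byte[] input = Encoding.UTF8.GetBytes(html);
            process.StandardInput.BaseStream.Write(input, 0, input.Length);
            process.StandardInput.Close();

            using (var output = new MemoryStream())
            {
                // Read the whole PDF from stdout before waiting for exit.
                process.StandardOutput.BaseStream.CopyTo(output);
                process.WaitForExit();
                if (process.ExitCode != 0)
                    throw new InvalidOperationException(
                        "wkhtmltopdf exited with code " + process.ExitCode);
                return output.ToArray();
            }
        }
    }
}
```

Calling `WkHtmlToPdfRunner.Convert("<h1>Hello</h1>")` would then return the PDF as a byte array that can be written to a file or an HTTP response. Note that this still starts a separate process for every conversion.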


EDIT: New suggestion: HTML Renderer for PDF using PdfSharp

(After trying wkhtmltopdf, I suggest avoiding it)

HtmlRenderer.PdfSharp is fully managed C# code, easy to use, thread safe and, most importantly, FREE (New BSD License).

Usage

  • Download the HtmlRenderer.PdfSharp NuGet package.
  • Use the example method.

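    // Requires "using System.IO;" and the HtmlRenderer.PdfSharp NuGet package.
    public static byte[] PdfSharpConvert(string html)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // Render the HTML onto A4 pages and get a PdfSharp document back.
            var pdf = TheArtOfDev.HtmlRenderer.PdfSharp.PdfGenerator.GeneratePdf(html, PdfSharp.PageSize.A4);
            // Save the document into memory instead of onto disk.
            pdf.Save(ms);
            return ms.ToArray();
        }
    }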
    
  • A very good alternative is a free version of iTextSharp

    Up to version 4.1.6, iTextSharp was licensed under the LGPL, and versions up to 4.1.6 (or possibly forks of them) are available as packages and can be used freely. Of course, you can also use the paid 5+ versions that continue the product.

    I tried to integrate wkhtmltopdf solutions into my project and ran into a bunch of hurdles.

    I personally would avoid using wkhtmltopdf-based solutions in hosted enterprise applications for the following reasons.

  • First of all, wkhtmltopdf is implemented in C++, not C#, and you will run into various problems embedding it in your C# code, especially when switching between 32-bit and 64-bit builds of your project. I had to try several workarounds, including conditional project builds, just to avoid "invalid format exceptions" on different machines.
  • If you manage your own virtual machine, it's OK. But if your project runs in a constrained environment such as Azure (where it is actually impossible, as mentioned by the TuesPechkin author) or Elastic Beanstalk, it's a nightmare to configure that environment just to get wkhtmltopdf working.
  • wkhtmltopdf creates files on your server, so you have to manage user permissions and grant "write" access to the location where wkhtmltopdf runs.
  • wkhtmltopdf runs as a standalone application, so it is not managed by your IIS application pool. You either have to host it as a service on another machine or accept huge processing spikes and memory consumption on your production server.
  • It uses temp files to generate the PDF, and in cases like AWS EC2, which has really slow disk I/O, this is a big performance problem.
  • The much-hated "Unable to load DLL 'wkhtmltox.dll'" error, reported by many users.
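To make the 32-bit/64-bit point more concrete: loading a native DLL whose bitness does not match the running process fails with a `BadImageFormatException`, and a missing DLL fails with the "Unable to load DLL" error quoted above. One possible workaround, sketched here under assumptions, is to ship both builds of the native library and pick one at startup. The `x86`/`x64` folder layout and the `NativeLibraryLocator` class are hypothetical; the sketch only locates the file and does not load it.

```csharp
using System;
using System.IO;

public static class NativeLibraryLocator
{
    // Hypothetical layout: <baseDir>/x86/wkhtmltox.dll and <baseDir>/x64/wkhtmltox.dll.
    public static string GetLibraryPath(string baseDir, bool is64BitProcess)
    {
        string folder = is64BitProcess ? "x64" : "x86";
        return Path.Combine(baseDir, folder, "wkhtmltox.dll");
    }

    // Fails early with a readable message instead of a late load error.
    public static string Locate(string baseDir)
    {
        string path = GetLibraryPath(baseDir, Environment.Is64BitProcess);
        if (!File.Exists(path))
            throw new FileNotFoundException(
                "No wkhtmltox.dll found for the bitness of this process.", path);
        return path;
    }
}
```

Checking `Environment.Is64BitProcess` at run time, rather than relying on the build configuration, means the same deployment can work whether the hosting process happens to run as 32-bit or 64-bit.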
    --- PRE Edit Section ---

    For anyone who wants to generate PDFs from HTML in simpler applications or environments, I leave my old post here as a suggestion.

    TuesPechkin

    https://www.nuget.org/packages/TuesPechkin/

    or, especially for MVC web applications (though I think you can use it in any .NET application)

    Rotativa

    https://www.nuget.org/packages/Rotativa/

    Both use the wkhtmltopdf binary to convert HTML to PDF. It renders pages with the WebKit engine, so it can also parse CSS style sheets.

    They provide easy-to-use, seamless integration with C#.

    Rotativa can also generate PDFs directly from any Razor view.

    Additionally, for real-world web applications, they also take care of thread safety and similar concerns.


    Most HTML-to-PDF converters rely on IE for HTML parsing and rendering. This can break when the user updates IE. Here is one that does not rely on IE.

    The code is something like this:

    EO.Pdf.HtmlToPdf.ConvertHtml(htmlText, pdfFileName);
    

    Like many other converters, it accepts text, a file name, or a URL. The result can be saved to a file or a stream.
