Fix incorrectly displayed encoding on an HTML document with PHP

Is there a way to fix the characters that display improperly after running this HTML markup through phpQuery::newDocument? The original document has slanted (curly) double quotes around -Classics with modern Woman-, and they display improperly once the new document is created with phpQuery.

    // Original document is UTF-8 encoded
    $raw_html = '<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body><p>Mr. Smith of Bangkok celebrated the “Classics with modern Woman”.</p></body></html>';
    print($raw_html);

    $aNew_document = phpQuery::newDocument($raw_html);
    print($aNew_document);

Original Output: Mr. Smith of Bangkok celebrated the “Classics with modern Woman”.

New Document Output: Mr. Smith of Bangkok celebrated the Classics with modern Woman.


  • Save the page as UTF-8 without BOM.
  • Add this header at the top of your script:

    header("Content-Type: text/html; charset=UTF-8");

  • [EDIT]: How to save files as UTF-8 without BOM:

    At the OP's request, here's how to do it on Windows:

  • Download Notepad++. It is an awesome text-editor that you should be using.
  • Install it.
  • Open the PHP script that contains this code in Notepad++. That is the file on your computer where you are doing all the coding.
  • In Notepad++, from the Encoding menu at the top, select "Convert to UTF-8 without BOM".
  • Save the file.
  • Upload to your webserver by FTP or whatever you use.
  • Now, run that script.
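If an editor is not at hand, the same BOM check can be sketched in PHP itself. This is only an illustration: `has_utf8_bom` and `strip_utf8_bom` are hypothetical helper names, not part of PHP or phpQuery, and the sketch assumes the script's contents have already been read into a string.

```php
<?php
// Hypothetical helper: true if the bytes start with the UTF-8 BOM (EF BB BF).
function has_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

// Hypothetical helper: return the bytes without a leading UTF-8 BOM.
function strip_utf8_bom(string $bytes): string
{
    // The cast guards against substr() returning false on older PHP versions.
    return has_utf8_bom($bytes) ? (string) substr($bytes, 3) : $bytes;
}
```

To rewrite a file in place, something like `file_put_contents($path, strip_utf8_bom(file_get_contents($path)));` should work, where `$path` is whatever script you want to clean.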

  • I had the same problem, but when I added

    ob_start();

    as the first line and

    ob_end_flush();

    at the end, it seemed to work.
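A minimal sketch of that buffering pattern follows. The answer does not say why it helped; one plausible reason, and it is only an assumption here, is that with buffering active nothing is sent to the browser early, so the Content-Type header can still be set. `send_utf8_page` is a hypothetical wrapper used only to keep the example self-contained.

```php
<?php
// Hypothetical wrapper around the ob_start()/ob_end_flush() pair.
function send_utf8_page(string $html): void
{
    ob_start();        // first line: start buffering output
    echo $html;        // the body stays in the buffer for now

    // Assumption: buffering helps because the header can still be set here,
    // even though the body has already been echoed.
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=UTF-8');
    }

    ob_end_flush();    // last line: send the buffered output
}
```

Calling `send_utf8_page($raw_html);` would then emit the page after the header has been queued.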


  • You already have this in the <head> element:

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> 
    

    The next option would be to use HTML entities (for example &ldquo; and &rdquo; for the curly quotes) to display these characters.
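One way to apply the entity approach to a whole document is to turn every non-ASCII character into a numeric entity before handing the markup to the parser, which leaves the tags themselves untouched. The sketch below assumes the mbstring extension is available; `to_numeric_entities` is a hypothetical helper name, and whether this resolves the phpQuery output in the question has not been verified here.

```php
<?php
// Hypothetical helper: replace every non-ASCII character in UTF-8 markup
// with a decimal numeric entity, leaving ASCII (including tags) as-is.
function to_numeric_entities(string $html): string
{
    // Convmap: [first code point, last code point, offset, mask].
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($html, $convmap, 'UTF-8');
}

// "\u{201C}" and "\u{201D}" are the curly double quotes from the question.
$sample = "<p>\u{201C}Classics with modern Woman\u{201D}</p>";
echo to_numeric_entities($sample), "\n";
// prints: <p>&#8220;Classics with modern Woman&#8221;</p>
```

The converted string could then be passed to `phpQuery::newDocument()` in place of `$raw_html`.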
