ColdFusion: special Unicode characters in the content returned by cfhttp
In the content retrieved with the ColdFusion http object, some characters are returned as question marks. Specifically, these are Roman numerals (like Ⅱ), which display without problems when I visit the same page in a browser.
The server I am requesting from does not seem to provide any charset information in the response headers (the value of Content-Type is just "text/html", and the charset property in the result of cfhttp is blank), but the encoding is declared in the page's HTML as "charset=EUC-JP" (it is a page in Japanese). So I make the request with charset set to EUC-JP.
The content in Japanese (Japanese characters) is retrieved correctly, but the roman numerals are turned into question marks.
I tried requesting with charset set to UTF-8, but in that case everything gets scrambled. To me it seems that those Roman numerals are Unicode, so my understanding is that the server I am requesting from mixes encodings (but I may be wrong about this).
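One way to check the mixed-encoding suspicion is to look at the raw bytes and list the ones the declared charset cannot decode. The following is only a sketch in Python, not ColdFusion: it assumes the undecoded response body is already available as bytes, and `find_undecodable` is a hypothetical helper name, not part of any library.

```python
def find_undecodable(data: bytes, charset: str) -> list:
    """Return (offset, offending_bytes) for each span that `charset` rejects."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Strict decoding raises at the first span the codec cannot map.
            data[pos:].decode(charset)
            break
        except UnicodeDecodeError as err:
            start = pos + err.start
            end = max(pos + err.end, start + 1)  # always move forward
            problems.append((start, data[start:end]))
            pos = end  # resume decoding right after the bad span
    return problems


if __name__ == "__main__":
    sample = b"ab\xffcd"  # stand-in for the real response body
    for offset, raw in find_undecodable(sample, "ascii"):
        print(offset, raw.hex())
```

If the reported bytes all sit where the Roman numerals should be, that would suggest the page uses byte sequences outside the character set the decoder implements, rather than a second encoding for the whole page.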
How do I get those special characters to display correctly in the fileContent of cfhttp?
Thanks!
The only way I can think of is to make two requests with different encodings and then merge the data together. The first request would use a charset of EUC-JP and the second would use UTF-8. After the second request, look through the content from the first, and for every question mark, look up the character at the same index in the second. For example, when you hit the 5th question mark in the first set of content, look for the 5th Roman numeral in the second set. It's not guaranteed to work, but it's all I can think of.
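The merge step described above could be sketched roughly as follows. This is an illustration in Python rather than ColdFusion, and it assumes both response bodies are already decoded to strings; `merge_by_position` and `ROMAN_NUMERALS` are hypothetical names chosen for the example.

```python
import re

# U+2160 to U+217F: the common Roman numeral code points (Ⅰ, Ⅱ, ... ⅿ).
ROMAN_NUMERALS = re.compile("[\u2160-\u217f]")


def merge_by_position(primary: str, secondary: str) -> str:
    """Replace the n-th '?' in `primary` with the n-th Roman numeral in `secondary`."""
    numerals = iter(ROMAN_NUMERALS.findall(secondary))

    def substitute(match):
        # Keep the '?' as-is once the numerals from `secondary` run out.
        return next(numerals, match.group(0))

    return re.sub(r"\?", substitute, primary)


if __name__ == "__main__":
    euc_text = "第?章 and 第?章"   # content decoded as EUC-JP, numerals lost
    other_text = "xxⅡyyⅣ"         # content from the second request
    print(merge_by_position(euc_text, other_text))
```

The same caveat applies: any genuine question mark in the page shifts the alignment, and the approach only helps if the Roman numerals actually survive in the second response.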