java - Top Programmers

Java Unescaping XML/HTML before JAXB parsing doesn't work

Can anyone help me? In HTML/XML, a numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the format &#nnnn; or &#xhhhh;. I have to unescape (convert to Unicode) these references before I use the JAXB parser. When I use Apache StringEscapeUtils.unescapeXml(), &amp;, &gt; and &lt; are also unescaped, and

2018-06-12 04:48:47

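Java: unescaping XML/HTML before JAXB parsing doesn't work

Can anyone help me? In HTML/XML, a numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the format &#nnnn; or &#xhhhh;. I have to unescape (convert to Unicode) these references before I use the JAXB parser. When I use Apache StringEscapeUtils.unescapeXml(), &amp;, &gt; and &lt; are unescaped as well, which is not what I want because parsing will then fail. Is there a library that converts only the &#nnnn; references to Unicode and leaves everything else untouched?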

2018-06-12 04:48:47
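
One possible approach to the question above, sketched with the standard library only: a regular expression that matches numeric character references and nothing else, so named entities such as &amp; pass through unchanged. The class name `NumericRefUnescaper` and its method are hypothetical. As an added assumption, references that decode to `&`, `<` or `>` are left escaped, since turning them into literal characters would reintroduce the parsing failure the question describes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: unescapes only &#nnnn; and &#xhhhh; references.
class NumericRefUnescaper {
    // Group 1 captures hex digits, group 2 captures decimal digits.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    static String unescapeNumericRefs(String input) {
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());
            String replacement = m.group(); // default: keep the reference as written
            try {
                int cp = m.group(1) != null
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2), 10);
                // Keep markup-significant characters escaped so the XML stays parseable.
                boolean markup = cp == '&' || cp == '<' || cp == '>';
                if (Character.isValidCodePoint(cp) && !markup) {
                    replacement = new String(Character.toChars(cp));
                }
            } catch (NumberFormatException e) {
                // Number too large for an int: leave the reference untouched.
            }
            out.append(replacement);
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }
}
```

The sketch does not check whether the decoded code point is a legal XML character, so a reference such as `&#0;` would still need separate handling.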

Handling Empty Tags in XML using Sax Parser, Java

I'm using a SAX parser to handle a pre-written XML file. I have no way of changing the XML, as it is held by another application, but I need to parse data from it. The XML file contains a tag <ERROR_TEXT/> which is empty when no error has occurred. As a result, the parser takes the next character after the tag close, which is "\n". I have tried result.replaceAll("\n", &qu

2018-06-12 04:46:44

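Handling empty tags in XML using a SAX parser, Java

I'm using a SAX parser to handle a pre-written XML file. I have no way of changing the XML, as it is held by another application, but I need to parse data from it. The XML file contains a tag <ERROR_TEXT/> which is empty when no error has occurred. As a result, the parser takes the next character after the tag close, which is "\n". I have tried result.replaceAll("\n", ""); and result.replaceAll("\\n", ""); How do I get SAX to recognize that this is an empty tag and return the value as ""? You don't. SAX's job is to parse the data, not to decide what the data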

2018-06-12 04:46:43

XML parsing with SAX: how to handle html as text within xml

I get an XML response from an external server. Following some tutorials, I got a SAX parser working. One small problem remains. Within the response there is, for example, a description tag containing HTML like this: <description>TitleDescription</description> After parsing, the description field of my object contains only "<"

2018-06-12 04:45:42

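XML parsing with SAX: how to handle HTML as text within XML

I get an XML response from an external server. Following some tutorials, I got a SAX parser working. One small problem remains. Within the response there is, for example, a description tag containing HTML like this: <description>TitleDescription</description> After parsing, the description field of my object contains only "<". Is it possible to tell my parser to treat the HTML as plain text? Or perhaps there are other ways to solve this problem. Thanks. Since you don't include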

2018-06-12 04:45:42

SAX handling special characters

I'm trying to parse an XML file with Java and SAX for an Android device. I got it from the internet, and while parsing it I get an ExpatException: not well-formed (invalid token) on the character "é". Is there a way to handle those characters without having to change all the special characters in the XML file? Edit: here is the part of my code that writes the file to my SDc

2018-06-12 04:44:40

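SAX handling special characters

I'm trying to parse an XML file with Java and SAX for an Android device. I got it from the internet, and while parsing it I get an ExpatException: not well-formed (invalid token) on the character "é". Is there a way to handle those characters without having to change all the special characters in the XML file? Edit: here is the part of my code that writes the file to my SD card. File SDCardRoot = Environment.getExternalStorageDirectory(); File f = new File(SDCardRoot,"edt.xml"); f.createNewFile();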

2018-06-12 04:44:40

Stripping Invalid XML characters in Java

I have an XML file that's the output from a database. I'm using the Java SAX parser to parse the XML and output it in a different format. The XML contains some invalid characters, and the parser is throwing errors like 'Invalid Unicode character (0x5)'. Is there a good way to strip all these characters out besides pre-processing the file line-by-line and replacing them? So far

2018-06-12 04:42:36

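Stripping invalid XML characters in Java

I have an XML file that's the output from a database. I'm using the Java SAX parser to parse the XML and output it in a different format. The XML contains some invalid characters, and the parser is throwing errors like 'Invalid Unicode character (0x5)'. Is there a good way to strip all these characters out besides pre-processing the file line-by-line and replacing them? So far I've come across 3 different invalid characters (0x5, 0x6 and 0x7). It's a database dump of about 4 GB, and we are going to process it many times, so having to run a pre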

2018-06-12 04:42:36
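
For a dump of that size, one option is to filter the characters while the parser reads them instead of rewriting the file first. The following is a minimal sketch, assuming the offending characters appear literally in the stream (it does not touch character references such as `&#5;`). The class name `XmlCharFilterReader` is hypothetical; the allowed ranges follow the XML 1.0 `Char` production, with surrogate code units passed through so that supplementary characters survive.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical streaming filter: drops characters that XML 1.0 does not allow.
class XmlCharFilterReader extends FilterReader {
    XmlCharFilterReader(Reader in) {
        super(in);
    }

    // True for UTF-16 code units that may appear in an XML 1.0 document.
    static boolean isAllowed(char c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || Character.isSurrogate(c);
    }

    @Override
    public int read() throws IOException {
        int c;
        while ((c = super.read()) != -1) {
            if (isAllowed((char) c)) {
                return c;
            }
        }
        return -1;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n <= 0) {
                return n; // end of stream (or nothing read)
            }
            // Compact the allowed characters to the front of the filled region.
            int kept = off;
            for (int i = off; i < off + n; i++) {
                if (isAllowed(buf[i])) {
                    buf[kept++] = buf[i];
                }
            }
            if (kept > off) {
                return kept - off;
            }
            // Every character in this chunk was dropped: read the next chunk.
        }
    }
}
```

A reader wrapped this way can be handed to the parser through `new InputSource(reader)`, so no intermediate copy of the dump is written.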

Make DocumentBuilder.parse ignore DTD references

When I parse my XML file (variable f) in this method, I get an error: C:\Documents and Settings\joe\Desktop\aicpcudev\OnlineModule\map.dtd (The system cannot find the path specified). I know I do not have the DTD, nor do I need it. How can I parse this File object into a Document object while ignoring DTD reference errors? private static Document getDoc(File f, String docId) throws Exception{ D

2018-06-12 04:41:35

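Make DocumentBuilder.parse ignore DTD references

When I parse my XML file (variable f) in this method, I get an error: C:\Documents and Settings\joe\Desktop\aicpcudev\OnlineModule\map.dtd (The system cannot find the path specified). I know I do not have the DTD, nor do I need it. How can I parse this File object into a Document object while ignoring DTD reference errors? private static Document getDoc(File f, String docId) throws Exception{ DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); D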

2018-06-12 04:41:34
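
One common way to address this with the standard JAXP API is to install an `EntityResolver` that answers every external entity request with an empty stand-in, so the parser never tries to open the missing file. The sketch below is not taken from the question; the class and method names are hypothetical, and it takes an `InputSource` rather than a `File` so that it can be run against an in-memory string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: parses a document without loading its external DTD.
class DtdIgnoringParser {
    static Document parseIgnoringDtd(InputSource source) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(false);
        DocumentBuilder db = dbf.newDocumentBuilder();
        // Whenever the parser asks for an external entity (such as the DTD),
        // hand it an empty document instead of resolving the system identifier.
        db.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        return db.parse(source);
    }
}
```

For a `File f`, the call could look like `parseIgnoringDtd(new InputSource(f.toURI().toASCIIString()))`. Because the DTD is replaced by an empty one, any entities it would have declared are undefined, so this only suits documents that do not rely on them.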

DTD parsing with Stax

I want to parse XML files which declare an HTML 4.01 doctype. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html> [...] </html> I am using StAX and an XMLResolver to load a local DTD: XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); xmlInputFactory.setXMLResolver(new LocalXmlResolver()); xmlOutputFactory = XMLOutputFacto

2018-06-12 04:40:33

DTD parsing with StAX

I want to parse XML files which declare an HTML 4.01 doctype. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html> [...] </html> I am using StAX and an XMLResolver to load a local DTD: XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); xmlInputFactory.setXMLResolver(new LocalXmlResolver()); xmlOutputFactory = XMLOutputFactory.newInstance(); xmlOutputFactory

2018-06-12 04:40:32

Conflict between Spring and XOM

In my Java program, I made a class that uses XOM to read XML files. I am also using Spring. When the line: ApplicationContext ctx = new ClassPathXmlApplicationContext("dataIO-beans.xml"); is executed, I get an exception that includes: javax.xml.parsers.ParserConfigurationException: Unable to validate using XSD: Your JAXP provider [org.apache.xerces.jaxp.DocumentBuilderFactoryImpl@4d4

2018-06-12 04:39:31

Conflict between Spring and XOM

In my Java program, I made a class that uses XOM to read XML files. I am also using Spring. When the line: ApplicationContext ctx = new ClassPathXmlApplicationContext("dataIO-beans.xml"); is executed, I get an exception that includes: javax.xml.parsers.ParserConfigurationException: Unable to validate using XSD: Your JAXP provider [org.apache.xerces.jaxp.DocumentBuilderFactoryImpl@4d48f152] does not support XML Schema. A

2018-06-12 04:39:30

DTD download error while parsing XHTML document in XOM

I am trying to parse an HTML document with the doctype declared to use the transitional dtd as follows: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> When I do Builder.build on the document, I get the following exception: java.io.IOException: Server returned HTTP response code: 503 for URL

2018-06-12 04:36:25

DTD download error while parsing XHTML document in XOM

I am trying to parse an HTML document with the doctype declared to use the transitional DTD as follows: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> When I do Builder.build on the document, I get the following exception: java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd at sun.net.ww

2018-06-12 04:36:25

SAX character buffer size

I'm trying to use SAX to parse very large XML files, hundreds of megabytes. The problem is that the parser reads in exactly 2048 characters at a time and terminates. I get a lot of tag values split into two parts in the callback "public void characters(...)". For example, the first part is in the character array at position 2044 with length 4, "2013", and the second pa

2018-06-12 04:35:23

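SAX character buffer size

I'm trying to use SAX to parse very large XML files, hundreds of megabytes. The problem is that the parser reads in exactly 2048 characters at a time and terminates. I get a lot of tag values split into two parts in the callback "public void characters(...)". For example, the first part is in the character array at position 2044 with length 4, "2013", and the second part is at position 0 with length 6, "-09-30". It should be the date value "2013-09-30", received in one piece. How can I avoid this splitting? Can anyone help me? public void characters(char[] ch, int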

2018-06-12 04:35:23
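
The SAX API allows a parser to deliver the text of one element across several `characters(...)` calls, so the usual pattern is to accumulate the chunks and read the value only in `endElement`. Below is a minimal sketch of that pattern; the handler class and the element name `date` are hypothetical, chosen to match the example value in the question. Resetting the buffer on every `startElement` is a simplification that assumes the elements of interest contain text only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the full text of every <date> element.
class TextCollectingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start a fresh buffer for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element: append, never assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("date".equals(qName)) {
            values.add(text.toString().trim());
        }
    }

    static List<String> parseDates(String xml) throws Exception {
        TextCollectingHandler handler = new TextCollectingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }
}
```

With this structure the value is complete however the parser chunks its input, and an empty element such as `<date/>` yields an empty string rather than whatever text follows it.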