Make DocumentBuilder.parse ignore DTD references
When I parse my xml file (variable f) in this method, I get an error
C:Documents and SettingsjoeDesktopaicpcudevOnlineModulemap.dtd (The system cannot find the path specified)
I know I do not have the dtd, nor do I need it. How can I parse this File object into a Document object while ignoring DTD reference errors?
private static Document getDoc(File f, String docId) throws Exception{
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(f);
return doc;
}
A similar approach to the one suggested by @anjanb
builder.setEntityResolver(new EntityResolver() {
@Override
public InputSource resolveEntity(String publicId, String systemId)
throws SAXException, IOException {
if (systemId.contains("foo.dtd")) {
return new InputSource(new StringReader(""));
} else {
return null;
}
}
});
I found that simply returning an empty InputSource worked just as well?
Try setting features on the DocumentBuilderFactory:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
dbf.setNamespaceAware(true);
dbf.setFeature("http://xml.org/sax/features/namespaces", false);
dbf.setFeature("http://xml.org/sax/features/validation", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
DocumentBuilder db = dbf.newDocumentBuilder();
...
Ultimately, I think the options are specific to the parser implementation. Here is some documentation for Xerces2 if that helps.
I found an issue where the DTD file was in the jar file along with the XML. I solved the issue based on the examples here, as follows: -
DocumentBuilder db = dbf.newDocumentBuilder();
db.setEntityResolver(new EntityResolver() {
public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
if (systemId.contains("doc.dtd")) {
InputStream dtdStream = MyClass.class
.getResourceAsStream("/my/package/doc.dtd");
return new InputSource(dtdStream);
} else {
return null;
}
}
});
链接地址: http://www.djcxy.com/p/34904.html
上一篇: 在Java中剥离无效的XML字符