Filter \n character from inputstream
I am trying to parse XML from an InputStream using the SAX parser. The InputStream receives incoming XML continuously from a socket. '\n' is used as a delimiter between the XML documents. This is what the XML looks like:
<?xml version="1.0" encoding="UTF-8"?>
<response processor="header" callback="comheader">
<properties>
<timezone>Asia%2FBeirut</timezone>
<rawoffset>7200000</rawoffset>
<to_date>1319256000000</to_date>
<dstrawoffset>10800000</dstrawoffset>
</properties>
</response>
\n
<event type="progress" time="1317788744214">
<param key="callback">todayactions</param>
<param key="percent">10</param>
<param key="msg">MAPPING</param>
</event>
<event type="progress" time="1317788744216">
<param key="callback">todayactions</param>
<param key="percent">20</param><param key="msg">MAPPING</param>
</event>
\n
<?xml version="1.0" encoding="UTF-8"?>
<response processor="header" callback="comheader">
<properties>
<timezone>Asia%2FBeirut</timezone>
<rawoffset>7200000</rawoffset>
<to_date>1319256000000</to_date>
<dstrawoffset>10800000</dstrawoffset>
</properties>
</response>
This worked perfectly for our iPhone project, where we read the characters up to '\n', stored them in a string, and used the DOM parser.
But when I tried to do the same on Android, a string was not an option because it gave us an OutOfMemory exception. So we passed the InputStream directly to the SAX parser. It works until the '\n' character, and after that it throws this exception:
org.apache.harmony.xml.ExpatParser$ParseException: At line 2, column 0: junk after document element
So I tried to filter the InputStream to skip the '\n' character. I created a FilterStreamReader, but I was not successful; it seems my read function isn't doing the job. Here is my code.
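import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;

public class FilterStreamReader extends InputStreamReader {
    public FilterStreamReader(InputStream in, String enc)
            throws UnsupportedEncodingException {
        super(in, enc);
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        // 'read' is the number of characters delivered, not a character.
        int read = super.read(cbuf, off, len);
        if (read == -1) {
            return -1;
        }
        // Compact the buffer in place, keeping every character except '\n'.
        int pos = off;
        for (int readPos = off; readPos < off + read; readPos++) {
            // Test the character itself rather than the count.
            if (cbuf[readPos] != '\n') {
                cbuf[pos++] = cbuf[readPos];
            }
        }
        // Number of characters kept; 0 if the chunk held only '\n'.
        return pos - off;
    }
}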
Can someone help me filter the '\n' out of an InputStream?
Edit: Based on what graham said, I was able to parse the whole data by removing all the XML declarations and adding my own start and end tag. So I'm no longer sure that my problem is only about filtering '\n'. How can you parse XML that keeps arriving like this?
The problem isn't the '\n'. It's that after the first </response> tag, the parser considers the document complete.
This data isn't valid XML. You should wrap everything inside a single top-level node. Also, you can't have a second <?xml version="1.0" encoding="UTF-8"?> declaration part-way through the document.
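One way to apply this without buffering the stream in a string is to wrap it on the fly. The following is only a rough sketch under some assumptions: the names WrappedXmlStream, DeclarationStrippingReader, and parseWrapped, as well as the synthetic <root> element, are made up for illustration, the data is assumed to be UTF-8, and every declaration is assumed to start with exactly "<?xml " (so other processing instructions pass through untouched). The handler just records element names; a real one would do the actual work.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.*;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class WrappedXmlStream {

    /** Hypothetical helper: drops every "<?xml ... ?>" declaration from a character stream. */
    static final class DeclarationStrippingReader extends Reader {
        private static final String DECL_START = "<?xml ";
        private final PushbackReader in;

        DeclarationStrippingReader(Reader source) {
            this.in = new PushbackReader(source, DECL_START.length());
        }

        @Override
        public int read() throws IOException {
            int c = in.read();
            if (c != '<') {
                return c;
            }
            // Look ahead to see whether this '<' opens an XML declaration.
            char[] ahead = new char[DECL_START.length() - 1];
            int n = 0;
            while (n < ahead.length) {
                int r = in.read();
                if (r == -1) {
                    break;
                }
                ahead[n++] = (char) r;
            }
            if (n < ahead.length || !new String(ahead).equals(DECL_START.substring(1))) {
                in.unread(ahead, 0, n); // ordinary markup: put the look-ahead back
                return '<';
            }
            // Skip the rest of the declaration, up to and including "?>".
            int prev = 0;
            int cur;
            while ((cur = in.read()) != -1) {
                if (prev == '?' && cur == '>') {
                    break;
                }
                prev = cur;
            }
            return read(); // continue with whatever follows the declaration
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            if (len == 0) {
                return 0;
            }
            int count = 0;
            while (count < len) {
                int c = read();
                if (c == -1) {
                    break;
                }
                buf[off + count++] = (char) c;
                if (!in.ready()) {
                    break; // hand back what we have instead of blocking on the socket
                }
            }
            return count == 0 ? -1 : count;
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
    }

    /** Wraps the raw stream in a synthetic root element and returns the element names SAX reports. */
    static List<String> parseWrapped(InputStream raw) throws Exception {
        List<InputStream> parts = Arrays.asList(
                new ByteArrayInputStream("<root>".getBytes(StandardCharsets.UTF_8)),
                raw,
                new ByteArrayInputStream("</root>".getBytes(StandardCharsets.UTF_8)));
        Reader reader = new DeclarationStrippingReader(new InputStreamReader(
                new SequenceInputStream(Collections.enumeration(parts)), StandardCharsets.UTF_8));

        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(new InputSource(reader), handler);
        return names;
    }
}
```

Because everything now sits inside one root element, the delimiter between documents ends up as ordinary character data and does not need to be filtered out. Note that the closing </root> is only emitted once the socket stream ends, so the parser keeps delivering events for as long as data arrives.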