Weird behavior when downloading html using HttpURLConnection
In my Wikipedia reader app for Android, I'm downloading an article's html using HttpURLConnection. Some users report that they are unable to see articles; instead they see some CSS, so it seems like their carrier is somehow preprocessing the html before it's downloaded. Other Wikipedia readers seem to work fine.
Example url: http://en.m.wikipedia.org/wiki/Black_Moon_(album)
My method:
public static String downloadString(String url) throws Exception
{
    StringBuilder downloadedHtml = new StringBuilder();
    HttpURLConnection urlConnection = null;
    String line = null;
    BufferedReader rd = null;
    try
    {
        URL targetUrl = new URL(url);
        urlConnection = (HttpURLConnection) targetUrl.openConnection();
        // Follow redirects only for "/special" pages
        if (url.toLowerCase().contains("/special"))
            urlConnection.setInstanceFollowRedirects(true);
        else
            urlConnection.setInstanceFollowRedirects(false);
        // Read the result from the server
        rd = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));
        while ((line = rd.readLine()) != null)
            downloadedHtml.append(line).append('\n');
    }
    catch (Exception e)
    {
        AppLog.e("An exception occurred while downloading data.\r\n: " + e);
        e.printStackTrace();
    }
    finally
    {
        // Close the reader before tearing down the connection
        if (rd != null)
            rd.close();
        if (urlConnection != null)
        {
            AppLog.i("Disconnecting the http connection");
            urlConnection.disconnect();
        }
    }
    return downloadedHtml.toString();
}
I'm unable to reproduce this problem, but there must be a way to get around that? I even disabled redirects by setting setInstanceFollowRedirects to 'false' but it didn't help.
Am I missing something?
Example of what the users are reporting:
http://pastebin.com/1E3Hn2yX
"carrier is somehow preprocessing the html before it's downloaded"
"a way to get around that?"
Use HTTPS: the encrypted connection prevents carriers from reading or rewriting pages in transit. (no citation)
"Am I missing something?"
Not that I can see.
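The HTTPS suggestion could be sketched roughly as follows. This is only an illustration, not a tested fix for the reported problem: the class name HttpsDownloader and the helper toHttps are hypothetical, the response is assumed to be UTF-8, and try-with-resources and StandardCharsets assume a reasonably recent Java or Android API level. The idea is simply to rewrite an http:// URL to https:// before opening the connection; for an https URL, openConnection() returns an HttpsURLConnection, which is a subclass of HttpURLConnection, so the rest of the download code can stay much the same.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class HttpsDownloader {

    // Hypothetical helper: switch a plain-http URL to https and
    // leave every other URL untouched.
    public static String toHttps(String url) {
        String prefix = "http://";
        if (url.regionMatches(true, 0, prefix, 0, prefix.length())) {
            return "https://" + url.substring(prefix.length());
        }
        return url;
    }

    // Downloads the page over HTTPS so that intermediaries cannot
    // modify the body in transit. Assumes the server sends UTF-8.
    public static String downloadString(String url) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(toHttps(url)).openConnection();
        StringBuilder html = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                html.append(line).append('\n');
            }
        } finally {
            connection.disconnect();
        }
        return html.toString();
    }

    public static void main(String[] args) {
        // Only shows the URL rewrite; no network access is made here.
        System.out.println(toHttps("http://en.m.wikipedia.org/wiki/Black_Moon_(album)"));
    }
}
```

Calling HttpsDownloader.downloadString with the example URL from the question would then request the https:// version of the page. Whether this resolves what the affected users see depends on the carrier actually being the cause, which the question could not confirm.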