Android: get plain text from HTML

I receive HTML in a special, entity-escaped form:

&lt ;p &gt ;This is &lt ;a href=&quot ;http://www.test.hu&quot ;&gt ;a test link&lt ;/a&gt ; and this is &amp ;nbsp;a sample text with special char: &amp ;#233;va &lt ;/p&gt ;

(In the real input there is no space before each ; character. I added the spaces only because Stack Overflow would otherwise render the entities.)

It's not normal HTML, but if I paste it into an empty HTML page, the browser shows it with normal tags:

<p>This is <a href="http://www.test.hu">a test link</a> and this is a sample text with special char: éva </p>

In a browser, this markup is displayed as:

This is a test link and this is a sample text with special char: éva

So I want to get this text, but I can't use Html.fromHtml, because the component I use doesn't support Spanned. I wanted to try StringEscapeUtils, but I couldn't import it.

How can I replace the special characters and remove the tags?


Write a parser, just as you would in any other situation where you have to parse data.

Now, if you can get it as ordinary unescaped HTML, there are a variety of open source Java HTML parsers out there that you can use. If you are going to work with the escaped HTML as you have in your first example, you will have to write the parser yourself.
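As a rough illustration of the hand-written route, here is a minimal sketch in plain Java, assuming the input is escaped exactly as in the question (one extra level of entity escaping around ordinary HTML). The class name EscapedHtmlToText, its method names, and the small entity table are made up for this example. Tag removal uses a naive regular expression, so it is not a real HTML parser and will mishandle things like script blocks or a > inside an attribute value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Android SDK.
final class EscapedHtmlToText {

    // Matches named (&amp;), decimal (&#233;) and hex (&#xE9;) entities.
    private static final Pattern ENTITY =
            Pattern.compile("&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z]+);");

    // Deliberately tiny table: extend it for the entities you expect.
    private static final Map<String, String> NAMED = new HashMap<>();
    static {
        NAMED.put("lt", "<");
        NAMED.put("gt", ">");
        NAMED.put("amp", "&");
        NAMED.put("quot", "\"");
        NAMED.put("apos", "'");
        NAMED.put("nbsp", "\u00A0");
    }

    // Decodes exactly one level of escaping, so "&amp;nbsp;" becomes "&nbsp;".
    static String unescapeOnce(String s) {
        Matcher m = ENTITY.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            String replacement = m.group(); // unknown entities stay untouched
            if (body.charAt(0) == '#') {
                boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
                try {
                    int codePoint = hex
                            ? Integer.parseInt(body.substring(2), 16)
                            : Integer.parseInt(body.substring(1));
                    replacement = new String(Character.toChars(codePoint));
                } catch (IllegalArgumentException e) {
                    // Out-of-range or invalid code point: keep the original text.
                }
            } else if (NAMED.containsKey(body)) {
                replacement = NAMED.get(body);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    static String toPlainText(String escapedHtml) {
        String html = unescapeOnce(escapedHtml);         // escaped text -> real tags
        String noTags = html.replaceAll("<[^>]*>", "");  // naive tag removal
        String text = unescapeOnce(noTags);              // remaining entities -> characters
        // Treat non-breaking spaces as ordinary spaces, then collapse whitespace runs.
        return text.replace('\u00A0', ' ').replaceAll("\\s+", " ").trim();
    }
}
```

Calling EscapedHtmlToText.toPlainText(...) on the escaped string from the question (without the extra spaces before each ;) should give a plain String that can be passed to any component that does not accept Spanned.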


I guess I am too late to answer Robertoq's question, but I am sure many others are still struggling with this issue. I was one of them.

Anyway, the easiest way I found is this: in strings.xml, put your HTML inside a CDATA section, then in the activity retrieve the string and load it into a WebView. Here is an example.

In strings.xml:

<string name="st1"><![CDATA[<p>This is <a href="http://www.test.hu">a test link</a> and this is  a sample text with special char: éva </p>]]>
</string>

You may wish to replace é with &eacute ; (note: in the real file there is no space between &eacute and the ;).

Now, in your activity, get a reference to the WebView and load the string st1 into it:

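// WebViewControlID is a placeholder: use the id of the WebView in your layout.
WebView mWebview = (WebView) findViewById(R.id.WebViewControlID);
// No base URL and no history URL are needed; the string is rendered as UTF-8 HTML.
mWebview.loadDataWithBaseURL(null, getString(R.string.st1), "text/html", "utf-8", null);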

And hurray, it should work correctly. If you find this post useful, I would be grateful if you could mark it as answered, so that it helps others struggling with this issue.
