While going through the codebase of a former client, I found that I was using four different ways of showing HTML on Android. This made me realize that I didn’t really know what the correct way of showing it is, and that I had always hammered my way through with trial and error until the HTML formatting showed up.
Embarrassed, I decided to explore how HTML parsing works internally and learned that it’s actually fairly simple to understand, even for someone who doesn’t understand how text is drawn on the canvas.
I’m fairly certain that I’m not alone in this, and we can confirm it with a small quiz. Here are four potential ways of showing HTML in a TextView. Your job is to guess their output: whether the HTML formatting will show up or not.
1. Using TextView#setText()
// strings.xml:
<string name="what_the_html"><b>What</b> <i>the</i> <u>Html</u></string>
// Activity.java:
textView.setText(R.string.what_the_html);
2. Using Resources#getString()
textView.setText(getString(R.string.what_the_html));
3. Using Html.fromHtml()
textView.setText(
    Html.fromHtml(
        getString(R.string.what_the_html)
    )
);
4. Using Html.fromHtml() + CDATA
// strings.xml:
<string name="what_the_html">
    <![CDATA[
    <b>What</b> <i>the</i> <u>Html</u>
    ]]>
</string>
// Activity.java:
textView.setText(
    Html.fromHtml(
        getString(R.string.what_the_html)
    )
);
If you didn’t guess that only #1 and #4 show the HTML correctly, read on for my findings and learn what is happening.
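One piece of the puzzle is what the CDATA wrapper in option #4 changes. The sketch below uses the JDK's standard XML parser as an analogy. It is not Android's resource compiler, so treat the comparison as an assumption, and the class and method names (`CdataSketch`, `childElements`, `text`) are made up for illustration. It shows that inline tags are seen as markup, while the same tags inside CDATA are seen as ordinary characters.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Uses the JDK's XML parser as an analogy; Android's resource compiler is a different tool. */
final class CdataSketch {
    /** Parses the XML snippet and returns its root element. */
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Number of child elements (real markup) directly inside the root. */
    static int childElements(String xml) {
        NodeList children = root(xml).getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    /** Character data of the root element, as the parser sees it. */
    static String text(String xml) {
        return root(xml).getTextContent();
    }

    public static void main(String[] args) {
        String inline = "<string name=\"what_the_html\">"
                + "<b>What</b> <i>the</i> <u>Html</u></string>";
        String cdata = "<string name=\"what_the_html\">"
                + "<![CDATA[<b>What</b> <i>the</i> <u>Html</u>]]></string>";
        System.out.println(childElements(inline) + " | " + text(inline));
        System.out.println(childElements(cdata) + " | " + text(cdata));
    }
}
```

In the inline case the character data no longer contains any tags, whereas the CDATA case hands the tags through as literal text for something else to parse later.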
Problem: String vs CharSequence
The first step in my understanding was to learn that TextView#setText() accepts a CharSequence whereas Resources#getString() returns a String, and that these two types are not always interchangeable. The reason we’re still able to pass a String is that it implements CharSequence.
The other kind of CharSequence that we need to use instead is Spanned, which supports modifying the visual representation of text by using something known as “spans”. Spans are tiny objects that contain information about how a piece of text should be drawn, and Android uses them heavily across the framework. Some common examples include:
Fun fact #1: The blinking cursor you see in an EditText is implemented using a span.
Fun fact #2: Spans are also used for highlighting text, when we long press on a TextView or an EditText.
Solution
Now when a piece of HTML has to be parsed, Android uses the same span objects for converting the HTML tags into a format that TextView can understand and draw on screen. In order to generate these, Android offers us two options:
1. Resources#getText()
Resources#getText() returns a string resource as a “styled” CharSequence object, with its HTML tags converted into spans. This should be the preferred way whenever HTML has to be displayed from a string resource and the resource ID cannot be passed to setText() directly:
CharSequence styledText = getText(R.string.what_the_html);
textView.setText(styledText);
The reason why using Resources#getString() is wrong can also be understood by reading its source. It internally calls Resources#getText() and converts the result to a String, throwing away all the styling:
// Resources.java:
public String getString(@StringRes int resId) {
    return getText(resId).toString();
}
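To see why that `toString()` call is so destructive, here is a minimal plain-Java sketch. `StyledText`, `getText()`, `getString()` and `styleCount()` are hypothetical stand-ins written for this example, not Android's Spanned or Resources classes. The only point is that the styling lives in the object, not in its characters.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-ins for styled text; not Android's Spanned or Resources classes. */
final class StyledTextSketch {
    /** Hypothetical CharSequence that carries style ranges alongside its characters. */
    static final class StyledText implements CharSequence {
        private final String text;
        private final List<String> spans = new ArrayList<>();

        StyledText(String text) {
            this.text = text;
        }

        /** Records that the given style applies to the range [start, end). */
        StyledText span(int start, int end, String style) {
            spans.add(style + "@" + start + "-" + end);
            return this;
        }

        List<String> spans() {
            return spans;
        }

        @Override public int length() { return text.length(); }
        @Override public char charAt(int index) { return text.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return text.subSequence(start, end);
        }
        /** Returns the characters only; the span list is not part of the String. */
        @Override public String toString() { return text; }
    }

    /** Stand-in for getText(): hands back the styled object itself. */
    static CharSequence getText() {
        return new StyledText("What the Html")
                .span(0, 4, "bold")
                .span(5, 8, "italic")
                .span(9, 13, "underline");
    }

    /** Mirrors the shape of the getString() source above: same call, then toString(). */
    static String getString() {
        return getText().toString();
    }

    /** How many styles a view could discover from the value it is handed. */
    static int styleCount(CharSequence cs) {
        return cs instanceof StyledText ? ((StyledText) cs).spans().size() : 0;
    }

    public static void main(String[] args) {
        System.out.println(styleCount(getText()));   // styled object survives
        System.out.println(styleCount(getString())); // plain String, nothing left
    }
}
```

Both values contain the same characters, but only the first one still knows which ranges were bold, italic or underlined.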
Resources#getText() is also what TextView#setText(@StringRes int) internally uses when a resource ID is passed instead of a String.
2. Html.fromHtml()
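Html.fromHtml() parses the HTML tags in a String at runtime and converts them into spans. It can be used directly in cases where HTML is dynamically generated. This also explains the quiz: in #3, getString() has already thrown away the tags before Html.fromHtml() sees the text, whereas in #4 the CDATA block keeps the tags as literal characters for it to parse. Like Resources#getText(), this method returns styled text, in this case typed as a Spanned object: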
// Html.java:
public static Spanned fromHtml(String source) {
    ...
}
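To get a feel for what "converting tags into spans" means, here is a deliberately tiny sketch in plain Java. It is not how android.text.Html is implemented. The `TagsToSpansSketch` class, its `Result` type and the `style@start-end` notation are all invented for this example, and it only handles simple, properly nested tags without attributes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy tag-to-span converter; a sketch only, not android.text.Html. */
final class TagsToSpansSketch {
    /** Hypothetical result: the plain text plus "tag@start-end" span descriptions. */
    static final class Result {
        final String text;
        final List<String> spans;

        Result(String text, List<String> spans) {
            this.text = text;
            this.spans = spans;
        }
    }

    static Result parse(String html) {
        StringBuilder text = new StringBuilder();
        List<String> spans = new ArrayList<>();
        Deque<String> openTags = new ArrayDeque<>();     // names of tags not yet closed
        Deque<Integer> openStarts = new ArrayDeque<>();  // text offset where each one began
        int i = 0;
        while (i < html.length()) {
            // A '<' only starts a tag if a matching '>' follows somewhere.
            int close = html.charAt(i) == '<' ? html.indexOf('>', i) : -1;
            if (close < 0) {
                text.append(html.charAt(i++));           // ordinary character: copy it
                continue;
            }
            String tag = html.substring(i + 1, close);
            if (!tag.startsWith("/")) {
                openTags.push(tag);                      // opening tag: remember where it starts
                openStarts.push(text.length());
            } else if (!openTags.isEmpty() && openTags.peek().equals(tag.substring(1))) {
                // Closing tag that matches the innermost open one: emit a span.
                spans.add(openTags.pop() + "@" + openStarts.pop() + "-" + text.length());
            }
            i = close + 1;                               // tags never reach the output text
        }
        return new Result(text.toString(), spans);
    }

    public static void main(String[] args) {
        Result r = parse("<b>What</b> <i>the</i> <u>Html</u>");
        System.out.println(r.text);
        System.out.println(r.spans);
    }
}
```

The output text contains no tags at all. Everything the tags expressed has moved into the span list, which is the part a plain String cannot carry.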
To conclude: use Resources#getText() instead of Resources#getString() for displaying HTML from a string resource file, and Html.fromHtml() for HTML from Java source code.
If you have any questions about spans (or life in general) or something to share, feel free to @ me on Twitter.