How to Strip or Escape HTML Tags in Android

How to strip or escape html tags in Android

The solutions in the answer linked to by @sparkymat generally require either regex, which is an error-prone approach, or a third-party library such as jsoup or Jericho. A better solution on Android devices is to use the Html.fromHtml() function:

    // Requires: import android.text.Html;
    public String stripHtml(String html) {
        if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.N) {
            // API 24+: the single-argument overload is deprecated
            return Html.fromHtml(html, Html.FROM_HTML_MODE_LEGACY).toString();
        } else {
            return Html.fromHtml(html).toString();
        }
    }

This uses Android's built-in HTML parser to build a Spanned representation of the input HTML without any HTML tags. The "Span" markup is then stripped by converting the output back into a string.

As discussed here, Html.fromHtml behaviour has changed since Android N. See the documentation for more info.

How to remove HTML tags in Android?


    html = html.replaceAll("<(.*?)\\>", " "); // Removes all items in brackets
    html = html.replaceAll("<(.*?)\\\n", " "); // Must be underneath
    html = html.replaceFirst("(.*?)\\>", " "); // Removes any connected item to the last bracket
    html = html.replaceAll("&nbsp;", " ");
    html = html.replaceAll("&amp;", " ");

Here is a piece of my code.
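
A quick way to see where this kind of regex stripping breaks down is to run it on input that has a '>' inside an attribute value. The sketch below is plain Java with no Android classes. The class and method names are made up for illustration, and the pattern is a simplified form of the first replaceAll above.

```java
import java.util.regex.Pattern;

class RegexStripDemo {
    // Naive pattern: '<', then as few characters as possible, then '>'.
    private static final Pattern TAG = Pattern.compile("<(.*?)>");

    // Replaces every match of the naive pattern with a single space.
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Simple markup: both tags are matched and replaced.
        System.out.println("[" + stripTags("<p>Hello</p>") + "]");

        // The '>' inside the attribute value ends the first match early,
        // so the fragment b">link is left behind in the output.
        System.out.println("[" + stripTags("<a title=\"a > b\">link</a>") + "]");
    }
}
```

The second call shows why a real parser is usually the safer choice: the regex has no notion of attributes or quoting.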

Databinding remove html tags from string

You could try and use HTML escape codes:

<string name="underlined_text">This is an &lt;u&gt;underlined&lt;/u&gt; text.</string>

I'd also question whether databinding is really required here - you can just use android:text="@string/underlined_text"

Edit: Also came across this answer which could be of use to you
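
If the text is built in code rather than stored in a resource file, the same escaping can be done programmatically. On Android, TextUtils.htmlEncode(String) is the usual choice. The plain-Java sketch below only illustrates the idea; the class and the escapeHtml helper are hypothetical names, not part of any library.

```java
class HtmlEscapeSketch {
    // Hypothetical helper: replaces the characters that have special
    // meaning in HTML/XML text with their entity equivalents.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: This is an &lt;u&gt;underlined&lt;/u&gt; text.
        System.out.println(escapeHtml("This is an <u>underlined</u> text."));
    }
}
```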

Remove HTML tags from a String

Use an HTML parser instead of regex. This is dead simple with Jsoup.

    // Requires: import org.jsoup.Jsoup;
    public static String html2text(String html) {
        return Jsoup.parse(html).text();
    }

Jsoup also supports removing HTML tags against a customizable whitelist, which is very useful if you want to allow only e.g. <b>, <i> and <u>.

See also:

  • RegEx match open tags except XHTML self-contained tags
  • What are the pros and cons of the leading Java HTML parsers?
  • XSS prevention in JSP/Servlet web application

Android: Strip all html except for img tags

This is a bit hacky but it does the job:

  • Substitute all img tags for some special string so the stripping function can't see them
  • Strip out all HTML
  • Substitute the special string for the img tags to get them back.

    String stripHTMLtagsExceptIMG(String htmlString) {
        String subbed = htmlString.replaceAll("< *[iI][mM][gG]", "_iimmgg");
        String stripped = android.text.Html.fromHtml(subbed).toString();
        String unsubbed = stripped.replaceAll("_iimmgg", "<img");

        return unsubbed;
    }
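
The three steps above can also be written with the stripping step passed in as a parameter, which makes the placeholder trick easy to try off-device. This is only a sketch under assumptions: the class name, the stripExceptImg method and the NAIVE_STRIPPER stand-in are hypothetical, and the placeholder is assumed never to occur in real input. On Android you would pass s -> android.text.Html.fromHtml(s).toString() as the stripper.

```java
import java.util.function.UnaryOperator;

class KeepImgSketch {
    // Placeholder assumed never to appear in the input (an assumption).
    private static final String PLACEHOLDER = "_iimmgg";

    // Desktop-only stand-in for the real stripper; it removes anything
    // that looks like an opening or closing tag.
    static final UnaryOperator<String> NAIVE_STRIPPER =
            s -> s.replaceAll("</?[a-zA-Z][^>]*>", "");

    static String stripExceptImg(String html, UnaryOperator<String> stripper) {
        // 1. Hide the start of every img tag (case-insensitive,
        //    optional whitespace after '<').
        String subbed = html.replaceAll("(?i)<\\s*img", PLACEHOLDER);
        // 2. Strip whatever markup the stripper still recognises.
        String stripped = stripper.apply(subbed);
        // 3. Put the img tag openings back.
        return stripped.replace(PLACEHOLDER, "<img");
    }

    public static void main(String[] args) {
        // Prints: Hi <img src="a.png"> there
        System.out.println(
                stripExceptImg("<p>Hi <IMG src=\"a.png\"> there</p>", NAIVE_STRIPPER));
    }
}
```

Note that the restored tag always starts with a lowercase <img, whatever the original casing was.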

Remove HTML tags while reading XML data on Android

Find your answer here:
How to strip or escape html tags in Android

In short: Html.fromHtml(stringToEscape).toString()

Android formatting xml string loses its html tags

Use Html.fromHtml:

    builder.setMessage(Html.fromHtml(text));

When you apply the formatting, the CharSequence is converted back to a String, but you need the Spannable that carries the HTML information.

From the doc:

Sometimes you may want to create a styled text resource that is also
used as a format string. Normally, this won't work because the
String.format(String, Object...) method will strip all the style
information from the string. The work-around to this is to write the
HTML tags with escaped entities, which are then recovered with
fromHtml(String), after the formatting takes place.

Try &lt;b&gt; in place of <b> and &lt;/b&gt; in place of </b>.


