Is there a faster way to decode HTML characters to a string than Html.fromHtml()?

♀尐吖头ヾ submitted on 2019-11-27 11:49:54

What about unescapeHtml() from org.apache.commons.lang.StringEscapeUtils? The library is available on the Apache site.

(EDIT: June 2019 - See the comments below for updates about the library)
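For a sense of why a dedicated unescaper can be cheaper than Html.fromHtml(), which builds a Spanned result, here is a minimal sketch of a single-pass entity decoder that uses only the JDK. Everything in it is an assumption for illustration: the class name EntityDecoder is hypothetical, the entity table is deliberately tiny, and it is not a substitute for a complete library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Android or any library.
final class EntityDecoder {
    // Deliberately tiny table; a real decoder covers the full HTML entity list.
    private static final Map<String, String> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", "&");
        NAMED.put("lt", "<");
        NAMED.put("gt", ">");
        NAMED.put("quot", "\"");
        NAMED.put("apos", "'");
        NAMED.put("nbsp", "\u00A0");
    }

    private EntityDecoder() {}

    static String decode(String input) {
        int amp = input.indexOf('&');
        if (amp < 0) {
            return input; // fast path: nothing to decode, nothing allocated
        }
        StringBuilder out = new StringBuilder(input.length());
        int pos = 0;
        while (amp >= 0) {
            out.append(input, pos, amp); // copy the text before the '&'
            int semi = input.indexOf(';', amp + 1);
            // Only treat short "&...;" runs as entity candidates.
            String replacement = (semi > amp + 1 && semi - amp <= 10)
                    ? resolve(input.substring(amp + 1, semi))
                    : null;
            if (replacement != null) {
                out.append(replacement);
                pos = semi + 1;
            } else {
                out.append('&'); // unknown or malformed: keep it literally
                pos = amp + 1;
            }
            amp = input.indexOf('&', pos);
        }
        out.append(input, pos, input.length());
        return out.toString();
    }

    // Returns the decoded text for an entity body such as "amp" or "#x42",
    // or null when the body is not recognised.
    private static String resolve(String body) {
        if (body.charAt(0) != '#') {
            return NAMED.get(body);
        }
        try {
            boolean hex = body.length() > 1
                    && (body.charAt(1) == 'x' || body.charAt(1) == 'X');
            int codePoint = hex
                    ? Integer.parseInt(body.substring(2), 16)
                    : Integer.parseInt(body.substring(1), 10);
            return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : null;
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Calling EntityDecoder.decode("Tom &amp; Jerry") walks the string once and leaves anything it does not recognise untouched. Strings without any '&' are returned as-is, which matters when most of a large batch is already clean.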

fromHtml() does not have a high-performance HTML parser, and I have no idea how quick the toString() implementation on SpannedString is. I doubt either was designed for your scenario.

Ideally, the strings are clean before they get to a low-power phone. Either clean them up in the build process (for resources/assets), or clean them up on a server (before you download them).

If, for whatever reason, you absolutely need to clean them up on the device, you can perhaps use the NDK to create a C/C++ library that does the cleaning for you faster.

This is an incredibly fast and simple option: Unbescape

It greatly improved the performance of our parsing, which requires every string to be run through a decoder.

FrinkTheBrave

Have you looked at "Strip HTML from Text JavaScript"?

With a large batch of these it can add over a minute

Any parsing will take some time, and 22 ms seems fast to me. Anyway, can you do it in the background? Would some kind of caching help?
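If the same strings show up repeatedly in a batch, caching could look something like the sketch below. It is only an assumption about how one might wire it up: DecodeCache is a hypothetical name, the decoder is passed in as a function so any implementation (Html.fromHtml(), a library, or your own) can sit behind it, and the size limit is arbitrary. On Android, android.util.LruCache is a ready-made alternative to the LinkedHashMap used here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical wrapper: remembers decoded results so repeated inputs
// are decoded only once.
final class DecodeCache {
    private final Function<String, String> decoder;
    private final Map<String, String> cache;

    DecodeCache(Function<String, String> decoder, final int maxEntries) {
        this.decoder = decoder;
        // accessOrder = true keeps entries ordered least-recently-used first,
        // so removeEldestEntry evicts the entry that was used longest ago.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // synchronized so the cache can be shared with a background thread.
    synchronized String decode(String raw) {
        String hit = cache.get(raw);
        if (hit == null) {
            hit = decoder.apply(raw); // the slow path runs once per distinct input
            cache.put(raw, hit);
        }
        return hit;
    }

    synchronized int size() {
        return cache.size();
    }
}
```

Because decode() is synchronized, the same instance can be filled from a worker thread while the UI thread reads from it. Whether this helps depends entirely on how often inputs repeat; for a batch of all-unique strings it only adds overhead.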

Although I have not tried them yet, I found some possible solutions:

  1. HTML Java Parsers
  2. HTML Parsing
  3. More HTML Parsing

I hope it helps.
