I have just set about stripping HTML entities out of our database, as we do a lot of crawling and some of the crawlers didn't do this at input time :(
This is what it took for me to get it working on Ubuntu 18.04 with PG10. Perl didn't decode some entities for some reason, so I used Python 3.
From the command line:
sudo apt install postgresql-plpython3-10
From your SQL interface:
CREATE LANGUAGE plpython3u;
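CREATE OR REPLACE FUNCTION htmlchars(str TEXT) RETURNS TEXT AS $$
    # html.unescape (Python 3.4+) replaces HTMLParser().unescape,
    # which was deprecated and then removed in Python 3.9.
    import html
    # Pass SQL NULL (None in Python) through unchanged.
    if str is None:
        return None
    return html.unescape(str)
$$ LANGUAGE plpython3u;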
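Before running this against real rows, it can help to check the decoding logic outside the database. Below is a minimal sketch in plain Python 3, assuming the standard library's `html.unescape`. The wrapper name `htmlchars` just mirrors the SQL function, and the sample strings are made up for illustration, not taken from our data.

```python
import html
from typing import Optional


def htmlchars(text: Optional[str]) -> Optional[str]:
    """Standalone stand-in for the PL/Python function body."""
    # SQL NULL arrives as None in PL/Python, so pass it through.
    if text is None:
        return None
    # Decodes named (&amp;), decimal (&#39;) and hex (&#x27;) references.
    return html.unescape(text)


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    samples = [
        "Fish &amp; Chips",
        "&lt;b&gt;bold&lt;/b&gt;",
        "caf&eacute;",
        "&#39;quoted&#39;",
        "&amp;amp;",  # double-encoded input
        None,
    ]
    for sample in samples:
        print(repr(sample), "->", repr(htmlchars(sample)))
```

Note that a single call decodes only one level, so a double-encoded value such as `&amp;amp;` comes back as `&amp;`. If your crawlers may have encoded text twice, you might need to apply the function more than once.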