Can nginx re-encode XML documents, or alter XML headers?

耗尽温柔 submitted on 2020-01-03 03:35:07


I have a problem ultimately caused by a third-party XML document whose actual encoding (ISO 8859-1 or Windows 1252, I can't tell which) doesn't match its declared encoding (UTF-8).

I'm looking for creative workarounds. We already use nginx proxies for various content, so perhaps there is a way to either:

  1. Re-encode the document contents on the fly from ISO 8859-1 to UTF-8; or
  2. Alter the document header on the fly, from UTF-8 to ISO 8859-1.

Is either of these possible with nginx? If not, is there a similar tool that can do it?


Short answer: yes, it can.

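# http context: win-utf ships with nginx and holds the windows-1251 <-> utf-8 charset_map
include win-utf;
server {
  listen 5080;
  location /... {
    source_charset windows-1251;  # encoding the upstream actually sends
    charset        utf-8;         # encoding delivered to the client
  }
}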

That is:

  • source_charset specifies what you're converting from
  • charset specifies what you're converting to
  • and include win-utf brings in a file with a charset_map which does the conversion.

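Only conversions between Windows 1251, UTF-8 and KOI8-R are supported out of the box (via the bundled koi-utf, koi-win and win-utf files), so ISO 8859-1 or Windows 1252 input needs a charset_map of your own.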
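For the ISO 8859-1 case in the question, a custom table might look roughly like the sketch below. It is deliberately partial and untested against the real document: only a handful of entries are shown, and the charset name iso-8859-1 is simply a label chosen here, which must match between charset_map and source_charset. The charset_types line assumes the upstream may serve the file as application/xml, which, as far as I know, is not in the module's default list of converted types.

```nginx
# Sketch only: a partial ISO 8859-1 -> UTF-8 table (http context).
# Left column: source byte; right column: its UTF-8 byte sequence.
charset_map iso-8859-1 utf-8 {
    A0 C2A0 ;  # no-break space
    A9 C2A9 ;  # copyright sign
    C0 C380 ;  # A with grave
    E9 C3A9 ;  # e with acute
    FF C3BF ;  # y with diaeresis
    # ...one line per remaining byte in A0-FF
}

server {
  listen 5080;
  location /... {
    source_charset iso-8859-1;
    charset        utf-8;
    # assumption: the document may arrive as application/xml
    charset_types  text/xml application/xml;
  }
}
```

Bytes in the 80-FF range that are missing from the table are replaced with "?", so a real table should cover every byte the document can contain. If the source turns out to be Windows 1252 instead, the 80-9F range needs entries as well.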

More info:

