Forcing an ANSI encoding (CP1252, ISO-8859) on a string, but getting UTF-8 when force-downloading it

给你一囗甜甜゛ submitted on 2019-12-25 02:29:04

Question


If I run this on my starting string:

echo mb_detect_encoding($string);

I get the value:

ASCII

This string will be downloaded. I assume it is UTF-8, since that is the default encoding for PHP and for the database. The file extension will be .DAT, and I have already added it to config/mimes.php like this:

'DAT' => 'text/plain; charset=ISO-8859-1'

Then I try to download that string using CodeIgniter's download helper (assume the helper is already loaded):

force_download('MYFILE.DAT', $string);

Debugging via F12, the response headers are:

Content-Disposition:attachment; filename="MYFILE.DAT"
Content-Length:21024
Content-Transfer-Encoding:binary
Content-Type:"text/plain; charset=ISO-8859-1"

But when I open this file in Notepad++, it appears to be encoded in UTF-8 without BOM.

I have even tried iconv and mb_convert_encoding, treating the string as UTF-8 (even though mb_detect_encoding told me it was ASCII):

iconv("UTF-8", "ISO-8859-1", $string);
iconv("UTF-8", "CP1252", $string);
/* ... and so on ... */

I also tried:

mb_convert_encoding($string, "ISO-8859-1");
mb_convert_encoding($string, "CP1252");
/* ... and so on ... */

But, predictably, the results were the same. The string appears to be ISO-8859 when I do a var_dump (the accents are garbled), but after downloading it still seems to be encoded in UTF-8 (the accents are back again!).

What am I missing here? What am I doing wrong? Should I write the file to disk first and then force-download it?

SOLVED:

The problem was the starting charset: the string turned out to be in ISO-8859-1. Although @deceze is absolutely right that you cannot specify the encoding inside a plain text file, you can still control how its contents, that is, the characters, are encoded.
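One way to guard against a wrong assumption about the starting charset is sketched below. This is only a heuristic, not what was actually done in the project: the function name ensure_latin1 is hypothetical, and it assumes that any input that is not valid UTF-8 is already in a single-byte encoding such as ISO-8859-1. A Latin-1 string can occasionally also be valid UTF-8 by coincidence, so the check is not foolproof.

```php
<?php
// Hypothetical helper (not part of CodeIgniter or PHP): return $s as
// ISO-8859-1 bytes without converting a string that is not UTF-8.
function ensure_latin1(string $s): string
{
    // Pure ASCII (no byte above 0x7F) is identical in UTF-8 and ISO-8859-1.
    if (!preg_match('/[^\x00-\x7F]/', $s)) {
        return $s;
    }
    // Non-ASCII bytes that form valid UTF-8: convert to ISO-8859-1.
    if (mb_check_encoding($s, 'UTF-8')) {
        return mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8');
    }
    // Assumption: anything else is already single-byte, so leave it alone.
    return $s;
}

// "café" as UTF-8 (c3 a9) is converted; "café" as ISO-8859-1 (e9) is kept.
echo bin2hex(ensure_latin1("caf\xC3\xA9")), "\n";
echo bin2hex(ensure_latin1("caf\xE9")), "\n";
```

The result could then be passed to force_download('MYFILE.DAT', ...) as in the question.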


Answer 1:


You cannot detect encodings with any measure of consistency or accuracy. An ASCII file is just as valid in ISO-8859 or UTF-8 or any other ASCII-compatible encoding. PHP defaults to calling it ASCII, Notepad++ defaults to calling it UTF-8. Both decisions are equally valid. Since the "actual" encoding is not stored anywhere in the file or with the file's metadata (even if you set HTTP headers), there is no "right" answer.
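A small sketch of this point, assuming the mbstring extension is available (the variable names are only illustrative): an ASCII-only string has exactly the same bytes before and after conversion, so no tool can tell the two apart, whereas a string with an accented character does change. Looking at the bytes with bin2hex is more reliable than trusting what var_dump or an editor displays.

```php
<?php
// ASCII-only text: converting UTF-8 -> ISO-8859-1 changes nothing.
$ascii = "Hello, world";
$asciiLatin1 = mb_convert_encoding($ascii, 'ISO-8859-1', 'UTF-8');
var_dump($ascii === $asciiLatin1); // bool(true)

// "café" written as explicit UTF-8 bytes: the accent takes two bytes.
$utf8 = "caf\xC3\xA9";
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// Compare the raw bytes instead of the rendered characters.
echo bin2hex($utf8), "\n";   // 636166c3a9
echo bin2hex($latin1), "\n"; // 636166e9
```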



Source: https://stackoverflow.com/questions/25401231/forcing-an-ansi-encoding-on-string-cp1252-iso8859-obtaining-utf-8-encoding-w
