Arabic Character Encoding Issue: UTF-8 versus Windows-1256

野趣味 2021-01-05 11:10

Quick background: I inherited a large SQL dump file containing a mix of English and Arabic text, and (I think) it was originally exported using \'lat

4 Answers
  •  情歌与酒
    2021-01-05 11:25

    We can't find the error in your code if you don't show us your code, so we're very limited in how we can help you.

    You told the browser to interpret the document as UTF-8 rather than Windows-1256, but did you actually convert the document's bytes from Windows-1256 to UTF-8? The declared encoding has to match the encoding the bytes are really in.

    For example,

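    $ cat a.pl
    use strict;
    use warnings;
    use feature qw( say );
    use charnames ':full';
    
    # The output encoding is given as the first command-line argument.
    my $enc = $ARGV[0] or die "usage: $0 ENCODING\n";
    # Encode everything printed to STDOUT in that encoding.
    binmode STDOUT, ":encoding($enc)";
    
    # Minimal page whose declared charset matches the output encoding (markup assumed).
    print <<"__EOI__";
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=$enc">
    <title>Foo!</title>
    </head>
    <body dir="rtl">
    \N{ARABIC LETTER ALEF}\N{ARABIC LETTER LAM}\N{ARABIC LETTER AIN}\N{ARABIC LETTER REH}\N{ARABIC LETTER BEH}\N{ARABIC LETTER YEH}\N{ARABIC LETTER TEH MARBUTA}
    </body>
    </html>
    __EOI__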
    
    $ perl a.pl UTF-8 > utf8.html
    
    $ perl a.pl Windows-1256 > cp1256.html
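
    The two files produced above contain the same text but different bytes. If your existing data is still in Windows-1256, changing the declared charset is not enough; the bytes themselves have to be re-encoded. Here is a minimal sketch of that step using the core Encode module. The helper name cp1256_to_utf8 is hypothetical, and the sketch assumes the input really is Windows-1256 (it dies on bytes that cannot be decoded).

    ```perl
    use strict;
    use warnings;
    use Encode qw( decode encode );

    # Hypothetical helper: re-encode a byte string from Windows-1256 to UTF-8.
    sub cp1256_to_utf8 {
        my ($bytes) = @_;
        # Turn the raw Windows-1256 bytes into Perl characters; die on bad input.
        my $text = decode('cp1256', $bytes, Encode::FB_CROAK);
        # Turn the characters back into bytes, this time as UTF-8.
        return encode('UTF-8', $text, Encode::FB_CROAK);
    }

    # ARABIC LETTER ALEF (U+0627) is the single byte 0xC7 in Windows-1256
    # and the two bytes 0xD8 0xA7 in UTF-8.
    my $utf8 = cp1256_to_utf8("\xC7");
    print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";   # prints "D8 A7"
    ```

    For a whole file, the same idea would apply to its contents read in raw (binary) mode. ASCII text such as the English parts of the dump comes out unchanged, since both encodings agree on that range.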
    
