PHP: mb_strtoupper not working

Posted by 时光怂恿深爱的人放手 on 2019-12-05 09:14:37

Instead of strtoupper()/mb_strtoupper(), use mb_convert_case(), since upper-case conversion is tricky across different encodings. Also make sure your string really is UTF-8.

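$content = 'Le Courrier de Sáint-Hyácinthe';

mb_internal_encoding('UTF-8');

// Re-encode only if the string is not valid UTF-8, or does not survive
// a UTF-8 -> UTF-32 -> UTF-8 round trip unchanged.
if (!mb_check_encoding($content, 'UTF-8')
    || $content !== mb_convert_encoding(mb_convert_encoding($content, 'UTF-32', 'UTF-8'), 'UTF-8', 'UTF-32')) {
    // Assumption: non-UTF-8 input is ISO-8859-1. Without an explicit source
    // encoding, mb_convert_encoding() falls back to the internal encoding (UTF-8 here).
    $content = mb_convert_encoding($content, 'UTF-8', 'ISO-8859-1');
}

// LE COURRIER DE SÁINT-HYÁCINTHE
echo mb_convert_case($content, MB_CASE_UPPER, "UTF-8");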

Working example: http://3v4l.org/enEfm#v443

See also my comment at the PHP website about the converter: http://www.php.net/manual/function.utf8-encode.php#102382
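The check-then-convert idea above can be wrapped in a small helper. This is only a sketch: the function name upper_utf8() and the ISO-8859-1 fallback are assumptions made for illustration, not part of the original answer, so swap in whatever encoding your data really uses.

```php
<?php
// Hypothetical helper: upper-case a string after making sure it is UTF-8.
// Assumption: anything that is not valid UTF-8 is encoded as $fallback.
function upper_utf8(string $text, string $fallback = 'ISO-8859-1'): string
{
    if (!mb_check_encoding($text, 'UTF-8')) {
        // Name the source encoding explicitly instead of relying on
        // mb_internal_encoding().
        $text = mb_convert_encoding($text, 'UTF-8', $fallback);
    }
    return mb_convert_case($text, MB_CASE_UPPER, 'UTF-8');
}

// "\xE1" is á as a single ISO-8859-1 byte; "\xC3\xA1" is á in UTF-8.
echo upper_utf8("Le Courrier de S\xE1int-Hy\xE1cinthe"), "\n";
echo upper_utf8("Le Courrier de S\xC3\xA1int-Hy\xC3\xA1cinthe"), "\n";
```

Both calls should print the same UTF-8 text on a terminal that expects UTF-8.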

It works for me, but only when the PHP file itself is saved as UTF-8 and the terminal I'm in expects UTF-8. I think what is happening for you is that the file is saved as ISO-8859-1 and your terminal expects ISO-8859-1.

First, mb_detect_encoding doesn't actually work for this string. Even when the PHP file is not UTF-8, it still reports it as UTF-8.
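As a rough illustration of checking the bytes rather than trusting default detection, here is a minimal sketch. It uses escaped bytes so the result does not depend on how the script file is saved. The candidate list passed to mb_detect_encoding() is an assumption chosen for this example.

```php
<?php
// "Sáint" with á stored as the single ISO-8859-1 byte 0xE1.
$latin1 = "S\xE1int";
// The same word with á stored as the UTF-8 byte pair 0xC3 0xA1.
$utf8 = "S\xC3\xA1int";

// mb_check_encoding() validates the bytes against one named encoding.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)

// Strict mode rejects candidates for which the bytes are invalid.
$detected = mb_detect_encoding($latin1, ['UTF-8', 'ISO-8859-1'], true);
var_dump($detected); // "ISO-8859-1"
```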

When you print the lower case string, it prints ISO-8859-1 characters and your terminal displays them just fine. Then when you convert to upper case using UTF-8, it gets mangled.

I created two versions of this file. I saved it from my text editor in ISO-8859-1 as iso-8859-1.php. Then I used iconv to convert the entire file to UTF-8 and saved it as utf-8.php:

iconv -f ISO-8859-1 -t UTF-8 iso-8859-1.php > utf-8.php

I added a line to print the encoding that mb_detect_encoding returns.

$ file iso-8859-1.php 
iso-8859-1.php: PHP script, ISO-8859 text

$ php iso-8859-1.php 
ENCODING: UTF-8
DEBUG1 Le Courrier de S�int-Hy�cinthe
DEBUG2 LE COURRIER DE S?INT-HY?CINTHE

$ file utf-8.php 
utf-8.php: PHP script, UTF-8 Unicode text

$ php utf-8.php 
ENCODING: UTF-8
DEBUG1 Le Courrier de Sáint-Hyácinthe
DEBUG2 LE COURRIER DE SÁINT-HYÁCINTHE

My terminal actually expects UTF-8 text, so when I print ISO-8859-1 text it gets mangled. Everything works correctly when the file is saved as UTF-8 and the terminal expects UTF-8.

Actually, what works here is simply:

<?php
mb_internal_encoding('UTF-8');

$x='Le Courrier de Sáint-Hyácinthe';
echo mb_strtoupper( $x ) . "\n";

This outputs:

LE COURRIER DE SÁINT-HYÁCINTHE

Here it works directly, but in your case you may have to add utf8_encode, which assumes the input is ISO-8859-1:

$x = utf8_encode( 'Le Courrier de Sáint-Hyácinthe' );
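One caveat worth sketching: an ISO-8859-1 to UTF-8 conversion applied to a string that is already UTF-8 double-encodes it. The example below uses mb_convert_encoding() with an explicit source encoding as a stand-in for utf8_encode() (which is deprecated as of PHP 8.2). The variable names are illustrative only.

```php
<?php
// "Sáint" already encoded as UTF-8 (á = 0xC3 0xA1).
$alreadyUtf8 = "S\xC3\xA1int";

// Converting again as if it were ISO-8859-1 turns each byte of á into
// its own character, so the string becomes "SÃ¡int".
$double = mb_convert_encoding($alreadyUtf8, 'UTF-8', 'ISO-8859-1');
echo bin2hex($double), "\n"; // 53c383c2a1696e74

// Guarding the conversion with mb_check_encoding() avoids that.
$safe = mb_check_encoding($alreadyUtf8, 'UTF-8')
    ? $alreadyUtf8
    : mb_convert_encoding($alreadyUtf8, 'UTF-8', 'ISO-8859-1');
echo mb_strtoupper($safe, 'UTF-8'), "\n"; // SÁINT
```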

--

An alternative that works here without the mbstring extension:

<?php
echo strtoupper(str_replace('á', 'Á', 'Le Courrier de Sáint-Hyácinthe'));