How can I detect Hebrew characters, both ISO-8859-8 and UTF-8, in a string using PHP?

本小妞迷上赌 submitted on 2019-11-27 03:35:21

Question


I want to be able to detect, using regular expressions in PHP, whether a string contains Hebrew characters, in both UTF-8 and ISO-8859-8 encodings. Thanks!


Answer 1:


According to the ISO-8859-8 code chart, the byte range E0 to FA is reserved for the Hebrew letters. You can check for those bytes with a character class (matched against raw bytes, so without the u modifier):

[\xE0-\xFA]

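For UTF-8, the Hebrew characters occupy the Unicode code points U+0591 to U+05F4. PHP's PCRE engine does not accept the \u escape, so write each code point as \x{...} and add the u modifier so the subject is decoded as UTF-8:

[\x{0591}-\x{05F4}]

Here's an example of a regex match in PHP:

echo preg_match('/[\x{0591}-\x{05F4}]/u', $string);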
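
As a minimal sketch of the two checks side by side, the snippet below runs a byte-range pattern and a code-point pattern against the same word in each encoding. The sample strings and variable names are illustrative assumptions, not part of the original answer. The byte pattern is only meaningful when the string is known to be ISO-8859-8.

```php
<?php
// Hypothetical sample data: the word "shalom" written as raw bytes in each encoding.
$iso  = "\xF9\xEC\xE5\xED";                 // ISO-8859-8: one byte per letter
$utf8 = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D"; // UTF-8: two bytes per letter

// Byte-oriented match: no u modifier, so \xE0-\xFA is compared against raw bytes.
$isoHit = preg_match('/[\xE0-\xFA]/', $iso);

// Code-point match: the u modifier makes PCRE decode the subject as UTF-8,
// and \x{...} names a Unicode code point.
$utf8Hit = preg_match('/[\x{0591}-\x{05F4}]/u', $utf8);

// The byte pattern does not fire on UTF-8 Hebrew, whose bytes are 0xD6/0xD7 plus a continuation byte.
$crossHit = preg_match('/[\xE0-\xFA]/', $utf8);

echo $isoHit, $utf8Hit, $crossHit, "\n";
```

Note that the byte pattern can still match non-Hebrew UTF-8 text, because E0 to EF are also lead bytes of three-byte UTF-8 sequences, so the encoding should be established before choosing a pattern.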



Answer 2:


If your PHP file is encoded as UTF-8, as it should be when it contains Hebrew text, use the following regex:

$string="אבהג";
echo preg_match("/\p{Hebrew}/u", $string);
// output: 1



Answer 3:


Here's a small function that checks whether the first character of a UTF-8 string is a Hebrew letter:

function IsStringStartsWithHebrew($string)
{
    return (strlen($string) > 1 && // a UTF-8 Hebrew letter takes two bytes
        ord($string[0]) == 215 && // lead byte 0xD7 (binary 110-10111)
        ord($string[1]) >= 144 && ord($string[1]) <= 170 // second byte 0x90-0xAA, i.e. U+05D0 (alef) to U+05EA (tav)
        );
}
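
The same test can be sketched with an anchored regex instead of byte arithmetic. The function name below is hypothetical, and the type declarations assume PHP 7 or later. One difference from the byte check: with the u modifier PCRE validates the whole subject, so a string with invalid UTF-8 anywhere makes this version return false.

```php
<?php
// Hypothetical helper: regex counterpart of the byte comparison above.
function startsWithHebrewLetter(string $string): bool
{
    // ^ anchors at the start of the string; the u modifier decodes UTF-8;
    // U+05D0..U+05EA are the letters alef through tav (bytes D7 90 .. D7 AA).
    // preg_match returns false on invalid UTF-8, which the === 1 test maps to false.
    return preg_match('/^[\x{05D0}-\x{05EA}]/u', $string) === 1;
}

var_dump(startsWithHebrewLetter("\xD7\x90bc")); // alef followed by ASCII
var_dump(startsWithHebrewLetter("abc"));
```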

good luck :)




Answer 4:


First, a single string that mixes two different character sets would be completely useless.

Both the Hebrew characters in ISO-8859-8 and every byte of a multibyte sequence in UTF-8 have a value ord($char) > 127. So what I would do is find all bytes with a value greater than 127, and then check whether they make sense as ISO-8859-8 or whether they make more sense as a UTF-8 sequence...
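
One way this idea could be sketched, under the assumption that UTF-8 and ISO-8859-8 are the only two candidate encodings: look for high bytes first, then test whether the string is valid UTF-8, and fall back to the ISO-8859-8 byte range otherwise. The function name and return values are hypothetical. This is a heuristic, since a short ISO-8859-8 string can occasionally also be a valid UTF-8 sequence.

```php
<?php
// Hypothetical helper, not from the original answers.
// Returns 'utf-8', 'iso-8859-8', or null when no Hebrew is found.
function detectHebrewEncoding(string $s): ?string
{
    // No byte above 127 means pure ASCII: no Hebrew in either encoding.
    if (preg_match('/[\x80-\xFF]/', $s) !== 1) {
        return null;
    }
    // An empty pattern with the u modifier matches only if the subject is valid UTF-8.
    if (preg_match('//u', $s) === 1) {
        return preg_match('/\p{Hebrew}/u', $s) === 1 ? 'utf-8' : null;
    }
    // Assumption: anything that is not valid UTF-8 is treated as ISO-8859-8,
    // where the Hebrew letters occupy bytes E0 to FA.
    return preg_match('/[\xE0-\xFA]/', $s) === 1 ? 'iso-8859-8' : null;
}

var_dump(detectHebrewEncoding("\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D")); // "shalom" as UTF-8 bytes
var_dump(detectHebrewEncoding("\xF9\xEC\xE5\xED"));                 // "shalom" as ISO-8859-8 bytes
```

Checking UTF-8 validity first matters because bytes E0 to EF are also lead bytes of three-byte UTF-8 sequences, so the ISO-8859-8 byte range alone would misfire on other UTF-8 text.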




Answer 5:


function is_hebrew($string)
{
    return preg_match("/\p{Hebrew}/u", $string);
}


Source: https://stackoverflow.com/questions/1694350/how-can-i-detect-hebrew-characters-both-iso8859-8-and-utf8-in-a-string-using-php
