multibyte

How to detect and echo the last vowel in a word?

只愿长相守 submitted on 2019-12-19 05:10:03
Question: $word = "Acrobat" (or Apple, Tea, etc.) How can I detect and echo the last vowel of a given word with PHP? I tried the preg_match function and googled for hours, but couldn't find a proper solution. The string can contain multibyte letters such as ü and ö.

Answer 1: Here's a multibyte-safe version of catching the last vowel in a string. $arr = array( 'Apple','Tea','Strng','queue', 'asartä','nő','ağır','NOËL','gør','æsc' ); /* these are the ones I found in character viewer in Mac so these vowels can be
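A minimal sketch of one way to find the last vowel, not the truncated answer's own code. The function name last_vowel and the vowel set are assumptions made for illustration; extend the set for the languages you need.

```php
<?php
/**
 * Hypothetical helper: returns the last vowel in $word, or null if none.
 */
function last_vowel(string $word): ?string
{
    // Assumed vowel set; the /u flag makes the pattern UTF-8 aware and
    // /i lets it match upper-case forms such as "Ë" as well.
    $vowels = 'aeiouäöüëőøæ';

    if (preg_match_all('/[' . $vowels . ']/iu', $word, $matches) > 0) {
        $found = $matches[0];   // every vowel, in order of appearance
        return end($found);     // the last one
    }
    return null;                // e.g. "Strng" contains no vowel
}

foreach (['Acrobat', 'Tea', 'Strng', 'NOËL'] as $word) {
    echo $word, ': ', last_vowel($word) ?? '(none)', PHP_EOL;
}
```

Collecting all matches and taking the last one keeps the pattern simple; a lookahead-based pattern would also work but is harder to read.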

Why use multibyte string functions in PHP?

≯℡__Kan透↙ submitted on 2019-12-18 15:31:59
Question: At the moment, I don't understand why it is important to use mbstring functions in PHP when dealing with UTF-8. My locale under Linux is already set to UTF-8, so why don't functions like strlen, preg_replace and so on work properly by default?

Answer 1: PHP's standard string functions do not handle multibyte strings, regardless of your operating system's locale. That is why you need to use the multibyte string functions. From the Multibyte String Introduction: When you manipulate
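The difference is easy to see with a short sketch. The sample string is an assumption chosen for illustration, and the mbstring extension is assumed to be loaded.

```php
<?php
// Illustrative string; "ü" (U+00FC) occupies two bytes in UTF-8.
$word = "M\u{00FC}ller";

// strlen() counts bytes, not characters.
$byteCount = strlen($word);              // 7

// mb_strlen() counts characters when told the encoding.
$charCount = mb_strlen($word, 'UTF-8');  // 6

// substr() cuts at byte offsets and can split a character in half;
// mb_substr() cuts at character offsets.
$firstTwoBytes = substr($word, 0, 2);              // "M" plus half of "ü"
$firstTwoChars = mb_substr($word, 0, 2, 'UTF-8');  // "Mü"

echo $byteCount, ' ', $charCount, ' ', $firstTwoChars, PHP_EOL;
```

The byte-oriented functions never look at the locale, which is why setting the system locale to UTF-8 does not change their results.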

Split a sentence into separate words

孤街浪徒 submitted on 2019-12-18 10:55:19
Question: I need to split a Chinese sentence into separate words. The problem with Chinese is that there are no spaces. For example, the sentence may look like: 主楼怎么走 (with spaces it would be: 主楼 怎么 走 ). At the moment I can think of one solution. I have a dictionary with Chinese words (in a database). The script will: try to find the first two characters of the sentence in the database ( 主楼 ), if 主楼 is actually a word and it's in the database the script will try to find the first three characters ( 主楼怎 ).
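The dictionary lookup described in the question can be sketched as a greedy longest-match segmenter. This is a variant, not the asker's exact procedure: it starts from the longest candidate and shrinks, rather than growing from two characters. The in-memory $dictionary array stands in for the database, and segment and $maxWordLength are hypothetical names.

```php
<?php
// Hypothetical in-memory dictionary standing in for the database table.
$dictionary = ['主楼' => true, '怎么' => true, '走' => true];

/**
 * Greedy longest-match segmentation: at each position, take the longest
 * dictionary word that starts there; fall back to a single character.
 */
function segment(string $sentence, array $dictionary, int $maxWordLength = 4): array
{
    $words  = [];
    $length = mb_strlen($sentence, 'UTF-8');
    $pos    = 0;

    while ($pos < $length) {
        // Fallback: a single character, used when no longer word matches.
        $match = mb_substr($sentence, $pos, 1, 'UTF-8');

        // Try the longest candidate first, then shorter ones.
        for ($len = min($maxWordLength, $length - $pos); $len > 1; $len--) {
            $candidate = mb_substr($sentence, $pos, $len, 'UTF-8');
            if (isset($dictionary[$candidate])) {
                $match = $candidate;
                break;
            }
        }

        $words[] = $match;
        $pos += mb_strlen($match, 'UTF-8');
    }

    return $words;
}

echo implode(' ', segment('主楼怎么走', $dictionary)), PHP_EOL; // 主楼 怎么 走
```

Greedy matching is simple but can pick the wrong split when words overlap; real segmenters usually add frequency data or a statistical model on top.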

Detect Chinese (multibyte) character in the string

≡放荡痞女 submitted on 2019-12-18 01:08:16
Question: $str = "This is a string containing 中文 characters. Some more characters - 中华人民共和国 "; How do I detect Chinese characters in this string and print the part that starts with the first Chinese character and ends with "-"? (It would be "中文 characters. Some more characters -".) Thank you!

Answer 1: I've solved this problem using preg_match and regular expressions: $str = "This is a string containing 中文 characters. Some more characters - 中华人民共和国 "; preg_match('/[\x{4e00}-\x{9fa5}]+.*\-/u', $str, $matches);

Answer 2:

How can I tell if a string contains multibyte characters in Javascript?

一曲冷凌霜 submitted on 2019-12-17 15:28:11
Question: Is it possible in JavaScript to detect if a string contains multibyte characters? If so, is it possible to tell which ones? The problem I'm running into is this (apologies if the Unicode char doesn't show up right for you): s = "𝌆"; alert(s.length); // '2' alert(s.charAt(0)); // '��' alert(s.charAt(1)); // '��' Edit for a bit of clarity here (I hope). As I understand it now, all strings in JavaScript are represented as a series of UTF-16 code points, which means that regular characters

Multibyte trim in PHP?

爱⌒轻易说出口 submitted on 2019-12-17 09:29:11
Question: Apparently there's no mb_trim in the mb_* family, so I'm trying to implement one on my own. I recently found this regex in a comment on php.net: /(^\s+)|(\s+$)/u So, I'd implement it in the following way: function multibyte_trim($str) { if (!function_exists("mb_trim") || !extension_loaded("mbstring")) { return preg_replace("/(^\s+)|(\s+$)/u", "", $str); } else { return mb_trim($str); } } The regex seems correct to me, but I'm a complete novice with regular expressions. Will this effectively
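A minimal sketch of a regex-based trim along the same lines. The name utf8_trim is hypothetical, chosen so it cannot clash with any mb_trim that may exist; adding \p{Z} to the character class is an assumption about which characters should count as whitespace.

```php
<?php
/**
 * Hypothetical helper: strips whitespace and Unicode separators
 * from both ends of a UTF-8 string.
 */
function utf8_trim(string $str): string
{
    // \s covers tabs and newlines; \p{Z} covers Unicode separators such as
    // U+00A0 (no-break space) and U+3000 (ideographic space).
    $trimmed = preg_replace('/^[\s\p{Z}]+|[\s\p{Z}]+$/u', '', $str);

    // preg_replace() returns null on error, e.g. for invalid UTF-8 input;
    // in that case the input is returned unchanged.
    return $trimmed ?? $str;
}

echo '[' . utf8_trim("\u{3000} 東京\t\n") . ']', PHP_EOL;
```

The capture groups in the original pattern are not needed for a replacement with an empty string, so the sketch leaves them out.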

How to make json_encode work with multibyte characters?

拈花ヽ惹草 submitted on 2019-12-13 10:28:43
Question: echo '<a title=' .json_encode("按时间先后进行排序") . '>test</a>'; The above will generate something like "\u6309\u65f6\u95f4\u5148\u540e\u8fdb\u884c\u6392\u5e8f" and it's a mess!

Answer 1: No, that's JSON. JSON encoders are free to copy characters as-is (except for double quotes, backslashes, and control characters) or to encode them using the \uxxxx notation. So while the above is not beautiful, it's valid JSON and ensures that the string will be decoded correctly.

Answer 2: The title attribute value is
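Both encodings can be compared side by side. A minimal sketch, assuming PHP 5.4 or later for the JSON_UNESCAPED_UNICODE flag; the use of htmlspecialchars for the attribute is a suggestion of this sketch, not something stated in the answers.

```php
<?php
$title = "按时间先后进行排序";

// Default: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode($title);

// With the flag: characters are copied through as-is.
$unescaped = json_encode($title, JSON_UNESCAPED_UNICODE);

echo $escaped, PHP_EOL;
echo $unescaped, PHP_EOL;

// Both forms decode back to the same string.
var_dump(json_decode($escaped) === json_decode($unescaped));

// For an HTML attribute, HTML escaping rather than JSON encoding is the
// usual step.
echo '<a title="' . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '">test</a>', PHP_EOL;
```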

What is the best way to test whether a given character is uppercase or lowercase in PHP?

和自甴很熟 submitted on 2019-12-12 19:10:58
Question: What is an ideal way to detect whether a character is uppercase or lowercase, regardless of the current locale? Is there a more direct function? Assumptions: internal character encoding is set to UTF-8, the local browser session is en-US,en;q=0.5, and the Multibyte String extension is installed. Do not use ctype_lower or ctype_upper. See the test code below, which should be multibyte compatible. $encodingtype = 'utf8'; $charactervalue = mb_ord($character, $encodingtype); $characterlowercase
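One locale-independent option is to test the character's Unicode category with a regular expression. This is a minimal sketch and not the asker's truncated code; the function name char_case is hypothetical.

```php
<?php
/**
 * Hypothetical helper: classifies a single UTF-8 character as
 * 'upper', 'lower', or 'none' by its Unicode general category.
 */
function char_case(string $char): string
{
    // \p{Lu} = uppercase letter, \p{Ll} = lowercase letter.
    // The /u flag makes the pattern treat $char as UTF-8.
    if (preg_match('/\A\p{Lu}\z/u', $char) === 1) {
        return 'upper';
    }
    if (preg_match('/\A\p{Ll}\z/u', $char) === 1) {
        return 'lower';
    }
    // Digits, punctuation, and caseless letters such as CJK ideographs.
    return 'none';
}

foreach (["\u{00C9}", "\u{00F1}", '5', '中'] as $char) {
    echo $char, ': ', char_case($char), PHP_EOL;
}
```

Because the check relies on Unicode properties rather than the locale, it gives the same answer regardless of the server or browser language settings.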

MySQL WHERE `character` = 'a' is matching a, A, Ã, etc. Why?

落爺英雄遲暮 submitted on 2019-12-12 09:10:41
Question: I have the following query in MySQL: SELECT id FROM unicode WHERE `character` = 'a' The table unicode contains each Unicode character along with an ID (its integer encoding value). Since the collation of the table is set to utf8_unicode_ci, I would have expected the above query to return only 97 (the letter 'a'). Instead, it returns 119 rows containing the IDs of many 'a'-like letters: a A Ã ... It seems to be ignoring both case and the multibyte nature of the characters. Any ideas?

Answer 1: As

Replace string only if it matches the search string exactly (preg_replace multibyte)

六月ゝ 毕业季﹏ submitted on 2019-12-12 03:35:33
Question: I have a problem. I want to replace certain strings only if they are exactly as I typed them. So if there is a string 5 Eur, it should be replaced with e.g. Steam 5 Euro only if it stands alone, and not if the string is something like How are you 5 Eur pls. With my current code this is not possible... I use e.g.: $string = str_replace('Apple Itunes 25 Euro Guthaben Prepaid De', 'Apple iTunes 25 Euro', $string) Because the string here contains 25 Eur, this code is also adding some stuff: $string = str