latin

Latin Regex with symbols

帅比萌擦擦* submitted on 2021-02-05 05:51:27
Question: I need to split a text and keep only words, numbers and hyphenated compound words. I also need to match Latin words with accents, so I used \p{L}, which matches é, ú, ü, ã, and so forth. The example is: String myText = "Some latin text with symbols, ? 987 (A la pointe sud-est de l'île se dresse la cathédrale Notre-Dame qui fut lors de son achèvement en 1330 l'une des plus grandes cathédrales d'occident) : ! @ # $ % ^& * ( ) + - _ #$% " ' : ; > < / \ | , here some is wrong… * + () e -" Pattern pattern =
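The excerpt stops before the asker's pattern, so the following is only a minimal sketch of the kind of tokenizer the question describes, written in Python rather than the question's Java. Python's `re` module has no `\p{L}`, so the class `[^\W_]` (any Unicode letter or digit) stands in for it; the name `TOKEN` and the shortened sample text are assumptions made for illustration.

```python
import re

# Sample text shortened from the question.
my_text = (
    "Some latin text with symbols, ? 987 (A la pointe sud-est de l'île "
    "se dresse la cathédrale Notre-Dame) : ! @ # $ % ^& * ( ) + - _ e -"
)

# [^\W_]  : any Unicode letter or digit (word characters minus "_").
# (?:-[^\W_]+)* : lets a token continue across single internal hyphens,
#                 so "sud-est" stays whole while a lone "-" is skipped.
TOKEN = re.compile(r"[^\W_]+(?:-[^\W_]+)*")

tokens = TOKEN.findall(my_text)
print(tokens)
```

Note that the apostrophe is treated as a separator here, so `l'île` yields `l` and `île`; whether that is desirable depends on the requirements, which the excerpt does not state.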

How to print Latin characters to the C++ console properly on Windows?

夙愿已清 submitted on 2021-02-05 04:48:30
Question: I'm having a problem writing French characters to the console in C++. The string is loaded from a file using std::ifstream and std::getline and then printed to the console using std::cout. Here is the string as it appears in the file: La chaîne qui correspond au code "TEST_CODE" n'a pas été trouvée à l'aide locale "fr". And here is how the string is printed: La cha¯ne qui correspond au code "TEST_CODE" n'a pas ÚtÚ trouvÚe Ó l'aide locale "fr". How can I fix this problem? Answer 1: The issue is
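The answer is cut off, but the garbled output shown in the question is consistent with one specific mismatch: bytes written in a single-byte Western encoding such as Windows-1252 being displayed with the OEM code page 850 that Windows consoles often use. That diagnosis is an inference from the characters in the excerpt, not a quote from the answer. The short Python sketch below reproduces the mojibake under that assumption.

```python
# The string as stored in the file, according to the question.
original = (
    'La chaîne qui correspond au code "TEST_CODE" '
    "n'a pas été trouvée à l'aide locale \"fr\"."
)

# Assumption: the file bytes are Windows-1252 and the console decodes
# them as code page 850. Encoding with one and decoding with the other
# simulates what the console displays.
garbled = original.encode("cp1252").decode("cp850")
print(garbled)

# Reversing the two steps recovers the intended text.
recovered = garbled.encode("cp850").decode("cp1252")
print(recovered)
```

If the simulated output matches what the console shows, the fix is to make the two sides agree on one encoding rather than to change the string itself.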

Unicode letters with more than 1 alphabetic latin character?

我只是一个虾纸丫 submitted on 2020-02-06 18:59:31
Question: I'm not really sure how to express it, but I'm searching for Unicode letters that look like more than one Latin letter. I have found these in Word so far: DZ Dz dz NJ Lj LJ Nj nj Any others? Answer 1: Sorry about the formatting; it's hard to map long characters to a monospace font's letter widths. A picture would look better, but then it could not be copied or zoomed infinitely. Digraphs +-------------+----------+-----------------------+-------------------------+ | Two Glyphs | Digraph |
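One way to look for further candidates, sketched here in Python under stated assumptions, is to use Unicode compatibility normalization: the eight characters listed in the question each expand to two letters under NFKC. The scan below is a heuristic of my own (the variable names and the "LATIN" name filter are illustrative choices, not the answer's method), and its results depend on the Unicode database version bundled with the interpreter.

```python
import unicodedata

# The eight characters quoted in the question, written as escapes so
# each one is unambiguously a single code point.
digraphs = ["\u01f1", "\u01f2", "\u01f3", "\u01ca",
            "\u01c8", "\u01c7", "\u01cb", "\u01cc"]

# NFKC replaces each compatibility character with its plain-letter expansion.
expansions = {ch: unicodedata.normalize("NFKC", ch) for ch in digraphs}
for ch, expanded in expansions.items():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {expanded!r}")

# Heuristic scan of the Basic Multilingual Plane: Latin letters whose
# NFKC form is more than one character long and consists only of letters.
candidates = []
for cp in range(0x80, 0x10000):
    ch = chr(cp)
    if not unicodedata.category(ch).startswith("L"):
        continue
    if "LATIN" not in unicodedata.name(ch, ""):
        continue
    expanded = unicodedata.normalize("NFKC", ch)
    if len(expanded) > 1 and expanded.isalpha():
        candidates.append(ch)

print(len(candidates), "candidates found")
```

The scan also picks up ligatures such as U+0133 (ij), so the list is a starting point to review by hand rather than a definitive answer.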

Non latin symbols in url, php

廉价感情. submitted on 2020-01-05 08:36:55
Question: If I use a disallowed character in a URL, for example a space: <a href="pa ge.php">link</a> and click this link, the browser address bar shows mysite.com/pa%20ge which is fine. If I now use Georgian (or, for example, Russian) alphabet symbols: <a href="აბცდ.php">link</a> the browser address bar shows mysite/აბცდ.php That is, these non-Latin alphabet symbols are not changed; they are presented in the URL in their original form. Question: why? Are non-Latin alphabet symbols also allowed in a URL? Answer 1: No, a URL
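The answer is truncated, but the general mechanism is well established: a URI as defined by RFC 3986 is limited to ASCII, so non-ASCII characters are sent as percent-encoded UTF-8 bytes, and browsers typically decode them again for display in the address bar. The Python sketch below illustrates the encoding step for both examples from the question; it is an illustration of the rule, not a reconstruction of the answer's text.

```python
from urllib.parse import quote, unquote

# The two link targets from the question.
paths = ["pa ge.php", "აბცდ.php"]

# quote() encodes the text as UTF-8 and writes every byte outside the
# safe ASCII set as %XX, which is what goes over the wire.
encoded = {path: quote(path) for path in paths}

for path, wire_form in encoded.items():
    print(f"{path!r} -> {wire_form!r}")
    # unquote() reverses the step, which is roughly what the address
    # bar does when it shows the readable form.
    assert unquote(wire_form) == path
```

So the Georgian characters are not exempt from encoding; the address bar simply shows the decoded form of an ASCII-only request.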

Concat arabic and english string with string.Format()

你说的曾经没有我的故事 submitted on 2019-12-24 07:13:23
Question: I'm having trouble concatenating two strings. return string.Format("{0}{1}{2}", IdWithSubType, ExtraInfo.Any(info => info.InfoType == UniExtraInfoType.Alias) ? string.Format(" ({0})", string.Join(",", ExtraInfo.First(info => info.InfoType == UniExtraInfoType.Alias).Info)) : "", Context != null ? string.Format(" ({0})", Context.IdWithSubType) : ""); It works when IdWithSubType, ExtraInfo and Context contain Latin or Cyrillic symbols, but IdWithSubType can be Arabic, and then the concatenation comes out wrong. e.g

Lowercase of Unicode character

对着背影说爱祢 submitted on 2019-12-22 05:20:14
Question: I am working on a C++ project that needs to read data from Unicode text. My problem is that I can't lowercase some Unicode characters. I use wchar_t to store the Unicode characters read from a Unicode file, and then I use _wcslwr to lowercase a wchar_t string. Many characters are still not lowercased, such as: Đ Â Ă Ê Ô Ơ Ư Ấ Ắ Ế Ố Ớ Ứ Ầ Ằ Ề Ồ Ờ Ừ Ậ Ặ Ệ Ộ Ợ Ự whose lowercase forms are: đ â ă ê ô ơ ư ấ ắ ế ố ớ ứ ầ ằ ề ồ ờ ừ ậ ặ ệ ộ ợ ự I have tried tolower and it still does not work. Answer 1: If you call only

How to compare and output latin characters?

柔情痞子 submitted on 2019-12-13 20:34:22
Question: I have an array of countries, one of which starts with the Latin character "Å": $country["af"] = "Afghanistan"; $country["ax"] = "Åland Islands"; $country["al"] = "Albania"; While looping through this array and comparing the first character of each country name, I cannot match the Latin character. foreach($country as $cc => $name) { if($name[0] == "Å") { echo "matched"; } else { echo $name[0]; } } The result I get is: A�A Why does the Latin character Å become � and how do I perform a
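The output `A�A` is what one would expect if the PHP source is saved as UTF-8: `$name[0]` indexes a single byte, and Å occupies two bytes in UTF-8, so the lone first byte is not a valid character. The Python sketch below imitates that byte-level indexing and contrasts it with character-level indexing; the helper name `first_byte` is made up for illustration, and the UTF-8 assumption is mine since the excerpt does not state the file encoding. In PHP itself, the multibyte function `mb_substr` is the usual character-aware counterpart.

```python
countries = {
    "af": "Afghanistan",
    "ax": "\u00c5land Islands",  # "Åland Islands", Å as the code point U+00C5
    "al": "Albania",
}

def first_byte(name: str) -> str:
    """Mimic byte indexing: take one UTF-8 byte and try to show it as text."""
    return name.encode("utf-8")[:1].decode("utf-8", errors="replace")

# Byte-level view: Å is the two bytes C3 85, and C3 alone is invalid,
# so it is shown as the replacement character U+FFFD.
by_byte = "".join(first_byte(name) for name in countries.values())
print(by_byte)

# Character-level view: indexing a Python str yields whole code points.
by_char = "".join(name[0] for name in countries.values())
print(by_char)

# The comparison from the question succeeds once characters are compared.
matches = [code for code, name in countries.items() if name[0] == "\u00c5"]
print(matches)
```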

Pig ORDER command fails

风流意气都作罢 submitted on 2019-12-13 07:45:47
Question: I am trying to analyze an Apache log; the goal is to find all user agents and their usage percentages. The following program works up to the line where result contains each user agent, its count and its percentage. It fails at the last line, which tries to order the results by most used. Could someone help? logs = LOAD '$LOGS' USING ApacheCombinedLogLoader AS (remoteHost, hyphen, user, time, method, uri, protocol, statusCode, responseSize, referer, userAgent); uarows = FOREACH logs