french

Converting from bytes to French text in Python

牧云@^-^@ posted on 2021-02-02 02:18:33
Question: I am cleaning the monolingual French corpus of Europarl (http://data.statmt.org/wmt19/translation-task/fr-de/monolingual/europarl-v7.fr.gz). The raw data comes as a .gz file, which I downloaded with wget. I want to extract the text and see what it looks like before processing the corpus further. Using the following code to extract the text from the gzip file, I obtained data of type bytes:

    with gzip.open(file_path, 'rb') as f_in:
        print('type(f_in)=', type(f_in))
        text = f_in.read()
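
A minimal sketch of two ways to get str instead of bytes from a gzip file, assuming the corpus is UTF-8 encoded (the excerpt does not confirm this). The in-memory sample below is a hypothetical stand-in for europarl-v7.fr.gz so that the snippet runs without the download; with the real file, pass file_path to gzip.open instead of the BytesIO object.

```python
import gzip
import io

# Hypothetical sample standing in for the real corpus file.
sample = "Reprise de la session\nJe déclare reprise la session du Parlement européen.\n"
compressed = gzip.compress(sample.encode("utf-8"))

# Option 1: read the raw bytes, then decode them explicitly.
# The "utf-8" encoding is an assumption about the corpus.
with gzip.open(io.BytesIO(compressed), "rb") as f_in:
    raw = f_in.read()
text = raw.decode("utf-8")

# Option 2: open in text mode ("rt") so decoding happens while reading.
# Iterating line by line avoids loading a large corpus into memory at once.
with gzip.open(io.BytesIO(compressed), "rt", encoding="utf-8") as f_in:
    lines = [line.rstrip("\n") for line in f_in]

print(type(raw), type(text))
print(lines[1])
```

If decoding raises UnicodeDecodeError, the file is probably in another encoding (latin-1 is a common candidate for older French text), and the encoding argument would need to change accordingly.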

Python: replace French letters with English

拜拜、爱过 posted on 2020-08-05 08:05:22
Question: I would like to replace all the French letters within words with their ASCII equivalents.

    letters = [['é', 'à'], ['è', 'ù'], ['â', 'ê'], ['î', 'ô'], ['û', 'ç']]
    for x in letters:
        for a in x:
            a = a.replace('é', 'e')
            a = a.replace('à', 'a')
            a = a.replace('è', 'e')
            a = a.replace('ù', 'u')
            a = a.replace('â', 'a')
            a = a.replace('ê', 'e')
            a = a.replace('î', 'i')
            a = a.replace('ô', 'o')
            a = a.replace('û', 'u')
            a = a.replace('ç', 'c')
    print letters[0][0]

This code prints é, however. How can I make this
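
The loop in the question rebinds the local name a on each replace call, and since strings are immutable the nested list is never changed, which is why é is still printed. One possible approach, sketched here in Python 3, is to decompose each character with unicodedata and drop the combining marks, then build a new list. The helper name strip_accents is hypothetical, not something from the question.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining accent marks removed (e.g. 'é' -> 'e')."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


letters = [['é', 'à'], ['è', 'ù'], ['â', 'ê'], ['î', 'ô'], ['û', 'ç']]

# Build a new nested list instead of rebinding the loop variable.
ascii_letters = [[strip_accents(a) for a in row] for row in letters]

print(ascii_letters[0][0])
```

Note that this only handles letters that decompose into a base letter plus a mark; ligatures such as œ have no such decomposition and would need an explicit mapping.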

French keyboard on macOS, altering the behavior of the tilde ~ key

梦想与她 posted on 2020-06-17 16:20:44
Question: Like most French users, when I want to go to my home directory in the terminal, I have to type cd ~. However, the keyboard requires me to press Option + n and then space, to disambiguate between ñ and ~, for example. Is there a way to override this behavior, as I almost never use the tilde the way Spanish speakers do? Karabiner looked promising, but it won't let you define a custom mapping; it requires you to choose from a set of predefined ones online.
Answer 1: Ukelele can create a

Whitespace before some punctuation characters in French: is there a CSS way to avoid lines breaking?

*爱你&永不变心* posted on 2020-01-03 13:33:38
Question: For example, in the sentence "Comment allez-vous ?", the question mark and the last word of the sentence are separated by a whitespace. When French text is set in a column, you will often get something like this:

    Elle zigzague pour empiéter sur des impostures
    ? Jacqueline porte chance.

The line break falls between the last word of the sentence and the question mark, which is not desirable. Elle zigzague pour empiéter sur des impostures ? Jacqueline porte chance. Is there a way to
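
The question asks for a CSS solution, and the excerpt is cut off before any answer. As a different technique, a sketch of a preprocessing step is shown here: replacing the ordinary space before high punctuation with a non-breaking space, which browsers do not break lines at. The function name protect_punctuation and the choice of U+202F (narrow no-break space) over U+00A0 are assumptions for illustration.

```python
import re

# U+202F NARROW NO-BREAK SPACE; U+00A0 would also prevent the line break.
NNBSP = "\u202f"


def protect_punctuation(text: str) -> str:
    """Replace a plain space before ? ! ; : with a non-breaking space."""
    return re.sub(r" ([?!;:])", NNBSP + r"\1", text)


sentence = "Elle zigzague pour empiéter sur des impostures ? Jacqueline porte chance."
protected = protect_punctuation(sentence)

# repr() makes the otherwise invisible replacement character visible.
print(repr(protected))
```

This would have to run on the text before it is rendered as HTML, and it also rewrites spaces before colons in places such as timestamps, so it may need narrowing for real content.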

Parsing error for French locale with SimpleDateFormat(string,locale)

放肆的年华 posted on 2019-12-24 18:39:27
Question: I have a piece of code like this on my Java side:

    private static DateFormat getHourFormatter(){
        //DateFormatSymbols dateFormatSymbols = new DateFormatSymbols(_locale);
        Locale locale = Locale.FRENCH; //locale : "fr"
        DateFormat hourFormatter = new SimpleDateFormat( "hh:mm a",locale); //hourFormatter: simpleDateFormat@103068 locale: "fr"
        hourFormatter.setTimeZone( TimeZone.getTimeZone("GMT") );
        return hourFormatter; //hourFormatter: SimpleDateFormat@103068
    }
    protected static boolean