cjk

Chinese queries result in unexpectedly high recall

陌路散爱 submitted on 2021-02-11 12:46:41
Question: We experience unexpectedly high recall for Chinese queries. I have managed to reproduce it in a minimal use case, using a simple data model with only 2 properties. REPRODUCE: Define a property DescriptionZhCn for Chinese product descriptions, using the zh-Hans.microsoft analyzer. Populate two records with the following values in DescriptionZhCn: Contoso 减振接杆 and Contoso 缩径接柄. Search using options searchMode=all, queryType=full, searchFields=DescriptionZhCn, api-version=2019-05-06 with the following values in

How to set a “backup” font

孤街浪徒 submitted on 2021-02-10 15:29:10
Question: I'm using Java 10 and Swing. I'm using the DejaVu Sans Mono font because I think it looks nice. However, it has very poor support for CJK characters, so to maximize coverage I thought of using Noto Sans CJK as a backup. Although I could use Noto Sans for Latin characters too, I don't much like its Latin glyphs. This seems like a trivial question, but I can't find out how to do it. Even if you can't answer, I'd appreciate pointers. TL;DR: My Problem: I can't display

Japanese Character Encoding in Java

烈酒焚心 submitted on 2021-02-07 14:15:04
Question: Here's my problem. I'm using Java with Apache POI to read an Excel (.xls or .xlsx) file and display its contents. There are some Japanese characters in the spreadsheet, and every one of them comes out as "???" in my output. I tried Shift-JIS, UTF-8 and many other encodings, but none of them works. Here's my encoding code:

    public String encoding(String str) throws UnsupportedEncodingException {
        String Encoding = "Shift_JIS";
        return this.changeCharset(str, Encoding);
    }
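
One common way to end up with "???" is a lossy conversion somewhere in the pipeline: once a character has been replaced by "?", no later charset change can bring it back. The following is a minimal Python sketch of that effect only. It does not use Apache POI and is not a diagnosis of the code above; the sample string and variable names are assumptions chosen for illustration.

```python
# Illustrative sample text (an assumption, not taken from the question).
text = "日本語"

# Encoding to a charset that cannot represent the characters, with
# replacement enabled, yields one "?" per unmappable character.
lossy = text.encode("ascii", errors="replace")
print(lossy)

# Shift_JIS and UTF-8 can both represent this text, so a clean round trip
# through either of them is lossless.
for codec in ("shift_jis", "utf-8"):
    assert text.encode(codec).decode(codec) == text

# After the lossy step, re-decoding with another charset only returns
# the question marks; the original characters are gone.
recovered = lossy.decode("shift_jis")
print(recovered)
```

If this is the cause, the fix belongs at the point where the text is first read or written, not in a later re-encoding step.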

Can I get Console to show Chinese?

孤者浪人 submitted on 2021-02-07 11:54:53
Question: I've always wondered whether it is possible to show UTF-8 or UTF-16 Chinese text in a Console window, e.g., Console.WriteLine(chinese). For the time being, it shows up as ???. Is it possible to start a Console session that supports Chinese characters?

Answer 1: urxvt, the Unicode rxvt, is an X Window "console" that will show Chinese characters. Assuming you're using Windows, it can run under Cygwin or coLinux. Also see "Unicode characters in Windows command line - how?". I haven't yet figured out

Using Chinese to build a dictionary in Python

我是研究僧i submitted on 2021-02-04 13:53:29
Question: This is my first time here, and I am new to the world of Python. I am also studying Chinese, and I wanted to create a program to review Chinese vocabulary using a dictionary. Here is the code that I normally use:

    #!/usr/bin/python
    # -*- coding:utf-8-*-
    dictionary = {"Hello" : "你好"} # Simple example to save time
    print(dictionary)

The results I keep getting are something like:

    {'hello': '\xe4\xbd\xa0\xe5\xa5\xbd'}

I have also tried adding a "u" to the beginning of the string with the
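
Escaped output of this kind is what Python 2 shows when a whole container is printed: the container prints the repr of each byte string, so the UTF-8 bytes appear as escapes even though the data itself is intact. The following is a minimal sketch in Python 3 (an assumption; the question appears to use Python 2) showing the bytes behind the string and two ways to get readable output. The variable names `utf8_bytes` and `readable` are illustrative.

```python
import json

dictionary = {"Hello": "你好"}

# The UTF-8 bytes behind the string; their repr is the escaped form
# seen in the question's output.
utf8_bytes = dictionary["Hello"].encode("utf-8")
print(repr(utf8_bytes))

# Printing the value itself, rather than the whole container, shows the
# characters.
print(dictionary["Hello"])

# To display the whole mapping readably, serialize it without ASCII escaping.
readable = json.dumps(dictionary, ensure_ascii=False)
print(readable)
```

In other words, the stored data is correct; only the container's debugging representation is escaped.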

How to split Japanese text?

限于喜欢 submitted on 2021-01-27 17:16:25
Question: What is the best way to split Japanese text using Java? For example, for the text below:

    こんにちは。私の名前はオバマです。私はアメリカに行く。

I need the following output:

    こんにちは
    私の名前はオバマです
    私はアメリカに行く

Is it possible using Kuromoji?

Answer 1: You can use java.text.BreakIterator.

    String TEXT = "こんにちは。私の名前はオバマです。私はアメリカに行く。";
    BreakIterator boundary = BreakIterator.getSentenceInstance(Locale.JAPAN);
    boundary.setText(TEXT);
    int start = boundary.first();
    for (int end = boundary.next(); end != BreakIterator.DONE; start = end,
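
When every sentence ends with the ideographic full stop, as in this question, a plain split on sentence-ending punctuation may be enough. The following is a minimal Python sketch of that simpler technique; it is not the BreakIterator approach from the answer and does no morphological analysis as Kuromoji would. The function name `split_sentences` and the choice of punctuation marks are assumptions.

```python
import re


def split_sentences(text):
    """Split Japanese text on sentence-ending punctuation, dropping the marks."""
    # Ideographic full stop plus the full-width exclamation and question marks.
    parts = re.split(r"[。！？]", text)
    # Discard the empty piece left after the final punctuation mark.
    return [part.strip() for part in parts if part.strip()]


text = "こんにちは。私の名前はオバマです。私はアメリカに行く。"
sentences = split_sentences(text)
for sentence in sentences:
    print(sentence)
```

This sketch will mis-split text where these marks appear inside quotations, which is where a boundary-aware iterator is the better choice.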

Convert Japanese wstring to std::string

坚强是说给别人听的谎言 submitted on 2020-06-29 06:44:19
Question: Can anyone suggest a good method to convert a Japanese std::wstring to std::string? I used the code below. Japanese strings are not converted properly on an English OS.

    std::string WstringTostring(std::wstring str) {
        size_t size = 0;
        _locale_t lc = _create_locale(LC_ALL, "ja.JP.utf8");
        errno_t err = _wcstombs_s_l(&size, NULL, 0, &str[0], _TRUNCATE, lc);
        std::string ret = std::string(size, 0);
        err = _wcstombs_s_l(&size, &ret[0], size, &str[0], _TRUNCATE, lc);
        _free_locale(lc);
        ret.resize

Visual Basic for MS Word code not working for Japanese

纵然是瞬间 submitted on 2020-04-30 06:35:11
Question: I'm using the following VB code (authored by macropod, see this Stack Overflow question) inside MS Word (Word for Mac v16.16.21) to mark errors and insert the first spell checker suggestion inside a document:

    Sub SpellCheck()
        Dim Rng As Range, oSuggestions As Variant
        For Each Rng In ActiveDocument.Range.SpellingErrors
            With Rng
                If .GetSpellingSuggestions.Count > 0 Then
                    Set oSuggestions = .GetSpellingSuggestions
                    .Text = "[" & .Text & "][" & oSuggestions(1) & "]"
                Else
                    .Text = "[" & .Text & "][]"