How to determine the language (English, Chinese, …) of a given string in Oracle?

再見小時候 2021-02-06 00:46

How can I determine the language (English, Chinese, ...) of a given string (a table column value) in Oracle, in a multi-language environment?

4 answers

旧时难觅i (original poster)

2021-02-06 01:33

A possible solution could be:

1) Maintain a dictionary.txt file for each language you expect.

2) When parsing the input string, use something like a Scanner to read each word and look it up in the most likely dictionary. Continue until the number of matches or misses (for example, a certain percentage) is enough to decide whether the string is in that language.

3) If it is not, check the next most likely dictionary, and so on, until you find the answer or conclude that the language cannot be determined.

For example, keep englishDict.txt, spanishDict.txt, and frenchDict.txt, and first check whether the first 100 words exist in englishDict.txt. If a reasonable number match (say, 70 out of 100), you can reasonably assume the string is in English; otherwise, check the next file. Alternatively, you could check every dictionary and select the language with the most matches.
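The dictionary lookup described above could be sketched roughly as follows. This is only an illustration in Python, run outside the database; the tiny in-memory word sets, the function names `load_dictionary` and `detect_language`, and the parameters `sample_size` and `threshold` are all assumptions made for the example, with the 100-word sample and the 70% cutoff taken from the numbers mentioned above.

```python
import re

# Hypothetical tiny word lists, ordered from most to least expected language.
# In practice each set would be loaded from a file such as englishDict.txt.
DICTIONARIES = [
    ("english", {"the", "cat", "sat", "on", "mat", "is", "a", "dog"}),
    ("spanish", {"el", "gato", "está", "en", "la", "casa", "un", "perro"}),
    ("french", {"le", "chat", "est", "sur", "la", "table", "un", "chien"}),
]


def load_dictionary(path):
    """Read one word per line from a dictionary file into a lowercase set."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}


def detect_language(text, dictionaries=DICTIONARIES, sample_size=100, threshold=0.7):
    """Return the first language whose dictionary covers enough of the sampled words.

    Returns None when no dictionary reaches the threshold.
    """
    # Split into runs of letters and keep only the first sample_size words.
    words = re.findall(r"[^\W\d_]+", text.lower())[:sample_size]
    if not words:
        return None
    for language, vocabulary in dictionaries:
        hits = sum(1 for word in words if word in vocabulary)
        if hits / len(words) >= threshold:
            return language
    return None
```

Calling `detect_language("El gato está en la casa")` walks the dictionaries in order and stops at the first one that reaches the threshold. Note that this word-based split treats an unbroken run of Chinese characters as a single token, so a language written without spaces would need a different tokenization step.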

Alternatively, you could search first for words that are very common in each language, such as articles, pronouns, and common verbs. Whatever the solution, I suspect you will have to perform some number of searches and comparisons to find the answer.
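A minimal sketch of the common-word variant, again in Python and under stated assumptions: the short word lists in `COMMON_WORDS` and the function name `guess_by_common_words` are hypothetical, and the sketch picks the language with the most matches rather than using a fixed threshold.

```python
import re

# Hypothetical, deliberately short lists of very common words per language.
COMMON_WORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "it", "you", "that"},
    "spanish": {"el", "la", "los", "y", "es", "de", "que", "en", "un"},
    "french": {"le", "la", "les", "et", "est", "de", "que", "dans", "un"},
}


def guess_by_common_words(text, common_words=COMMON_WORDS):
    """Return the language with the most common-word hits, or None if unclear."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Count how many words of the text appear in each language's list.
    scores = {
        language: sum(1 for word in words if word in vocabulary)
        for language, vocabulary in common_words.items()
    }
    best = max(scores, key=scores.get)
    # No hits at all, or a tie for first place, means the language is undetermined.
    if scores[best] == 0 or list(scores.values()).count(scores[best]) > 1:
        return None
    return best
```

Because some words are shared between languages ("la" and "de" appear in both the Spanish and French lists here), short strings can tie, in which case the sketch returns None instead of guessing.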
