linguistics

How can I correctly prefix a word with “a” and “an”?

依然范特西╮ submitted on 2019-11-27 02:57:00
I have a .NET application where, given a noun, I want it to correctly prefix that word with "a" or "an". How would I do that? Before you think the answer is to simply check whether the first letter is a vowel, consider phrases like "an honest mistake" and "a used car".

Download Wikipedia. Unzip it and write a quick filter program that spits out only article text (the download is generally in XML format, along with non-article metadata too). Find all instances of a(n)... and make an index on the following word and all of its prefixes (you can use a simple suffix trie for this). This should be case sensitive,
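The indexing idea in the answer can be sketched roughly as follows. This is a minimal Python illustration, not the answerer's implementation: the names `build_index`, `choose_article`, `MAX_PREFIX_LEN`, and the tiny `sample_pairs` list are all hypothetical, and a plain dictionary of prefixes stands in for the trie. A real index would be built from (article, following word) pairs extracted from a large corpus such as the Wikipedia dump.

```python
from collections import defaultdict

MAX_PREFIX_LEN = 15  # hypothetical cap on how long an indexed prefix may be


def build_index(pairs, max_len=MAX_PREFIX_LEN):
    """For every prefix of each following word, count how often 'a' vs 'an' preceded it."""
    index = defaultdict(lambda: {"a": 0, "an": 0})
    for article, word in pairs:
        for end in range(1, min(len(word), max_len) + 1):
            index[word[:end]][article] += 1
    return index


def choose_article(word, index, max_len=MAX_PREFIX_LEN):
    """Use the longest indexed prefix of `word` whose counts favour one article."""
    for end in range(min(len(word), max_len), 0, -1):
        counts = index.get(word[:end])
        if counts and counts["a"] != counts["an"]:
            return "an" if counts["an"] > counts["a"] else "a"
    # No usable prefix in the index: fall back to the naive first-letter check.
    return "an" if word[:1].lower() in set("aeiou") else "a"


# Hypothetical stand-in for pairs extracted from a large corpus.
sample_pairs = [
    ("an", "honest"), ("an", "hour"), ("a", "used"), ("a", "house"),
    ("a", "university"), ("an", "umbrella"), ("an", "apple"), ("a", "car"),
]
index = build_index(sample_pairs)

for noun in ["honesty", "user", "unicorn", "umbrella"]:
    print(choose_article(noun, index), noun)
```

With so little data, short shared prefixes can mislead (for example, "ho" is dominated by "honest" and "hour" here), which is why the approach depends on a large corpus.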

How do I determine if a random string sounds like English?

筅森魡賤 submitted on 2019-11-27 02:11:42
Question: I have an algorithm that generates strings based on a list of input words. How do I keep only the strings that sound like English words? I.e., discard RDLO while keeping LORD. EDIT: To clarify, they do not need to be actual words in the dictionary. They just need to sound like English. For example, KEAL would be accepted. Answer 1: You can build a Markov chain from a huge English text. Afterwards you can feed words into the Markov chain and check how high the probability is that the word is
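One way to read the Markov-chain suggestion is as a character-bigram model: count which letter follows which in known English text, then score a candidate string by the average log-probability of its transitions. The sketch below is an illustration under that assumption. The function names, the smoothing constant `alpha`, the `^`/`$` boundary markers, and the tiny `training_words` list are all hypothetical, and a usable model would be trained on a large English text and paired with a threshold chosen empirically.

```python
import math
from collections import defaultdict


def train_bigrams(words):
    """Count character-to-character transitions, with ^ and $ marking word boundaries."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        padded = "^" + word.lower() + "$"
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    return counts


def avg_log_prob(word, counts, alphabet_size=27, alpha=0.01):
    """Average log-probability per transition, with additive smoothing for unseen pairs."""
    padded = "^" + word.lower() + "$"
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        row = counts.get(prev, {})
        seen = row.get(cur, 0)
        row_total = sum(row.values())
        total += math.log((seen + alpha) / (row_total + alpha * alphabet_size))
    return total / (len(padded) - 1)


# Hypothetical, far-too-small training list; a real model needs a large text.
training_words = [
    "lord", "word", "lore", "order", "real", "deal", "keel",
    "lean", "old", "road", "load", "read", "dear", "lead",
]
model = train_bigrams(training_words)

for candidate in ["LORD", "KEAL", "RDLO"]:
    print(candidate, round(avg_log_prob(candidate, model), 2))
```

Averaging over the number of transitions keeps long strings from being penalised just for their length, so one cut-off can be applied to candidates of different sizes.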

LSA - Latent Semantic Analysis - How to code it in PHP?

狂风中的少年 submitted on 2019-11-26 22:46:57
Question: I would like to implement Latent Semantic Analysis (LSA) in PHP in order to find topics/tags for texts. Here is what I think I have to do. Is this correct? How can I code it in PHP? How do I determine which words to choose? I don't want to use any external libraries. I already have an implementation of the Singular Value Decomposition (SVD). Extract all words from the given text. Weight the words/phrases, e.g. with tf–idf. If weighting is too complex, just take the number of occurrences.
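The first two steps listed in the question (extracting words and weighting them with tf–idf) can be sketched without external libraries. The example below uses Python rather than PHP purely for illustration; `tokenize`, `tf_idf`, and the three sample documents are hypothetical, and the weighting shown is one common tf–idf variant (relative term frequency times the log of inverse document frequency), not necessarily the one the asker had in mind.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of letters (and apostrophes) as words."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample documents.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang",
]
weights = tf_idf(docs)

# Terms that occur in every document get weight 0; rarer terms rank higher.
for term, weight in sorted(weights[0].items(), key=lambda item: -item[1]):
    print(f"{term}: {weight:.3f}")
```

The resulting per-document weights would form the columns of the term-document matrix that the SVD step operates on.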

Translating human languages in Python [closed]

我是研究僧i submitted on 2019-11-26 18:22:03
Question: Is there a Python module for translating texts from one human language to another? I'm planning to work with texts that are to be pre- and post-processed with Python scripts. What other Python-integrated approaches can be used? Answer 1: If you're looking to actually translate a string of text between two languages, say from English "Hello" to Spanish "Hola", you might want to look into the Google Language API. Another alternative due to recent deprecation of the free version of Google's API