text-processing | 易学教程

Algorithm for Negating Sentences

Question: I was wondering whether anyone is familiar with any attempts at algorithmic sentence negation. For example, given a sentence like "This book is good", produce any number of alternative sentences meaning the opposite, such as "This book is not good" or even "This book is bad". Obviously, accomplishing this with a high degree of accuracy is probably beyond the scope of current NLP, but I'm sure there has been some work on the subject. If anybody knows of any such work, could you point me to some papers?
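To make the simplest case concrete, here is a minimal rule-based sketch, not a published algorithm: it inserts "not" after the first form of "to be" or modal verb it finds. The function name `negate` and the word list `AUXILIARIES` are hypothetical, and the approach ignores antonyms ("bad"), contractions, quantifiers, and sentences that are already negated.

```python
# Hypothetical word list: forms of "to be" plus common modal verbs.
AUXILIARIES = {
    "is", "are", "was", "were", "am",
    "can", "could", "will", "would", "should", "must",
}


def negate(sentence):
    """Insert "not" after the first auxiliary; return None if none is found."""
    words = sentence.split()
    for i, word in enumerate(words):
        if word.lower() in AUXILIARIES:
            return " ".join(words[: i + 1] + ["not"] + words[i + 1 :])
    return None  # no auxiliary: this naive rule does not apply


print(negate("This book is good"))
```

Sentences without an auxiliary (for example "Birds sing") would need do-support ("Birds do not sing"), which this sketch deliberately leaves out.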

Identifying verb tenses in python

Question: How can I use Python and NLTK to identify whether a sentence refers to the past, present, or future? Can I do this using only POS tagging? That seems somewhat inaccurate; it seems to me that I need to consider the sentence context and not only the individual words. Any suggestions for another library that can do this?

Answer 1: It won't be too hard to do this yourself. This table should help you identify the different verb tenses, and handling them will just be a matter of parsing the result of nltk.pos_tag(string)
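The idea in Answer 1 can be sketched as follows. This is a rough heuristic under stated assumptions, not the answer's own code: the input is a list of (word, tag) pairs in the form nltk.pos_tag returns, using Penn Treebank tags (MD for modals, VBD for past tense, VBP and VBZ for present tense). The function name `guess_tense` and the priority order (future, then past, then present) are hypothetical choices. The tags are written by hand here so the example runs without NLTK or its tagger data.

```python
def guess_tense(tagged):
    """Guess a coarse tense from (word, Penn Treebank tag) pairs."""
    # A modal such as "will" or "shall" is treated as a future marker.
    for word, tag in tagged:
        if tag == "MD" and word.lower() in {"will", "shall", "'ll"}:
            return "future"
    tags = {tag for _, tag in tagged}
    if "VBD" in tags:            # past-tense verb
        return "past"
    if tags & {"VBP", "VBZ"}:    # present-tense verb forms
        return "present"
    return "unknown"


# Hand-tagged input in the same shape that nltk.pos_tag produces.
print(guess_tense([("I", "PRP"), ("will", "MD"), ("go", "VB")]))
```

As the question suspects, tags alone miss context: "I am leaving tomorrow" is tagged as present tense even though it refers to the future.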

How do I read information from text files?

Question: I have hundreds of text files with the following information in each file:

    *****Auto-Corelation Results******
    1 .09 -.19 .18 non-Significant
    *****STATISTICS FOR MANN-KENDELL TEST******
    S= 609 VAR(S)= 162409.70 Z= 1.51 Random : No trend at 95%
    *****SENs STATISTICS ******
    SEN SLOPE = .24

Now, I want to read all these files, "collect" Sen's statistics from each file (e.g. .24), and compile them into one file along with the corresponding file names. I have to do it in R. I have worked with CSV

Return a list of matches by given phrase

Question: I'm trying to write a method that checks whether a given phrase matches at least one item from a list of phrases and returns the matching items. The input is the phrase, a list of phrases, and a dictionary of lists of synonyms. The point is to make it universal. Here is an example:

    phrase = 'This is a little house'
    dictSyns = {'little':['small','tiny','little'], 'house':['cottage','house']}
    listPhrases = ['This is a tiny house','This is a small cottage','This is a small building','I need advice']

I can create
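One possible reading of the task is sketched below: map every synonym to a canonical word (the dictionary key), then compare phrases word by word after that substitution. The function name `find_matches` is hypothetical, and the sketch assumes single-word synonyms, whitespace tokenization, and case-insensitive comparison, none of which the excerpt states.

```python
def find_matches(phrase, phrases, synonyms):
    """Return the phrases equal to `phrase` once synonyms are canonicalized."""
    # Map each synonym (and the key itself) to the dictionary key.
    canonical = {}
    for key, group in synonyms.items():
        canonical[key.lower()] = key.lower()
        for word in group:
            canonical[word.lower()] = key.lower()

    def normalize(text):
        # Lowercase, split on whitespace, and replace known synonyms.
        return tuple(canonical.get(w, w) for w in text.lower().split())

    target = normalize(phrase)
    return [p for p in phrases if normalize(p) == target]


phrase = 'This is a little house'
dictSyns = {'little': ['small', 'tiny', 'little'], 'house': ['cottage', 'house']}
listPhrases = ['This is a tiny house', 'This is a small cottage',
               'This is a small building', 'I need advice']
print(find_matches(phrase, listPhrases, dictSyns))
# → ['This is a tiny house', 'This is a small cottage']
```

Normalizing both sides avoids generating every synonym combination of the input phrase, which grows quickly as more words have synonyms.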

How to get Git log with short stat in one line?

Question: The following command prints the following lines of text on the console:

    git log --pretty=format:"%h;%ai;%s" --shortstat

    ed6e0ab;2014-01-07 16:32:39 +0530;Foo
     3 files changed, 14 insertions(+), 13 deletions(-)
    cdfbb10;2014-01-07 14:59:48 +0530;Bar
     1 file changed, 21 insertions(+)
    5fde3e1;2014-01-06 17:26:40 +0530;Merge Baz
    772b277;2014-01-06 17:09:42 +0530;Qux
     7 files changed, 72 insertions(+), 7 deletions(-)

I'd like the above to be displayed like this:

    ed6e0ab;2014-01-07 16:32:39
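The desired format is cut off in the excerpt, so the following is only a sketch under an assumption: that each shortstat line should be appended to the commit line before it, joined with ";". The function name `join_shortstat` and the `sep` parameter are hypothetical. The sketch post-processes text captured from the git log command shown in the question.

```python
import re

# A shortstat line looks like " 3 files changed, 14 insertions(+), ...".
STAT_LINE = re.compile(r"^\s*\d+ files? changed")


def join_shortstat(log_text, sep=";"):
    """Append each shortstat line to the commit line that precedes it."""
    rows = []
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        if STAT_LINE.match(line) and rows:
            rows[-1] += sep + line.strip()
        else:
            rows.append(line)
    return rows
```

Commits without a stat line, such as the merge in the example, are kept as they are.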

Remove empty lines in a text file via grep

Question: FILE : hello world foo bar How can I remove all the empty lines in this FILE? Output of command: FILE : hello world foo bar

Answer 1: grep . FILE (And if you really want to do it in sed, then: sed -e /^$/d FILE ) (And if you really want to do it in awk, then: awk /./ FILE )

Answer 2: Try the following: grep -v -e '^$'

Answer 3: With awk, just check the number of fields; no regex is needed. $ more file hello world foo bar $ awk 'NF' file hello world foo bar

Answer 4: Here is a solution that removes all lines
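The answers differ in one detail worth spelling out: grep . and grep -v '^$' drop only lines that are truly empty, while awk 'NF' also drops lines containing nothing but whitespace. The Python sketch below mirrors the two behaviors approximately; the function names `drop_empty` and `drop_blank` are hypothetical, and Python's notion of whitespace is slightly broader than awk's default field separator.

```python
def drop_empty(lines):
    """Roughly `grep .`: keep every line that has at least one character."""
    return [line for line in lines if line != ""]


def drop_blank(lines):
    """Roughly `awk 'NF'`: keep lines with at least one non-whitespace field."""
    return [line for line in lines if line.split()]


sample = ["hello", "", "world", "   ", "foo"]
print(drop_empty(sample))   # the whitespace-only line survives
print(drop_blank(sample))   # the whitespace-only line is removed too
```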

How to strip trailing whitespace in CMake variable?

Question: We are trying to improve the makefiles produced by CMake. For Clang, GCC and ICC, we want to add -march=native. The block to do so looks like:

    # -march=native for GCC, Clang and ICC on i386, i486, i586, i686 and x86_64.
    message(STATUS, "1")
    message(STATUS, "Compiler: x${CMAKE_CXX_COMPILER_ID}x")
    if ("${CMAKE_CXX_COMPILER_ID}" STREQUAL "Clang" OR "${CMAKE_CXX_COMPILER_ID}" STREQUAL "GNU" OR "${CMAKE_CXX_COMPILER_ID}" STREQUAL "Intel")
      message(STATUS, "2")
      message(STATUS, "Machine: x${UNAME

Python: Best Way to remove duplicate character from string

Question: How can I remove duplicate characters from a string using Python? For example, let's say I have a string:

    foo = "SSYYNNOOPPSSIISS"

How can I make the string:

    foo = SYNOPSIS

I'm new to Python. Here is what I have tried, and it works, but I suspect there is a smarter, better way to do this that only experience can show:

    def RemoveDupliChar(Word):
        NewWord = " "
        index = 0
        for char in Word:
            if char != NewWord[index]:
                NewWord += char
                index += 1
        print(NewWord.strip())

NOTE: Order is important and this
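One idiomatic option, offered as a sketch and not as the accepted answer, is itertools.groupby from the standard library, which groups consecutive equal items. The function name `collapse_repeats` is hypothetical. Like the loop in the question, it collapses only adjacent repeats, which is what the example needs because "SYNOPSIS" itself contains three S characters.

```python
from itertools import groupby


def collapse_repeats(text):
    """Collapse each run of consecutive identical characters to one character."""
    # groupby yields one (key, group) pair per run of equal characters.
    return "".join(key for key, _ in groupby(text))


foo = "SSYYNNOOPPSSIISS"
print(collapse_repeats(foo))  # → SYNOPSIS
```

Unlike the original function, this version returns the result so the caller decides whether to print it, and it does not strip leading or trailing spaces from the input.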