How can I search within a document for a keyword and then subsequent keywords within a set number of lines of the original keyword in Python?

Posted by ↘锁芯ラ on 2020-01-16 04:45:13

Question


I want to search for a keyword in a document and then check whether that keyword is within 5 lines of another keyword. If it is, I want to print the line and the following 50 lines.

In this example, I am searching a document for the word "carrying", and I want to make sure that "carrying" is within 5 lines of the words "Financial Assets:". My code finds and prints the lines when I search only for "carrying", but when I add the search for "Financial Assets:" it finds nothing (even though I know the phrase is in the document).

import urllib2

data = []

html = urllib2.urlopen("ftp://ftp.sec.gov/edgar/data/1001627/0000950116-97-001247.txt")
searchlines = html.readlines()
for m, line in enumerate(searchlines):
    line = line.lower()
    if "carrying" in line and "Financial Assets:" in searchlines[m-5:m+5]: 
        for l in searchlines[m-5:m+50]:
            data.append(l)
print ''.join(data)

Any help would be much appreciated.


Answer 1:


Instead of

"Financial Assets:" in searchlines[m-5:m+5]

You need to have:

any("Financial Assets:" in line2 for line2 in searchlines[m-5:m+5])

Your original code looks for a line which contains exactly the content "Financial Assets:", instead of looking for it as a substring in each line.
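
Putting that fix into the whole loop, a minimal sketch could look like the following. The function name `lines_near_keywords` and its parameters are made up for illustration, and it works on any list of lines, so no download is needed to try it. Two choices here are assumptions rather than part of the original code: the window start is clamped with `max(0, ...)` because a negative slice start such as `searchlines[-3:7]` counts from the end of the list and usually yields an empty slice, and the output follows the stated goal (the matching line plus the following 50 lines) rather than the `m-5:m+50` slice.

```python
def lines_near_keywords(lines, primary, secondary, window=5, context=50):
    """Return each line containing `primary` (case-insensitive) plus the
    following `context` lines, but only when `secondary` occurs as a
    substring of some line within `window` lines of it."""
    collected = []
    for i, line in enumerate(lines):
        if primary.lower() not in line.lower():
            continue
        # Clamp at 0: a negative start would count from the end of the list.
        start = max(0, i - window)
        nearby = lines[start:i + window + 1]
        # Substring test per line, not list membership.
        if any(secondary in other for other in nearby):
            collected.extend(lines[i:i + context + 1])
    return collected


sample = ["intro\n", "Financial Assets:\n", "filler\n", "Carrying amount\n", "tail\n"]
result = lines_near_keywords(sample, "carrying", "Financial Assets:")
print("".join(result))
```

With the lines from `readlines()` in place of `sample`, the call would be `lines_near_keywords(searchlines, "carrying", "Financial Assets:")`.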




Answer 2:


The expression

"carrying" in line

searches for the string at any position inside the line. However, the expression

"Financial Assets:" in searchlines[m-5:m+5]

searches for an exact match (i.e. a line that is exactly "Financial Assets:") in that sublist. Lines returned by readlines() keep their trailing newline, so even a line holding only that text would not compare equal. You need to change this second part to something like

"Financial Assets:" in " ".join(searchlines[m-5:m+5])

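To make the difference concrete, here is a small comparison on a made-up three-line window (the sample data is hypothetical, not taken from the filing). It contrasts list membership with the two substring-based checks suggested in this thread.

```python
# Hypothetical window of lines, shaped like what readlines() returns.
window = ["Total\n", "  Financial Assets: 100\n", "Other\n"]

# List membership compares whole elements, so a partial line does not match.
exact = "Financial Assets:" in window

# Substring test against each line individually.
per_line = any("Financial Assets:" in s for s in window)

# Substring test against the window joined into one string.
joined = "Financial Assets:" in " ".join(window)

print(exact, per_line, joined)
```

Both substring checks find the phrase here, while the membership test does not.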

Source: https://stackoverflow.com/questions/5825055/how-can-i-search-within-a-document-for-a-keyword-and-then-subsequent-key-words-w
