Extract string between characters from a txt file in python [closed]

Submitted by 故事扮演 on 2019-12-13 00:39:19

Question


I have a txt file that I want python to read, and from which I want python to extract a string specifically between two characters. Here is an example:

Line a

Line b

Line c

&TESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTEST !

Line d

Line e

What I want is for Python to read the lines and, when it encounters "&", start printing the lines (including the line with "&") up until it encounters "!".

Any suggestions?


Answer 1:


This works:

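data = []
flag = False
with open('/tmp/test.txt', 'r') as f:
    for line in f:
        # Start collecting at a line that begins with '&'.
        if line.startswith('&'):
            flag = True
        if flag:
            data.append(line)
        # Stop after a line that ends with '!' (ignoring trailing whitespace).
        if line.strip().endswith('!'):
            flag = False

print(''.join(data))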

If your file is small enough that reading it all into memory is not an issue, and there is no ambiguity in & or ! as the start and end of the string you want, this is easier:

with open('/tmp/test.txt', 'r') as f:
    data = f.read()

print(data[data.index('&'):data.index('!') + 1])
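
The slice above assumes that both markers exist and that the first ! comes after the first &; str.index raises ValueError when a marker is missing. A minimal sketch of a more defensive variant follows. The helper name extract_between and its None return value are illustrative assumptions, not part of the original answer.

```python
def extract_between(text, start_marker="&", end_marker="!"):
    """Return the text from the first start_marker through the next end_marker.

    Hypothetical helper: returns None when either marker is missing.
    """
    start = text.find(start_marker)
    if start == -1:
        return None
    # Search for the end marker only after the start marker,
    # so a stray "!" earlier in the text is ignored.
    end = text.find(end_marker, start + len(start_marker))
    if end == -1:
        return None
    return text[start:end + len(end_marker)]


sample = "Line a\n! stray\n&TEST\nTEST !\nLine d\n"
print(extract_between(sample))
```

Using str.find instead of str.index trades the exception for an explicit check, which may be easier to handle when the markers are optional.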

Or, if you want to read the whole file in but only use & and ! if they are at the beginning and end of the lines respectively, you can use a regex:

import re

with open('/tmp/test.txt', 'r') as f:
    data = f.read()

# re.M lets ^ match at the start of any line; re.S lets . match newlines.
m = re.search(r'^(&.*!)\s*?\n', data, re.S | re.M)
if m:
    print(m.group(1))
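
If the file may contain several such blocks, a non-greedy pattern with re.findall can collect each one. This is only a sketch, assuming every block starts with & at the beginning of a line and ends with ! at the end of a line (optionally followed by spaces or tabs). The names BLOCK_RE and find_blocks are illustrative.

```python
import re

# Non-greedy .*? stops at the first "!" that ends a line, so each block
# is matched separately instead of being merged into one long match.
BLOCK_RE = re.compile(r'^(&.*?!)[ \t]*$', re.S | re.M)


def find_blocks(text):
    """Return a list of all blocks running from a leading & to a trailing !."""
    return BLOCK_RE.findall(text)


sample = "Line a\n&one\ntwo ! \nLine b\n&three!\n"
print(find_blocks(sample))
```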



Answer 2:


Here is a (very simple!) example.

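def Printer():
    # Print every line from the one containing "&" through the one containing "!".
    with open("yourfile.txt") as f:
        Pr = False
        for line in f:
            if "&" in line:
                Pr = True
            if Pr:
                # Strip the trailing newline so print() does not double it.
                print(line.rstrip("\n"))
            if "!" in line:
                Pr = False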



Answer 3:


One simple solution is shown below. The code is heavily commented to explain each line. A nice property of this code is that it uses the with statement, which closes the file even if an exception is raised.

# Specify the path to the input file.
file_path = "input.txt"

# Open the file in read mode. The with statement closes the file for us,
# replacing an explicit try..finally block.
with open(file_path, "r") as f:
    # Read the contents of the file. Be careful here, as this reads the entire
    # file into memory. If the file is too large, prefer iterating over the file object.
    content = f.read()
    size = len(content)
    start = 0
    while start < size:
        # Find the index of the next & after the last ! index.
        start = content.find("&", start)
        # If no further & is found, there is nothing left to print.
        if start == -1:
            break
        # Find the index of the first ! after that & index.
        end = content.find("!", start)
        # If no ! is found, go to the end of the contents.
        end = end if end != -1 else size
        # Print the contents between & and ! (excluding both characters).
        # If no ! character is found, print until the end of the file.
        print(content[start + 1:end])
        # Move the cursor just past the position of the ! character.
        start = end + 1
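
For a file too large to read at once, the same scan can be done line by line. The sketch below assumes that the markers may appear anywhere in a line and that an unterminated block runs to the end of the input, as in the code above. The generator name iter_between is hypothetical.

```python
def iter_between(lines, start_marker="&", end_marker="!"):
    """Yield each piece of text found between start_marker and end_marker.

    `lines` is any iterable of strings, such as an open file object.
    The markers themselves are excluded from the yielded text.
    """
    buffer = None  # None means we are currently outside a block
    for line in lines:
        pos = 0
        while pos < len(line):
            if buffer is None:
                start = line.find(start_marker, pos)
                if start == -1:
                    break  # no block starts on the rest of this line
                buffer = []
                pos = start + len(start_marker)
            else:
                end = line.find(end_marker, pos)
                if end == -1:
                    buffer.append(line[pos:])
                    break  # the block continues on the next line
                buffer.append(line[pos:end])
                yield "".join(buffer)
                buffer = None
                pos = end + len(end_marker)
    if buffer is not None:
        # Unterminated block: emit what was collected up to the end of input.
        yield "".join(buffer)


sample = ["Line a\n", "&TEST\n", "TEST !\n", "Line d &x! &y\n"]
print(list(iter_between(sample)))
```

Because a file object yields its lines lazily, passing the object returned by open("input.txt") to iter_between keeps only the current block in memory.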


Source: https://stackoverflow.com/questions/17632342/extract-string-between-characters-from-a-txt-file-in-python
