Read multiple blocks of a file between start and stop flags

轮回少年 2020-12-06 23:53

I am trying to read sections of a file into numpy arrays that have similar start and stop flags for the different sections of the file. At the moment I have found a method

4 answers
  • 2020-12-07 00:27

    Let's say this is your file to read:

    **starting**
    blabla
    blabla
    **starting**
    bleble
    bleble
    **starting**
    bumbum
    bumbum

    This is the code of the program:

    # Read the whole file, then split it on the section marker
    with open("testfile.txt", "r") as file:
        data = file.read()
    data = data.split("**starting**")
    print(data)
    

    And this is the output:

    ['', '\nblabla\nblabla\n', '\nbleble\nbleble\n', '\nbumbum\nbumbum']

    Afterwards you can delete the empty element or do other operations on your data. The split method is built in to string objects and also accepts longer, more complicated strings as the separator.
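The clean-up step mentioned above can be sketched as follows. This is only a minimal sketch, assuming the same `**starting**` marker; the helper name `split_sections` is hypothetical, and the file content is inlined as a string so the example runs without `testfile.txt`.

```python
def split_sections(text, marker="**starting**"):
    """Split text on marker and return each non-empty section as a list of lines."""
    sections = []
    for chunk in text.split(marker):
        lines = chunk.strip().splitlines()
        if lines:  # skip the empty chunk that precedes the first marker
            sections.append(lines)
    return sections


# Inlined stand-in for the contents of testfile.txt (an assumption for this sketch)
sample = (
    "**starting**\nblabla\nblabla\n"
    "**starting**\nbleble\nbleble\n"
    "**starting**\nbumbum\nbumbum"
)
sections = split_sections(sample)
print(sections)
```

Note that this approach reads the whole file into memory first, which is fine for small files but may not suit very large ones.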

  • 2020-12-07 00:37

    You have an indentation problem. Your code should look like this:

    with open("myFile.txt") as f:
        array = []
        parsing = False
        for line in f:
            if line.startswith('stop flag'):
                parsing = False
            if parsing:
                # do things to the data, e.g. collect the line
                array.append(line)
            if line.startswith('start flag'):
                parsing = True
    
  • 2020-12-07 00:41

    You can use itertools.takewhile each time you reach the start flag to take lines until the stop flag:

    from itertools import takewhile
    with open("myFile.txt") as f:
        array = []
        for line in f:
            if line.startswith('start flag'):
                data = takewhile(lambda x: not x.startswith("stop flag"), f)
                # data is lazy: use (consume) it here, then the loop repeats
    

    Or just use an inner loop:

    with open("myFile.txt") as f:
        array = []
        for line in f:
            if line.startswith('start flag'):
                # beginning of a section; use the first line here if needed
                for line in f:
                    # check for the end of the section, breaking when we find the stop line
                    if line.startswith("stop flag"):
                        break
                    # otherwise process the lines of the section
    

    A file object is its own iterator, so the file position keeps advancing as you iterate over f. When you reach the start flag, process the section until you hit the stop flag. There is no need to re-open the file; just handle the sections as you iterate once over its lines. If the start and stop flag lines are considered part of the section, make sure to process them too.
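To illustrate the shared-iterator behaviour described above, here is a minimal sketch that collects every section in a single pass. The generator name `iter_sections` and the sample data are hypothetical, and `io.StringIO` stands in for the opened file so the example is self-contained.

```python
import io
from itertools import takewhile


def iter_sections(lines, start="start flag", stop="stop flag"):
    """Yield one list of lines per start/stop section, in a single pass."""
    lines = iter(lines)  # make sure both loops share one iterator
    for line in lines:
        if line.startswith(start):
            # takewhile pulls from the same iterator, so the outer loop
            # resumes right after the stop line that takewhile consumed
            section = takewhile(lambda x: not x.startswith(stop), lines)
            yield [row.rstrip("\n") for row in section]


# Stand-in for open("myFile.txt"); the contents are made up for this sketch
sample = io.StringIO(
    "header\n"
    "start flag\n1 2\n3 4\nstop flag\n"
    "noise between sections\n"
    "start flag\n5 6\nstop flag\n"
)
sections = list(iter_sections(sample))
print(sections)
```

The start and stop lines themselves are dropped here; if they belong to the section, they would need to be added to the yielded list explicitly.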

  • 2020-12-07 00:49

    A solution similar to yours would be:

    result = []
    parse = False
    with open("myFile.txt") as f:
        for line in f:
            if line.startswith('stop flag'):
                parse = False
            elif line.startswith('start flag'):
                parse = True
            elif parse:
                result.append(line)
            else:  # not needed, but I like to always add else clause
                continue
    print(result)
    

    But you might also use an inner loop or itertools.takewhile, as the other answers suggest. Using takewhile in particular may be faster for very large files.
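The flag-based loop above puts the lines of all sections into one list. If each section should stay separate, for example to build one array per section, a variation along these lines might work. This is a sketch under assumptions: the function name `parse_sections` is hypothetical, and the section rows are assumed to be whitespace-separated numbers, which the question does not confirm.

```python
def parse_sections(lines, start="start flag", stop="stop flag"):
    """Return a list of sections, each a list of rows of floats."""
    sections = []
    current = None  # None means we are currently outside a section
    for line in lines:
        if line.startswith(stop):
            if current is not None:
                sections.append(current)
            current = None
        elif line.startswith(start):
            current = []
        elif current is not None and line.strip():
            # assumed format: whitespace-separated numbers on each row
            current.append([float(x) for x in line.split()])
    return sections


# Made-up sample lines standing in for the file
sample = [
    "start flag\n", "1 2\n", "3 4\n", "stop flag\n",
    "ignored\n",
    "start flag\n", "5 6\n", "stop flag\n",
]
sections = parse_sections(sample)
print(sections)
```

Each inner list could then be passed to numpy.array if NumPy is available, giving one array per section.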
