Read first N lines of a file in python

南楼画角 提交于 2019-11-26 11:34:33
John La Rooy

Python 2

with open("datafile") as myfile:
    head = [next(myfile) for x in xrange(N)]
print head

Python 3

with open("datafile") as myfile:
    head = [next(myfile) for x in range(N)]
print(head)

Here's another way (both Python 2 & 3)

from itertools import islice
with open("datafile") as myfile:
    head = list(islice(myfile, N))
print head
ghostdog74
N = 10
file = open("file.txt", "a")#the a opens it in append mode
for i in range(N):
    line = file.next().strip()
    print line
file.close()

If you want to read the first lines quickly and you don't care about performance you can use .readlines() which returns list object and then slice the list.

E.g. for the first 5 lines:

with open("pathofmyfileandfileandname") as myfile:
    firstNlines=myfile.readlines()[0:5] #put here the interval you want

Note: the whole file is read so is not the best from the performance point of view but it is easy to use, fast to write and easy to remember so if you want just perform some one-time calculation is very convenient

print firstNlines

What I do is to call the N lines using pandas. I think the performance is not the best, but for example if N=1000:

import pandas as pd
yourfile = pd.read('path/to/your/file.csv',nrows=1000)
artdanil

There is no specific method to read number of lines exposed by file object.

I guess the easiest way would be following:

lines =[]
with open(file_name) as f:
    lines.extend(f.readline() for i in xrange(N))

Based on gnibbler top voted answer (Nov 20 '09 at 0:27): this class add head() and tail() method to file object.

class File(file):
    def head(self, lines_2find=1):
        self.seek(0)                            #Rewind file
        return [self.next() for x in xrange(lines_2find)]

    def tail(self, lines_2find=1):  
        self.seek(0, 2)                         #go to end of file
        bytes_in_file = self.tell()             
        lines_found, total_bytes_scanned = 0, 0
        while (lines_2find+1 > lines_found and
               bytes_in_file > total_bytes_scanned): 
            byte_block = min(1024, bytes_in_file-total_bytes_scanned)
            self.seek(-(byte_block+total_bytes_scanned), 2)
            total_bytes_scanned += byte_block
            lines_found += self.read(1024).count('\n')
        self.seek(-total_bytes_scanned, 2)
        line_list = list(self.readlines())
        return line_list[-lines_2find:]

Usage:

f = File('path/to/file', 'r')
f.head(3)
f.tail(3)

The two most intuitive ways of doing this would be:

  1. Iterate on the file line-by-line, and break after N lines.

  2. Iterate on the file line-by-line using the next() method N times. (This is essentially just a different syntax for what the top answer does.)

Here is the code:

# Method 1:
with open("fileName", "r") as f:
    counter = 0
    for line in f:
        print line
        counter += 1
        if counter == N: break

# Method 2:
with open("fileName", "r") as f:
    for i in xrange(N):
        line = f.next()
        print line

The bottom line is, as long as you don't use readlines() or enumerateing the whole file into memory, you have plenty of options.

most convinient way on my own:

LINE_COUNT = 3
print [s for (i, s) in enumerate(open('test.txt')) if i < LINE_COUNT]

Solution based on List Comprehension The function open() supports an iteration interface. The enumerate() covers open() and return tuples (index, item), then we check that we're inside an accepted range (if i < LINE_COUNT) and then simply print the result.

Enjoy the Python. ;)

If you want something that obviously (without looking up esoteric stuff in manuals) works without imports and try/except and works on a fair range of Python 2.x versions (2.2 to 2.6):

def headn(file_name, n):
    """Like *x head -N command"""
    result = []
    nlines = 0
    assert n >= 1
    for line in open(file_name):
        result.append(line)
        nlines += 1
        if nlines >= n:
            break
    return result

if __name__ == "__main__":
    import sys
    rval = headn(sys.argv[1], int(sys.argv[2]))
    print rval
    print len(rval)

Starting at Python 2.6, you can take advantage of more sophisticated functions in the IO base clase. So the top rated answer above can be rewritten as:

    with open("datafile") as myfile:
       head = myfile.readlines(N)
    print head

(You don't have to worry about your file having less than N lines since no StopIteration exception is thrown.)

If you have a really big file, and assuming you want the output to be a numpy array, using np.genfromtxt will freeze your computer. This is so much better in my experience:

def load_big_file(fname,maxrows):
'''only works for well-formed text file of space-separated doubles'''

rows = []  # unknown number of lines, so use list

with open(fname) as f:
    j=0        
    for line in f:
        if j==maxrows:
            break
        else:
            line = [float(s) for s in line.split()]
            rows.append(np.array(line, dtype = np.double))
            j+=1
return np.vstack(rows)  # convert list of vectors to array

For first 5 lines, simply do:

N=5
with open("data_file", "r") as file:
    for i in range(N):
       print file.next()
Mansur Ali
#!/usr/bin/python

import subprocess

p = subprocess.Popen(["tail", "-n 3", "passlist"], stdout=subprocess.PIPE)

output, err = p.communicate()

print  output

This Method Worked for me

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!