Question
What is the difference between:
with open("file.txt", "r") as f:
    data = list(f)
Or:
with open("file.txt", "r") as f:
    data = f.read().splitlines(True)
Or:
with open("file.txt", "r") as f:
    data = f.readlines()
They seem to produce exactly the same output. Is one better (or more Pythonic) than the others?
Answer 1:
Explicit is better than implicit, so I prefer:
with open("file.txt", "r") as f:
    data = f.readlines()
But, when it is possible, the most Pythonic option is to use the file iterator directly, without loading all the content into memory, e.g.:
with open("file.txt", "r") as f:
    for line in f:
        my_function(line)
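As a rough illustration of the iterator approach, here is a minimal sketch that processes a file one line at a time. The helper name count_nonblank_lines and the temporary file are assumptions made only for this example; they are not part of the question.

```python
import os
import tempfile


def count_nonblank_lines(path):
    """Count non-blank lines without building a list of all lines (hypothetical helper)."""
    count = 0
    with open(path, "r") as f:
        for line in f:  # the file object yields one line per iteration
            if line.strip():
                count += 1
    return count


# Build a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\n\nbeta\ngamma\n")
    path = tmp.name

result = count_nonblank_lines(path)
os.remove(path)
print(result)
```

Only the current line is held by the loop at any time, which is why this form suits files that are too large to load comfortably.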
Answer 2:
TL;DR
Assuming you need a list so you can manipulate the lines afterwards, your three proposed solutions are all syntactically valid. There is no better (or more Pythonic) solution, especially since they are all recommended by the official Python documentation. So, choose the one you find the most readable and be consistent with it throughout your code. If performance is a deciding factor, see my timeit analysis below.
Here is the timeit run (10000 loops, ~20 lines in test.txt):
import timeit

def foo():
    with open("test.txt", "r") as f:
        data = list(f)

def foo1():
    with open("test.txt", "r") as f:
        data = f.read().splitlines(True)

def foo2():
    with open("test.txt", "r") as f:
        data = f.readlines()

print(timeit.timeit(stmt=foo, number=10000))
print(timeit.timeit(stmt=foo1, number=10000))
print(timeit.timeit(stmt=foo2, number=10000))
>>>> 1.6370758459997887
>>>> 1.410844805999659
>>>> 1.8176437409965729
I tried it with several loop counts and file sizes, and f.read().splitlines(True) always seems to perform a bit better than the other two.
Now, syntactically speaking, all of your examples are valid. Refer to this documentation for more information.
According to it, if your goal is to read lines from a file, you can loop over the file object:
for line in f:
    ...
The documentation states that this is memory efficient, fast, and leads to simple code. That would be another good alternative in your case if you don't need to manipulate the lines in a list.
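EDIT
Note that the True you pass to splitlines is its keepends argument. Without it, splitlines() drops the line endings, so pass True only if you want each line to keep its trailing newline, as list(f) and f.readlines() do.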
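For reference, a small sketch of what the argument to splitlines changes. It uses io.StringIO as a stand-in for a real file (an assumption made to keep the example self-contained) and compares the result with readlines().

```python
import io

text = "first\nsecond\nthird\n"

without_ends = text.splitlines()    # keepends defaults to False
with_ends = text.splitlines(True)   # keepends=True keeps the "\n" on each line
from_readlines = io.StringIO(text).readlines()

print(without_ends)    # ['first', 'second', 'third']
print(with_ends)       # ['first\n', 'second\n', 'third\n']
print(from_readlines)  # ['first\n', 'second\n', 'third\n']
```

So the call only matches readlines() when keepends is true; dropping the argument also drops the newlines.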
My personal recommendation
I don't want to make this answer too opinion-based, but I don't think performance should be your deciding factor until it is actually a problem for you, especially since all three forms are allowed and recommended in the official Python documentation I linked.
So, my advice is:
First, pick the most logical one for your particular case. Then choose the one you find the most readable and be consistent with it throughout your code.
Answer 3:
In all 3 cases, you're using a context manager to read a file. This file is a file object.
File Object
An object exposing a file-oriented API (with methods such as read() or write()). Depending on the way it was created, a file object can mediate access to a real on-disk file or to another type of storage or communication device (for example standard input/output, in-memory buffers, sockets, pipes, etc.). File objects are also called file-like objects or streams. The canonical way to create a file object is by using the open() function. https://docs.python.org/3/glossary.html#term-file-object
list
with open("file.txt", "r") as f:
    data = list(f)
This works because your file object is an iterable, stream-like object. Converting it to a list works roughly like this:
[line for line in f]  # keeps pulling lines until the iterator raises StopIteration
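To make the iteration protocol behind that conversion concrete, here is a minimal sketch that spells it out with explicit next() calls. The helper to_list is hypothetical and io.StringIO stands in for an opened file; this mirrors what list() does conceptually, not how CPython implements it.

```python
import io


def to_list(iterable):
    """Hypothetical helper: collect items by calling next() until StopIteration."""
    iterator = iter(iterable)
    items = []
    while True:
        try:
            items.append(next(iterator))
        except StopIteration:
            break  # the iterator is exhausted, i.e. end of file
    return items


f = io.StringIO("a\nb\nc\n")
manual = to_list(f)

f.seek(0)  # rewind so the same data can be read again
builtin = list(f)

print(manual == builtin)  # True
```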
readlines method
with open("file.txt", "r") as f:
    data = f.readlines()
The method readlines() reads until EOF using readline() and returns a list containing the lines.
Difference with list:
You can pass a size hint to limit roughly how much is read (the parameter is named hint in Python 3):
fileObject.readlines(sizehint)
If the optional sizehint argument is present, instead of reading up to EOF, whole lines totalling approximately sizehint bytes (possibly after rounding up to an internal buffer size) are read.
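A short sketch of the size hint in action, using io.StringIO as a stand-in for a file (an assumption for self-containment). The exact number of lines returned for a given hint is approximate, so the example only relies on the call stopping early and on whole lines being returned.

```python
import io

f = io.StringIO("aaaa\nbbbb\ncccc\ndddd\n")

# Ask for roughly 6 characters: reading stops early, but never mid-line.
first_chunk = f.readlines(6)

# A second call without a hint picks up where the first one stopped.
rest = f.readlines()

print(first_chunk)
print(rest)
```

list(f) has no equivalent parameter, so this is one practical reason to reach for readlines() specifically.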
read
When should I ever use file.read() or file.readlines()?
Answer 4:
They all achieve the same goal of returning a list of strings, but through separate approaches. f.readlines() is the most Pythonic.
with open("file.txt", "r") as f:
    data = list(f)
f here is a file-like object. list iterates over it, and each iteration returns one line of the file.
with open("file.txt", "r") as f:
    data = f.read().splitlines(True)
f.read() returns a string, which you split on newlines, returning a list of strings.
with open("file.txt", "r") as f:
    data = f.readlines()
f.readlines() does the same as above: it reads the entire file and splits on newlines.
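One caveat worth checking in practice: str.splitlines() recognises more line boundaries than file iteration does (for example the form feed character "\x0c"), so the approaches can disagree on unusual input. The sketch below uses io.StringIO in place of a real file, which is an assumption made to keep it self-contained.

```python
import io

# Ordinary text: all three approaches agree.
plain = "one\ntwo\nthree\n"
same = (
    list(io.StringIO(plain))
    == io.StringIO(plain).readlines()
    == io.StringIO(plain).read().splitlines(True)
)

# Text containing a form feed: splitlines() treats it as a line boundary,
# while iterating over the file only splits on newlines.
odd = "one\x0ctwo\nthree\n"
by_iteration = list(io.StringIO(odd))
by_splitlines = io.StringIO(odd).read().splitlines(True)

print(same)           # True
print(by_iteration)   # ['one\x0ctwo\n', 'three\n']
print(by_splitlines)  # ['one\x0c', 'two\n', 'three\n']
```

For typical text files the results are identical, so this only matters if the data may contain such characters.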
Source: https://stackoverflow.com/questions/51479759/is-there-a-difference-between-file-readlines-listfile-and-file-read