How do I read a file line-by-line into a list?

Anonymous (unverified), submitted 2019-12-03 01:40:02

Question:

How do I read every line of a file in Python and store each line as an element in a list?

I want to read the file line by line and append each line to the end of the list.

Answer 1:

with open(fname) as f:
    content = f.readlines()
# you may also want to remove whitespace characters like `\n` at the end of each line
content = [x.strip() for x in content]

I'm guessing that you meant list and not array.



Answer 2:

See Input and Output:

with open('filename') as f:
    lines = f.readlines()

or with stripping the newline character:

lines = [line.rstrip('\n') for line in open('filename')] 

Editor's note: This answer's original whitespace-stripping command, line.strip(), as implied by Janus Troelsen's comment, would remove all leading and trailing whitespace, not just the trailing \n.
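To make the difference in the editor's note concrete, here is a minimal sketch. It uses io.StringIO as a stand-in for a real file, and the sample text is made up for illustration.

```python
import io

# Hypothetical in-memory "file" standing in for a real file on disk.
sample = io.StringIO("  indented line  \nplain line\n")

raw = sample.readlines()

# strip() removes ALL leading and trailing whitespace, including indentation.
stripped_all = [line.strip() for line in raw]

# rstrip("\n") removes only the trailing newline and keeps other whitespace.
stripped_newline = [line.rstrip("\n") for line in raw]

print(raw)
print(stripped_all)
print(stripped_newline)
```

If leading indentation or trailing spaces carry meaning in your data, rstrip("\n") is the safer choice of the two.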



Answer 3:

This is more explicit than necessary, but does what you want.

with open("file.txt", "r") as ins:
    array = []
    for line in ins:
        array.append(line)


Answer 4:

This will yield a tuple (an immutable "array") of lines from the file.

lines = tuple(open(filename, 'r')) 


Answer 5:

If you want the \n included:

with open(fname) as f:
    content = f.readlines()

If you do not want \n included:

with open(fname) as f:
    content = f.read().splitlines()
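The two variants can be compared side by side. This is only a sketch: the temporary file and its contents are assumptions made so that the example is self-contained.

```python
import os
import tempfile

# Hypothetical sample file created only for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    fname = tmp.name

# Variant 1: every element keeps its trailing "\n".
with open(fname) as f:
    with_newlines = f.readlines()

# Variant 2: the newlines are dropped while splitting.
with open(fname) as f:
    without_newlines = f.read().splitlines()

os.remove(fname)  # clean up the temporary file

print(with_newlines)
print(without_newlines)
```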


Answer 6:

You could simply do the following, as has been suggested:

with open('/your/path/file') as f:
    my_lines = f.readlines()

Note that this approach has 2 downsides:

1) You store all the lines in memory. In the general case, this is a very bad idea. The file could be very large, and you could run out of memory. Even if it's not large, it is simply a waste of memory.

2) This does not allow processing of each line as you read them. So if you process your lines after this, it is not efficient (requires two passes rather than one).

A better approach for the general case would be the following:

with open('/your/path/file') as f:
    for line in f:
        process(line)

Where you define your process function any way you want. For example:

def process(line):
    if 'save the world' in line.lower():
        superman.save_the_world()

(The implementation of the Superman class is left as an exercise for you).

This will work nicely for any file size and you go through your file in just 1 pass. This is typically how generic parsers will work.
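As a runnable illustration of the single-pass idea, the sketch below counts matching lines without ever holding the whole file in a list. The helper name count_matching and the sample text are hypothetical, and io.StringIO stands in for an open file.

```python
import io


def count_matching(lines, phrase):
    """Count the lines that contain phrase (case-insensitive) in one pass."""
    needle = phrase.lower()
    # The generator expression consumes one line at a time; nothing is stored.
    return sum(1 for line in lines if needle in line.lower())


# Any iterable of lines works here, including a real file object.
fake_file = io.StringIO(
    "Save the world\nnothing here\nplease SAVE THE WORLD now\n"
)
matches = count_matching(fake_file, "save the world")
print(matches)
```

With a real file you would pass the object returned by open(...) inside a with block instead of fake_file.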



Answer 7:

If you don't care about closing the file, this one-liner works:

lines = open('file.txt').read().split("\n")

The traditional way:

fp = open('file.txt')  # open file in read mode
lines = fp.read().split("\n")  # create a list containing all lines
fp.close()  # close file

Using with (recommended):

with open('file.txt') as fp:
    lines = fp.read().split("\n")


Answer 8:

This should encapsulate the open command.

array = []
with open("file.txt", "r") as f:
    for line in f:
        array.append(line)


Answer 9:

Clean and Pythonic Way of Reading the Lines of a File Into a List


First and foremost, you should focus on opening your file and reading its contents in an efficient and pythonic way. Here is an example of the way I personally DO NOT prefer:

infile = open('my_file.txt', 'r')  # Open the file for reading.

data = infile.read()  # Read the contents of the file.

infile.close()  # Close the file since we're done using it.

Instead, I prefer the below method of opening files for both reading and writing as it is very clean, and does not require an extra step of closing the file once you are done using it. In the statement below, we're opening the file for reading, and assigning it to the variable 'infile.' Once the code within this statement has finished running, the file will be automatically closed.

# Open the file for reading.
with open('my_file.txt', 'r') as infile:

    data = infile.read()  # Read the contents of the file into memory.

Now we need to focus on bringing this data into a Python list, because lists are iterable, efficient, and flexible. In your case, the desired goal is to bring each line of the text file into a separate element. To accomplish this, we will use the splitlines() method as follows:

# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()

The Final Product:

# Open the file for reading.
with open('my_file.txt', 'r') as infile:

    data = infile.read()  # Read the contents of the file into memory.

# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()

Testing Our Code:

  • Contents of the text file:
  • Print statements for testing purposes:
    print(my_list)  # Print the list.

    # Print each line in the list.
    for line in my_list:
        print(line)

    # Print the fourth element in this list.
    print(my_list[3])
  • Output (different-looking because of unicode characters):


Answer 10:

I'd do it like this.

lines = []
with open("myfile.txt") as f:
    for line in f:
        lines.append(line)


Answer 11:

Data into list

Let's read data from a text file

Text file content:

   line 1
   line 2
   line 3

  1. Open the cmd in the same dir (right click the mouse and choose cmd or powershell)
  2. Run python and in the interpreter write:

The python script

>>> with open("myfile.txt", encoding="utf-8") as file:
...     x = [l.strip() for l in file]
...
>>> x
['line 1', 'line 2', 'line 3']

Using append

x = []
with open("myfile.txt") as file:
    for l in file:
        x.append(l.strip())

or...

>>> x = open("myfile.txt").read().splitlines()
>>> x
['line 1', 'line 2', 'line 3']

or...

>>> y = [x.rstrip() for x in open("my_file.txt")]
>>> y
['line 1', 'line 2', 'line 3']

Getting a text from a web page with python 3

Here is a practical example of text grabbed from the net. The page contains just plain text. We just need to get rid of the \n, \r and b' artifacts to keep it clean for printing. The bytes data is converted to a string, and the string is then split into lines, so that each line is stored as an item of the list. When the print function is called, we pass each item of the list without the \r, \' and b' artifacts that make the text less readable.

from urllib.request import urlopen

testo = urlopen("https://www.gutenberg.org/files/11/11.txt").read()
testo = str(testo).split("\\n")
for l in testo[30:48]:
    print(l.replace("\\r", "").replace("\\'", "\'").replace("b'", ""))

OUTPUT:

ALICE'S ADVENTURES IN WONDERLAND

Lewis Carroll

THE MILLENNIUM FULCRUM EDITION 3.0

CHAPTER I. Down the Rabbit-Hole

Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, 'and what is the use of a book,' thought Alice 'without pictures or conversations?'



Answer 12:

Here's one more option, using a list comprehension on the file:

lines = [line.rstrip() for line in open('file.txt')]

This should be a more efficient way, as most of the work is done inside the Python interpreter.



Answer 13:

Another option is numpy.genfromtxt, for example:

import numpy as np

data = np.genfromtxt("yourfile.dat", delimiter="\n")

This will make data a NumPy array with as many rows as are in your file.



Answer 14:

If you'd like to read a file from the command line or from stdin, you can also use the fileinput module:

# reader.py
import fileinput

content = []
for line in fileinput.input():
    content.append(line.strip())

fileinput.close()

Pass files to it like so:

$ python reader.py textfile.txt  

Read more here: http://docs.python.org/2/library/fileinput.html
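fileinput.input() also accepts an explicit list of files and can be used as a context manager in Python 3, which closes the stream automatically. The sketch below assumes a temporary file created just for the demonstration.

```python
import fileinput
import os
import tempfile

# Hypothetical input file; in reader.py the names would come from sys.argv.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\n")
    path = tmp.name

# Passing files= explicitly avoids depending on command-line arguments.
with fileinput.input(files=[path]) as stream:
    content = [line.rstrip("\n") for line in stream]

os.remove(path)  # clean up the temporary file
print(content)
```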



Answer 15:

The simplest way to do it

A simple way is to:

  1. Read the whole file as a string
  2. Split the string line by line

In one line, that would give:

lines = open('C:/path/file.txt').read().splitlines() 


Answer 16:

To read a file into a list you need to do three things:

  • Open the file
  • Read the file
  • Store the contents as list

Fortunately Python makes it very easy to do these things so the shortest way to read a file into a list is:

lst = list(open(filename)) 

However I'll add some more explanation.

Opening the file

I assume that you want to open a specific file and you don't deal directly with a file-handle (or a file-like-handle). The most commonly used function to open a file in Python is open, it takes one mandatory argument and two optional ones in Python 2.7:

  • Filename
  • Mode
  • Buffering (I'll ignore this argument in this answer)

The filename should be a string that represents the path to the file. For example:

open('afile')                 # opens the file named afile in the current working directory
open('adir/afile')            # relative path (relative to the current working directory)
open('C:/users/aname/afile')  # absolute path (windows)
open('/usr/local/afile')      # absolute path (linux)

Note that the file extension needs to be specified. This is especially important for Windows users because file extensions like .txt or .doc, etc. are hidden by default when viewed in the explorer.

The second argument is the mode, it's r by default which means "read-only". That's exactly what you need in your case.

But in case you actually want to create a file and/or write to a file you'll need a different argument here. There is an excellent answer if you want an overview.

For reading a file you can omit the mode or pass it in explicitly:

open(filename)
open(filename, 'r')

Both will open the file in read-only mode. In case you want to read in a binary file on Windows you need to use the mode rb:

open(filename, 'rb') 

On other platforms the 'b' (binary mode) is simply ignored in Python 2. In Python 3, binary mode makes read() return bytes instead of str on every platform.


Now that I've shown how to open the file, let's talk about the fact that you always need to close it again. Otherwise it will keep an open file-handle to the file until the process exits (or Python garbage-collects the file-handle).

While you could use:

f = open(filename)
# ... do stuff with f
f.close()

That will fail to close the file when something between open and close throws an exception. You could avoid that by using a try and finally:

f = open(filename)
# nothing in between!
try:
    pass  # do stuff with f
finally:
    f.close()

However Python provides context managers that have a prettier syntax (but for open it's almost identical to the try and finally above):

with open(filename) as f:
    pass  # do stuff with f
# The file is always closed after the with-scope ends.

The last approach is the recommended approach to open a file in Python!
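To check the "always closed" claim, the sketch below raises an exception inside the with block and then inspects f.closed. The temporary file and the simulated error are assumptions made only for the demonstration.

```python
import os
import tempfile

# Hypothetical sample file created only for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\n")
    filename = tmp.name

try:
    with open(filename) as f:
        first = f.readline()
        # Simulate a failure in the middle of processing.
        raise ValueError("simulated failure while processing")
except ValueError:
    pass

# The context manager closed the file even though the block raised.
closed_after_error = f.closed
os.remove(filename)  # clean up the temporary file
print(closed_after_error)
```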

Reading the file

Okay, you've opened the file, now how to read it?

The open function returns a file object and it supports Python's iteration protocol. Each iteration will give you a line:

with open(filename) as f:
    for line in f:
        print(line)

This will print each line of the file. Note however that each line will contain a newline character \n at the end (probably \r\n on Windows). If you don't want that you could simply remove the last character (or the last two characters on Windows):

with open(filename) as f:
    for line in f:
        print(line[:-1])

But the last line doesn't necessarily have a trailing newline, so one shouldn't use that. One could check if it ends with a trailing newline and if so remove it:

with open(filename) as f:
    for line in f:
        if line.endswith('\n'):
            line = line[:-1]
        print(line)

But you could simply remove all whitespace (including the \n character) from the end of the string. This will also remove all other trailing whitespace, so you have to be careful if it is important:

with open(filename) as f:
    for line in f:
        print(line.rstrip())

However if the lines end with \r\n (Windows "newlines") that .rstrip() will also take care of the \r!
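A small sketch of how \r\n behaves in Python 3, assuming a temporary file written with Windows-style line endings. In default text mode Python 3 translates \r\n to \n while reading, so the \r only shows up when translation is disabled with newline="".

```python
import os
import tempfile

# Write Windows-style line endings in binary mode so nothing is translated.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first  \r\nsecond\r\n")
    path = tmp.name

# newline="" disables newline translation, so "\r\n" reaches the caller.
with open(path, newline="") as f:
    untranslated = list(f)

# Default text mode translates "\r\n" to "\n" while reading.
with open(path) as f:
    translated = list(f)

# rstrip() with no argument removes "\r", "\n" and trailing spaces alike.
stripped = [line.rstrip() for line in untranslated]

os.remove(path)  # clean up the temporary file
print(untranslated, translated, stripped)
```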

Store the contents as list

Now that you know how to open the file and read it, it's time to store the contents in a list. The simplest option would be to use the list function:

with open(filename) as f:
    lst = list(f)

Or, in case you want to strip the trailing newlines, use a list-comprehension instead:

with open(filename) as f:
    lst = [line.rstrip() for line in f]

Or even simpler: The .readlines() method of the file object by default returns a list of the lines:

with open(filename) as f:
    lst = f.readlines()

This will also include the trailing newline characters, if you don't want them I would recommend the [line.rstrip() for line in f] approach because it avoids keeping two lists (if you used [line.rstrip() for line in f.readlines()]) containing all the lines in memory.

There's an additional option to get the desired output, however it's rather "suboptimal": read the complete file in a string and then split on newlines:

with open(filename) as f:
    lst = f.read().split('\n')

It takes care of the trailing newlines automatically because the split character isn't included. However it's suboptimal because you keep the file as string and as a list of lines in memory!
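One caveat worth checking: when the text ends with a newline, split('\n') leaves an empty string as the last element, while splitlines() does not. A minimal sketch with a made-up string:

```python
# Hypothetical file contents that end with a newline, as most text files do.
text = "line 1\nline 2\n"

by_split = text.split("\n")        # keeps an empty string after the final "\n"
by_splitlines = text.splitlines()  # no trailing empty element

print(by_split)
print(by_splitlines)
```

If you use split('\n'), you may want to drop a trailing empty element before further processing.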

Summary

  • Use with open(...) as f when opening files because you don't need to take care of closing the file yourself and it's exception-proof.
  • file objects support the iteration protocol so reading a file line-by-line is as simple as for line in the_file_object:.
  • Always browse the documentation for the available functions/classes. Most of the time there's a perfect match for the task or at least a few good ones. The obvious choice in this case would be readlines() but if you want to process the lines before storing them in the list I would recommend a simple list-comprehension.
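The summary points can be combined into one small helper. This is a sketch under stated assumptions: the function name read_lines, its keep_newlines flag, and the UTF-8 default are hypothetical choices, not something the answer prescribes.

```python
import os
import tempfile


def read_lines(path, keep_newlines=False, encoding="utf-8"):
    """Return the lines of a text file as a list."""
    # The with statement closes the file even if reading raises.
    with open(path, encoding=encoding) as f:
        if keep_newlines:
            return list(f)
        # Iterate over the file directly and strip only the newline.
        return [line.rstrip("\n") for line in f]


# Usage with a hypothetical temporary file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("line 1\nline 2\nline 3")
    demo_path = tmp.name

plain = read_lines(demo_path)
raw = read_lines(demo_path, keep_newlines=True)
os.remove(demo_path)  # clean up the temporary file
print(plain)
```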


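Answer 17:

f = open("your_file.txt", 'r')
out = f.readlines()  # returns a list of all lines

Now the variable out is a list (array) of the lines you want. You could either do:

for line in out:
    print(line)

or, instead of calling readlines() first, iterate over the file object directly:

for line in f:
    print(line)

You'll get the same results. Note that once readlines() has consumed the file, iterating over f again yields nothing unless you call f.seek(0) first.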



Answer 18:

Just use the splitlines() method. Here is an example.

inp = "file.txt"
data = open(inp)
dat = data.read()
lst = dat.splitlines()
print(lst)  # in Python 2: print lst

In the output you will have the list of lines.



Answer 19:

A real easy way:

with open(file) as g:
    stuff = g.readlines()

If you want to make it a fully-fledged program, type this in:

file = input("Enter EXACT file name: ")  # raw_input() in Python 2
with open(file) as g:
    stuff = g.readlines()
print(stuff)
exit = input("Press enter when you are done.")

For some reason, it doesn't read .py files properly.



Answer 20:

If you are faced with a very large file and want to read it faster (imagine you are in a Topcoder/Hackerrank coding competition), you might read a considerably bigger chunk of lines into a memory buffer at one time, rather than just iterating line by line at file level.

buffersize = 2**16
with open(path) as f:
    while True:
        lines_buffer = f.readlines(buffersize)
        if not lines_buffer:
            break
        for line in lines_buffer:
            process(line)
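The chunked loop can be wrapped in a generator so it is easy to try out. In this sketch the name iter_line_chunks is hypothetical, io.StringIO stands in for a large file, and the tiny hint of 20 is chosen only to force several chunks. The exact number of lines per chunk is an implementation detail of readlines().

```python
import io


def iter_line_chunks(f, hint):
    """Yield lists of lines, each read with f.readlines(hint)."""
    while True:
        # readlines(hint) stops once roughly `hint` characters have been read.
        chunk = f.readlines(hint)
        if not chunk:
            break
        yield chunk


# Ten short lines in an in-memory stand-in for a file.
fake_file = io.StringIO("".join(f"line {i}\n" for i in range(10)))

chunks = list(iter_line_chunks(fake_file, 20))
all_lines = [line for chunk in chunks for line in chunk]
print(len(chunks), len(all_lines))
```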


Answer 21:

You can just open your file for reading using

file1 = open("filename", "r")
# and for reading use
lines = file1.readlines()
file1.close()

The list lines will contain all your lines as individual elements, and you can access a specific line using lines[linenumber - 1], as Python starts counting from 0.



Answer 22:

To my knowledge Python doesn't have a native array data structure. But it does support the list data structure which is much simpler to use than an array.

array = []  # declaring a list with the name 'array'
with open(PATH, 'r') as reader:
    for line in reader:
        array.append(line)


Answer 23:

Use this:

import pandas as pd

data = pd.read_csv(filename)  # You can also add parameters such as header, sep, etc.
array = data.values

data is a DataFrame, and its values attribute returns an ndarray. You can also get a list by using array.tolist().



Answer 24:

Could also use the loadtxt command in numpy. This checks for fewer conditions than genfromtxt so it may be faster.

import numpy

data = numpy.loadtxt(filename, delimiter="\n")


Answer 25:

You can easily do it by the following piece of code:

lines = open(filePath).readlines() 


Answer 26:

Command line version

#!/bin/python3
import os
import sys

abspath = os.path.abspath(__file__)
dname = os.path.dirname(abspath)
# os.path.join adds the path separator that plain string concatenation would omit
filename = os.path.join(dname, sys.argv[1])
arr = open(filename).read().split("\n")

print(arr)

Run with:

python3 somefile.py input_file_name.txt 


Answer 27:

Read and write text files with Python 2+3; works with unicode

Things to notice:

  • with is a so called context manager. It makes sure that the opened file is closed again.
  • All solutions here which simply make .strip() or .rstrip() will fail to reproduce the lines as they also strip the white space.

Common file endings

.txt

More advanced file writing / reading

For your application, the following might be important:

  • Support by other programming languages
  • Reading / writing performance
  • Compactness (file size)

See also: Comparison of data serialization formats

In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python



Answer 28:

lines = list(open("dict.lst", "r"))
# list(...) is needed in Python 3, where map returns a lazy iterator
linesSanitized = list(map(lambda each: each.strip("\n"), lines))
print(linesSanitized)


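Answer 29:

with open(fname) as fo:
    data = fo.read().replace('\n', ' ').replace(',', ' ')

This should answer your question. The replace calls substitute spaces for newlines and commas, so data is a single string; call data.split() to get a list of its whitespace-separated items.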



Answer 30:

textFile = open(r"E:\Values.txt", "r")  # raw string so the backslash is taken literally
textFileLines = textFile.readlines()

"textFileLines" is the array you wanted


