Looping over list and removing entries in Python

Question

I want to loop over a list in Python and remove particular items. I don't want to make a new list of accepted items, because in my complete example, I want to make a series of refinements to my list. Here is a simple example, in which I try to remove all numbers that are less than 3 in a list.

example = [1., 2., 3., 4., 5., 6.]
for e in example:
    if e < 3.:
        print("Removing:", e)
        example.remove(e)
    else:
        print("Accepting:", e)
print("NAIVE:", example)

Removing: 1.0
Accepting: 3.0
Accepting: 4.0
Accepting: 5.0
Accepting: 6.0
NAIVE: [2.0, 3.0, 4.0, 5.0, 6.0]

It fails. I think it fails because the removal of an item in a list messes with the indices that the for loop is running over, i.e. once the item 1. is removed, the item 2. is at place 0 in the list, but by that time, the loop is at place 1.
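
The skipping can be made visible with a small trace. This is only a sketch: the while loop imitates the hidden index that the for loop keeps internally, and the names `i` and `visited` are made up for illustration.

```python
example = [1., 2., 3., 4., 5., 6.]
visited = []  # hypothetical helper list: records which items the loop actually sees

i = 0  # stands in for the hidden index the for loop keeps
while i < len(example):
    e = example[i]
    visited.append(e)
    if e < 3.:
        example.remove(e)  # everything after e shifts one place to the left
    i += 1  # the index still advances, so the shifted item is skipped

print(visited)
print(example)
```

The item 2.0 never shows up in `visited`, which matches the missing "Removing: 2.0" line in the output above.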

I can fix this with deepcopy as follows:

import copy

example = [1., 2., 3., 4., 5., 6.]
# Iterate over a copy so that removals from example do not disturb the loop.
for e in copy.deepcopy(example):
    if e < 3.:
        print("Removing:", e)
        example.remove(e)
    else:
        print("Accepting:", e)
print("DEEPCOPY:", example)

Removing: 1.0
Removing: 2.0
Accepting: 3.0
Accepting: 4.0
Accepting: 5.0
Accepting: 6.0
DEEPCOPY: [3.0, 4.0, 5.0, 6.0]

This works here, but is it good practice? Will it result in other unexpected bugs? Is there a better way to achieve this? Or is this construction (looping over a list and removing from it) fundamentally unsound?

I don't want to make a new list of accepted items because I want to apply a series of criteria to my list, one by one, and remove items accordingly. I don't want a new list for each criterion I apply (there could be many), and I don't want to apply all my criteria in one go either (because it's helpful to see how many items are removed by each criterion, etc.).

Answer 1:

I don't see why you wouldn't just construct a new list with the items you want to keep, since you don't seem to mind constructing a new list (after all, that is what copy.deepcopy does).

So I would just do

example = [f for f in example if f >= 3]

If you do want to iterate over the list and change it, perhaps iterate over indices and go backwards:

for i in range(len(example) - 1, -1, -1):
    if example[i] < 3:
        del example[i]

But that's a bit unusual; I would avoid it unless really necessary.
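
If the only requirement is that the same list object ends up modified (for instance because other code holds a reference to it), slice assignment is another option. Note that this is a different technique from the loop above: it still builds a temporary list and then writes it back into the original object. The name `alias` is hypothetical and only there to show that the object identity is preserved.

```python
example = [1., 2., 3., 4., 5., 6.]
alias = example  # hypothetical second reference to the same list

# Replace the contents of the existing list object in place.
example[:] = [f for f in example if f >= 3]

print(example)
print(alias is example)
```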

To show that you don't need silly example_1, example_2, old_example etc variables, consider:

# Here are a number of tests for things we want to throw out
def some_really_complicated_test_on_a_number(f):
    # Put any kind of code here and return True if we want to
    # delete the number. This placeholder rejects nothing.
    return False

TESTS = (
    lambda f: f < 3,
    lambda f: f > 16,
    lambda f: (int(f) % 2) == 1,  # Integer is odd
    some_really_complicated_test_on_a_number,
    # etc
)

Here's a function that takes a list and a test, prints the items with "accepting" and "rejecting", and returns a new list with the remaining items:

def filter_with_prints(l, test):
    result = []
    for f in l:
        if test(f):
            print("Rejecting: {}".format(f))
        else:
            result.append(f)
            print("Accepting: {}".format(f))
    return result

And we can call a lot of tests like this:

example = [1., 2., 3., 4., 5., 6.]

for test in TESTS:
    example = filter_with_prints(example, test)

Answer 2:

You are right: the problem is that you are modifying the list while iterating over it. The iterator's position no longer matches the list's contents, which leads to skipped items and other errors. My question would be why you specifically want to remove items from the list rather than generate a new list that meets your criteria. Is there a specific requirement for that? Otherwise I would suggest making a new list that meets your restrictions instead of modifying the input list. So, modifying your code:

example = [1., 2., 3., 4., 5., 6.]
new_list = []
for e in example:
    if e >= 3.:
        new_list.append(e)
        print("Accepting:", e)
    else:
        print("Removing:", e)

This is less error-prone, but you could be more Pythonic and use a list comprehension for that:

new_list = [e for e in example if e >= 3.]

Edit: I see that the reason you want to remove items instead of creating new lists is that you go through the list several times to filter it. I still think that even in that case it is more readable, less error-prone, and not noticeably less efficient to create a new list each time. If efficiency were the problem and you had very large lists or something like that, I would try to iterate through the list only once and remove all invalid items in the same loop. However, if you really want to remove items from the list, you can do as @RemcoGerlich says and iterate backwards by index.

Source: https://stackoverflow.com/questions/31558845/looping-over-list-and-removing-entries-in-python

Tags

python

list

for-loop

deep-copy