Python CSV: write rows according to dict mapping

Question

I have a dict that describes a mapping I want applied to every row in a CSV file.

dict1 = {"key1":["value1", "value2"], "key2":["value3"]}

My program should read each row and map the key in a specific column to the value(s) the dict provides for it. If a key has only one value, the script should write the row to a new file with the key replaced by that value. If a key has multiple values, one new row should be written per value.

For example, csvin contains 2 rows. One row has a column in which key1 is present, and the other has key2. In this case, the output file csvout should contain 3 rows, one more than csvin. The two rows associated with key1 will be identical except for that single value.
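
The example above can be sketched in memory as follows. This is only an illustration under assumptions: the helper name expand_rows, the two-column sample data, and the column index 1 (standing in for column 25) are all hypothetical and not part of the original script.

```python
import csv
import io

# Mapping from the question; column index 1 below stands in for column 25.
dict1 = {"key1": ["value1", "value2"], "key2": ["value3"]}

def expand_rows(rows, mapping, column):
    """Yield one copy of each row per value that mapping gives for row[column]."""
    for row in rows:
        for value in mapping[row[column]]:
            new_row = list(row)      # copy so the input row is not mutated
            new_row[column] = value
            yield new_row

# Hypothetical input: a header plus two data rows, held in memory.
csv_text = "id,code\r\n1,key1\r\n2,key2\r\n"
reader = csv.reader(io.StringIO(csv_text))
out = io.StringIO()
writer = csv.writer(out)

writer.writerow(next(reader))                    # header passes through unchanged
writer.writerows(expand_rows(reader, dict1, 1))  # 2 data rows in, 3 data rows out

print(out.getvalue())
```

A key that is missing from the mapping raises KeyError in this sketch; whether such rows should be skipped or copied unchanged is not stated in the question.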

My current script is this:

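import csv

def convSan(sfin, cfout):
    # Python 3: open CSV files in text mode with newline=""
    with open(sfin, newline="") as fin:
        with open(cfout, "w", newline="") as fout:
            csvin = csv.reader(fin)
            csvout = csv.writer(fout, delimiter=",")
            fline = next(csvin)  # copy the header row unchanged
            csvout.writerow(fline)

            for row in csvin:
                # dict1[...] is a list, so the whole list ends up in one field
                row[25] = dict1[row[25]]
                csvout.writerow(row)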

This produces an output file with the same number of columns as the input file and fills in the correct new values, but where a key maps to several values the field holds the whole list instead of being split into one row per value.
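
The reason a list ends up inside a single field is that csv.writer converts any non-string field with str() before writing it. A minimal demonstration, using a made-up two-field row:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

# Putting the whole list into one field: csv.writer stringifies it with str()
writer.writerow(["row1", ["value1", "value2"]])

line = buf.getvalue()
print(line)
```

The field is quoted in the output because the list's string form contains the delimiter.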

The answer provided by @sr2222 works in the case of simple lists, but I cannot get it to work in my particular case.

Help is appreciated.

Answer 1:

First:

for index, value in enumerate(list1):
    list1[index] = list2[index]

is a cleaner way to write your first loop. However, for lists of equal length it simply copies every element of list2 into list1, much like list1 = copy.copy(list2). I think what you are trying to do is:

normalized_values = ['123', '456']
content = ['a123', '123', 'b456', '789']
for index, value in enumerate(content):
    for normalized_value in normalized_values:
        if normalized_value in value:
            content[index] = normalized_value

Which will leave you with:

content = ['123', '123', '456', '789']

Edit after question update:

replacement_map = {'123' : ('a123', '1234'), '456' : ('00456',)}
values = ['123', '456', '234', '123', '789']
output = []
for value in values:
    try:
        # known key: add every replacement value
        output.extend(replacement_map[value])
    except KeyError:
        # unknown key: keep the original value
        output.append(value)

The try/except is equivalent to:

if value in replacement_map:
    output.extend(replacement_map[value])
else:
    output.append(value)
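
A more compact variant of the same fallback can be sketched with dict.get. The sample data is copied from the example above; the list name values is an assumption chosen here so that the built-in input is not shadowed.

```python
replacement_map = {'123': ('a123', '1234'), '456': ('00456',)}
values = ['123', '456', '234', '123', '789']

output = []
for value in values:
    # Unknown keys fall back to a one-element tuple holding the original value
    output.extend(replacement_map.get(value, (value,)))

print(output)
```

Note that this builds a throwaway tuple for every value, so the try/except form may still be preferable when almost every key is present.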

In response to the comment about building the map from two lists as described above (note that this only behaves correctly if list1 and list2 are always the same length, because zip stops at the shorter one):

replacement_map = {}
for key, value in zip(list1, list2):
    try:
        replacement_map[key].append(value)
    except KeyError:
        replacement_map[key] = [value]
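
The same grouping can also be written with collections.defaultdict, which removes the need for the try/except. This is an alternative sketch, not the answer's own code, and the contents of list1 and list2 are hypothetical sample data.

```python
from collections import defaultdict

# Hypothetical sample lists: list1 holds the keys, list2 the paired values.
list1 = ['123', '123', '456']
list2 = ['a123', '1234', '00456']

replacement_map = defaultdict(list)
for key, value in zip(list1, list2):
    # A missing key starts out as an empty list, so append always works
    replacement_map[key].append(value)

print(dict(replacement_map))
```

One design caveat: looking up an unknown key in a defaultdict silently inserts an empty list, so converting the result with dict(...) before using it for lookups keeps the KeyError behaviour the earlier loop relies on.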

Answer 2:

For the interested, I was able to make it work like this:

import csv

def convSan(sfin, cfout):
    # Python 3: open CSV files in text mode with newline=""
    with open(sfin, newline="") as fin:
        with open(cfout, "w", newline="") as fout:
            csvin = csv.reader(fin)
            csvout = csv.writer(fout, delimiter=",")
            fline = next(csvin)  # copy the header row unchanged
            csvout.writerow(fline)

            for row in csvin:
                # ce.dict1200 (defined elsewhere) maps each key to a list of values
                dl = ce.dict1200[row[25]]
                # write one output row per mapped value, including the last one
                for value in dl:
                    row[25] = value
                    csvout.writerow(row)

The conversion is successful and, as required, my input file contains fewer rows than my output file.

Source: https://stackoverflow.com/questions/11657471/python-csv-write-rows-according-to-dict-mapping

Tags: python, csv, dictionary, mapping, rules