Merge two tables (CSV) if (table1 column A == table2 column A)

Posted by 。_饼干妹妹 on 2019-12-19 04:12:33

Question


I have two CSVs, openable in Numbers or Excel, structured:
| word | num1 |
and
| word | num2 |

If the two words are equal (say they're both 'hi'), I want the row to become:
| word | num1 | num2 |

here are some pictures:

So like for row 1, since both the words are the same, 'TRUE', I want it to become something like
| TRUE | 5.371748 | 4.48957 |

Either through some small script, or through some feature/function I'm overlooking.
Thanks!


Answer 1:


Use a dict:

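import csv

# Python 3: open CSV files in text mode with newline='' (not 'rb'/'wb')
with open('file1.csv', newline='') as file_a, open('file2.csv', newline='') as file_b:
    data_a = csv.reader(file_a)
    data_b = dict(csv.reader(file_b))  # maps word -> num2; needs exactly two columns per row
    with open('out.csv', 'w', newline='') as file_out:
        csv_out = csv.writer(file_out)
        for word, num_a in data_a:
            # use '' when the word has no match in file2
            csv_out.writerow([word, num_a, data_b.get(word, '')])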

(untested)
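
For anyone who wants to try the dict approach without creating files first, here is a small self-contained sketch. It assumes both tables have exactly two columns and no header row. The function name `merge_on_word` is just a name picked for illustration, and the sample rows other than TRUE are made up.

```python
import csv
import io


def merge_on_word(file_a, file_b, file_out):
    """Append file_b's number to each row of file_a that shares its word."""
    # Assumes every row of file_b has exactly two columns: word, number.
    lookup = dict(csv.reader(file_b))
    writer = csv.writer(file_out)
    for word, num_a in csv.reader(file_a):
        # Words missing from file_b get an empty third column.
        writer.writerow([word, num_a, lookup.get(word, '')])


# Made-up sample data; only the TRUE row comes from the question.
table1 = io.StringIO("TRUE,5.371748\nhi,1.5\n", newline='')
table2 = io.StringIO("TRUE,4.48957\nbye,2.5\n", newline='')
merged = io.StringIO(newline='')

merge_on_word(table1, table2, merged)
result = merged.getvalue()
print(result)
```

Note that this keeps every word from the first table but silently drops words that appear only in the second one ('bye' here). If those should be kept too, the loop would need a second pass over the leftover keys of the lookup dict.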




Answer 2:


For csv, I always reach for the data analysis library pandas. http://pandas.pydata.org/

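import pandas as pd

# names= assumes the files have no header row
df1 = pd.read_csv('file1.csv', names=['word','num1'])
df2 = pd.read_csv('file2.csv', names=['word','num2'])
df3 = pd.merge(df1, df2, on='word')  # inner join: keeps only words present in both files
df3.to_csv('merged_data.csv', index=False)  # index=False leaves out the row-number column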



Answer 3:


I think what you're looking for is zip, to let you iterate the two CSVs in lock-step:

import csv

with open('file1.csv', newline='') as f1, open('file2.csv', newline='') as f2:
    r1, r2 = csv.reader(f1), csv.reader(f2)
    with open('out.csv', 'w', newline='') as fout:
        w = csv.writer(fout)
        for row1, row2 in zip(r1, r2):  # stops at the end of the shorter file
            if row1[0] == row2[0]:
                w.writerow([row1[0], row1[1], row2[1]])

I'm not sure what you wanted to happen if they're not equal. Maybe insert both rows, like this?

            else:
                w.writerow([row1[0], row1[1], ''])
                w.writerow([row2[0], '', row2[1]])
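
Putting the two branches together, the lock-step idea might look like the sketch below. It works on in-memory data so it can be run as-is. The name `merge_lockstep` is hypothetical, and the sample rows other than TRUE are made up.

```python
import csv
import io


def merge_lockstep(rows_a, rows_b):
    """Yield merged rows, pairing row i of one table with row i of the other."""
    for row_a, row_b in zip(rows_a, rows_b):
        if row_a[0] == row_b[0]:
            yield [row_a[0], row_a[1], row_b[1]]
        else:
            # Words differ: keep both rows, leaving the other number blank.
            yield [row_a[0], row_a[1], '']
            yield [row_b[0], '', row_b[1]]


# Made-up sample data; only the TRUE row comes from the question.
table1 = io.StringIO("TRUE,5.371748\nhi,1.5\n")
table2 = io.StringIO("TRUE,4.48957\nbye,2.5\n")

merged_rows = list(merge_lockstep(csv.reader(table1), csv.reader(table2)))
for row in merged_rows:
    print(row)
```

Keep in mind that this only compares rows at the same position, so it depends on both files listing their words in the same order, and zip drops any extra rows at the end of the longer file.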


Source: https://stackoverflow.com/questions/21768722/merge-two-tables-csv-if-table1-column-a-table2-column-a
