Merging two CSV files in Python

Problem description:

I have two CSV files and I want to create a third CSV by merging the two. Here is how my files look:

Num | status
1213 | closed
4223 | open
2311 | open

The other file contains:

Num | code
1002 | 9822
1213 | 1891
4223 | 0011

Here is the small piece of code I tried. It loops over both files, but it does not print the rows with the third column added for the matching values.

def links():
    first = open('closed.csv')
    csv_file = csv.reader(first)

    second = open('links.csv')
    csv_file2 = csv.reader(second)

    for row in csv_file:  
        for secrow in csv_file2:                             
            if row[0] == secrow[0]:
                print row[0]+"," +row[1]+","+ secrow[0]
                time.sleep(1)

What I want is something like:

Num | status | code
1213 | closed | 1891
4223 | open | 0011
2311 | open | blank no match

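The problem is that a csv reader can be iterated over only once, so csv_file2 is exhausted after the first pass of the outer loop and the inner loop never runs again. To solve that, save the rows of csv_file2 in a list and iterate over the saved list. Note also that the files are delimited by "|", so the readers need delimiter="|", and the code is in secrow[1], not secrow[0]. It could look like this: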

import csv


def links():
    # A csv reader can be consumed only once, so read the second file into a list up front.
    with open('links.csv', newline='') as second:
        csv_file2 = csv.reader(second, delimiter="|")
        code_rows = list(csv_file2)

    with open('closed.csv', newline='') as first:
        csv_file = csv.reader(first, delimiter="|")
        for row in csv_file:
            match = False
            for secrow in code_rows:
                # Ignore the padding spaces around "|" when comparing keys.
                if row[0].replace(" ", "") == secrow[0].replace(" ", ""):
                    print(row[0] + "," + row[1] + "," + secrow[1])
                    match = True
            if not match:
                print(row[0] + "," + row[1] + ", blank no match")

Output:

Num , status, code
1213 , closed, 1891
4223 , open, 0011
2311 , open, blank no match
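
Since the question asks for a third CSV file rather than printed lines, the same idea could be sketched with a dictionary lookup instead of a nested loop. This is only a minimal sketch under stated assumptions: the function name merge_csv, its parameters, and the output file name are hypothetical and not from the original post. It assumes both files use "|" as the delimiter, that the key is in the first column, and that each key appears at most once in the second file.

```python
import csv


def merge_csv(status_path, code_path, out_path, delimiter="|", missing="blank no match"):
    """Left-join the status file with the code file on the first column.

    All names here are hypothetical; they are not part of the original post.
    """
    # Build a lookup of key -> code, stripping the padding around the delimiter.
    # The header row ("Num" -> "code") goes into the lookup like any other row.
    with open(code_path, newline="") as f:
        codes = {
            row[0].strip(): row[1].strip()
            for row in csv.reader(f, delimiter=delimiter)
            if len(row) >= 2
        }

    with open(status_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter=delimiter)
        for row in csv.reader(src, delimiter=delimiter):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            key, status = row[0].strip(), row[1].strip()
            # Fall back to the placeholder when the key has no code.
            writer.writerow([key, status, codes.get(key, missing)])


if __name__ == "__main__":
    # File names taken from the question; "merged.csv" is an assumed output name.
    merge_csv("closed.csv", "links.csv", "merged.csv")
```

With the two sample files above, this should write the rows Num|status|code, 1213|closed|1891, 4223|open|0011 and 2311|open|blank no match. The dictionary makes each lookup constant time, which matters once the files grow beyond a few hundred rows.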