Combining multiple text files using awk
I have some per-minute stats saved in text files named 1min.txt, 2min.txt, etc.
1min.txt
F1,21
F2,32
F3,22
2min.txt
F2,12
F4,32
I would like to combine these files in the following format:
combined.txt
Field  1min  2min
F1     21    0
F2     32    12
F3     22    0
F4     0     32
Some fields may not exist in some files, and 0 should be entered for those fields.
I've tried to do it using awk but couldn't find an easy way. Can someone please help?
Thanks
Using awk:
awk -F, '
# remember each input file name once, in the order the files are read
!seen[FILENAME]++ {
    fname[++numFile] = FILENAME
}
# record every field name and its value for the current file
{
    flds[$1]++
    map[FILENAME, $1] = $2
}
END {
    # header row: FIELD followed by each file name minus its .txt suffix
    printf "%-10s", "FIELD"
    for (cnt = 1; cnt <= numFile; cnt++) {
        file = fname[cnt]
        sub(/\.txt$/, "", file)
        printf "%-10s", file
    }
    print ""
    # one row per field; a missing value becomes 0 via the +0 coercion
    for (fld in flds) {
        printf "%-10s", fld
        for (cnt = 1; cnt <= numFile; cnt++) {
            printf "%-10s", map[fname[cnt], fld] + 0
        }
        print ""
    }
}' 1min.txt 2min.txt
Output:
FIELD     1min      2min
F1        21        0
F2        32        12
F3        22        0
F4        0         32
Once you have reviewed the output, you can redirect it to another file. You can pass as many files at the end as you want; if you have a great many, you can even use a shell glob, e.g. *.txt, as shown below.
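For example, a minimal sketch of that invocation, assuming the awk program above has been saved to a file named combine.awk (a name chosen here purely for illustration):

# hypothetical file name: the awk program above saved as combine.awk
awk -F, -f combine.awk *.txt > combined.txt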
Note: I haven't guaranteed the order of fields, since they are not always present in all files.
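If you do want a fixed order and you are on GNU awk, one possible tweak (a sketch, not part of the answer above) is to set PROCINFO["sorted_in"] at the top of the END block, so that the field-printing loop iterates over the field names in ascending string order:

END {
    # GNU awk only: make "for (fld in flds)" iterate in sorted (string ascending) order
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for (fld in flds) {
        printf "%-10s", fld
        for (cnt = 1; cnt <= numFile; cnt++)
            printf "%-10s", map[fname[cnt], fld] + 0
        print ""
    }
}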
Here is a pure-fun perl JAPH that will do the same:
perl -F, -lane'
$f{$ARGV}++; $h{$F[0]}
{$ARGV}= $F[ 1 ]
}{print join"\t",
"FIELD", map{s/.[tx]+
//x ;$_}sort{$a
<=>$b} keys%f;print
join"\n", map{$f
=$_; join
"\t", $f,map
{$h{$f
}{$_}
//=0}
sort{$a
<=>$b}
keys%f
}sort
keys%h;
' *.txt
Output:
FIELD     1min      2min
F1        21        0
F2        32        12
F3        22        0
F4        0         32