Extract differing lines from files using bash

Problem description:

I have two files, and I use the "comm -23 file1 file2" command to extract the lines that differ from one file to the other.

I also need something that extracts the differing lines but preserves the string "line_$NR". Example:

file1:

line_1: This is line0
line_2: This is line1
line_3: This is line2
line_4: This is line3

file2:

line_1: This is line1
line_2: This is line2
line_3: This is line3

I need this output (differences between file1 and file2):

line_1: This is line0.

In short, I need to compute the differences as if the files did not have line_$NR at the beginning, but when I print the result I also need to print line_$NR.
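
To illustrate why plain comm is not enough here, the following sketch recreates the two sample files and runs comm -23 on them unchanged. The temporary directory and the variable name unique are assumptions made for the demonstration; LC_ALL=C is set only so that the sort order comm expects is predictable.

```shell
#!/bin/sh
# Recreate the sample files from the question in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'line_1: This is line0' 'line_2: This is line1' \
              'line_3: This is line2' 'line_4: This is line3' > "$tmpdir/file1"
printf '%s\n' 'line_1: This is line1' 'line_2: This is line2' \
              'line_3: This is line3' > "$tmpdir/file2"

# comm compares whole lines, so the differing line_<n> prefixes make
# every line of file1 look unique to file1.
unique=$(LC_ALL=C comm -23 "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$unique"

rm -rf "$tmpdir"
```

All four lines of file1 are reported, because no line matches once the prefix is included in the comparison.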

Try using awk:

awk -F: 'NR==FNR {a[$2]; next} !($2 in a)' file2 file1

Output:

line_1: This is line0

Brief explanation

awk -F: '             # Field separator is ":"; $1 holds line_<n>, $2 holds the text after it
    NR==FNR {         # NR equals FNR only while the first file argument (file2) is being read
        a[$2];        # Store $2 as a key in the associative array a
        next          # Skip the remaining rules and go to the next record
    }
    !($2 in a)        # While reading file1: print the line if its $2 is not a key of a
' file2 file1
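
For anyone who wants to try the command end to end, here is a minimal sketch that recreates the sample files from the question and runs the same one-liner. The temporary directory and the variable name result are assumptions made for the demonstration.

```shell
#!/bin/sh
# Recreate the sample files from the question in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'line_1: This is line0' 'line_2: This is line1' \
              'line_3: This is line2' 'line_4: This is line3' > "$tmpdir/file1"
printf '%s\n' 'line_1: This is line1' 'line_2: This is line2' \
              'line_3: This is line3' > "$tmpdir/file2"

# First pass (file2): remember the text after the prefix.
# Second pass (file1): print whole lines whose text was not seen in file2.
result=$(awk -F: 'NR==FNR {a[$2]; next} !($2 in a)' "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$result"   # prints: line_1: This is line0

rm -rf "$tmpdir"
```

Note that the file to compare against (file2) is passed first, so it is the one loaded into the array.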


(Additional requirement from the comments)

And if I have multiple ":" in a line, e.g. "line_1: This :is: line0", it doesn't work. How can I take only the line_x prefix?

In that case, try the following (GNU awk only):

awk -F'line_[0-9]+:' 'NR==FNR {a[$2]; next} !($2 in a)' file2 file1
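
If a regular-expression field separator is not an option, one possible alternative is to cut each line at its first ":" with index() and substr(). This is only a sketch, not part of the original answer: the helper name rest, the array name seen, and the sample data are assumptions, and it presumes the line_<n> prefix itself never contains a colon.

```shell
#!/bin/sh
# Sample data with several ":" per line (hypothetical, modelled on the comment).
tmpdir=$(mktemp -d)
printf '%s\n' 'line_1: This :is: line0' 'line_2: This :is: line1' > "$tmpdir/file1"
printf '%s\n' 'line_1: This :is: line1' > "$tmpdir/file2"

# rest(s) returns everything after the first ":" in s, so later colons
# stay part of the compared text.
result=$(awk '
    function rest(s) { return substr(s, index(s, ":") + 1) }
    NR==FNR { seen[rest($0)]; next }   # first file (file2): remember the text
    !(rest($0) in seen)                # second file (file1): print unseen lines
' "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$result"   # prints: line_1: This :is: line0

rm -rf "$tmpdir"
```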