Removing newlines inside quoted strings in UNIX

Problem description:

I have a text file that goes like this:

abc 123 xyz
"abc
123" xyz

I want to replace each newline with a space (' ') if the newline occurs within a quoted string. So I want this output:

abc 123 xyz
abc 123 xyz

Is there a way to write a program in Unix for this?

You can print either a newline or just a space, depending on how many " characters have been seen so far. This way, a newline is printed only when the quotes are balanced, that is, when no quoted string is still open.

$ awk '{n=split($0,a,"\""); val+=(n-1); gsub("\"",""); printf "%s%s", $0, (val%2?" ":"\n")}' file
abc 123 xyz
abc 123 xyz

Explanation

  • n=split($0,a,"\"") counts how many " characters appear in the current line. split() returns the number of pieces the line breaks into with " as the delimiter, which is at least 1 for a non-empty line. (For an empty line split() returns 0, so n-1 becomes -1.)
  • val+=(n-1) keeps a running total of the quotes seen so far. The -1 is there because split() returns one more than the number of quotes.
  • gsub("\"","") removes the double quotes from the line.
  • printf "%s%s", $0, (val%2?" ":"\n") prints the line followed by either a space or a newline. If val is a multiple of 2, the quotes are balanced and a newline is printed; otherwise a quoted string is still open and a space is printed.
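As a variation on the steps above: gsub() returns the number of substitutions it made, so the quotes can be counted and removed in a single call. The sketch below is an alternative to the command in the answer, not the original; the function name join_quoted is made up for illustration. Since it does not rely on split(), an empty line adds 0 to the running total instead of -1.

```shell
# join_quoted (hypothetical name): read text on stdin, strip double quotes,
# and join lines that fall inside a quoted string with a single space.
join_quoted() {
    awk '{
        # gsub() returns how many quotes it removed from this line
        val += gsub(/"/, "")
        # odd running total: a quote is still open, so join with a space
        printf "%s%s", $0, (val % 2 ? " " : "\n")
    }'
}

# Usage: feed the sample input from the question through the function
printf 'abc 123 xyz\n"abc\n123" xyz\n' | join_quoted
```

Like the original command, this assumes every " is a string delimiter (no escaped quotes), and it prints no final newline if the input ends while a quote is still open.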
Another example:

$ cat a
abc 123 xyz
"abc
hee
123" xyz
and "this
is not everything"
$ awk '{n=split($0,a,"\""); val+=(n-1); gsub("\"",""); printf "%s%s", $0, (val%2?" ":"\n")}' a
abc 123 xyz
abc hee 123 xyz
and this is not everything