Removing newlines inside quoted strings in UNIX
Question:
I have a text file that looks like this:
abc 123 xyz
"abc
123" xyz
I want to replace each newline with a space (' ') if the newline occurs within a quoted string. So I want this output:
abc 123 xyz
abc 123 xyz
Is there a way to write a program in Unix for this?
Answer
You can print either a newline or just a space, depending on how many " characters have been seen so far. This way, a newline is printed only when the quotes are closed.
$ awk '{n=split($0,a,"\""); val+=(n-1); gsub("\"",""); printf "%s%s", $0, (val%2?" ":"\n")}' file
abc 123 xyz
abc 123 xyz
Explanation
- n=split($0,a,"\"") counts how many " appear in the current line. split() returns the number of pieces obtained when " is used as the delimiter, so a non-empty line with no quotes gives 1.
- val+=(n-1) keeps a running total of the quotes seen so far. The -1 is there because split() returns one more piece than there are quotes.
- gsub("\"","") removes the double quotes from the line.
- printf "%s%s", $0, (val%2?" ":"\n") prints the line followed by either a space or a newline. If val is a multiple of 2 (every quote opened so far has been closed), a newline is printed; otherwise, a space.
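To see the counting at work, here is a small tracing sketch. It prints the per-line quote count, the running total, and whether the line ends inside a quoted string, instead of the joined text. The function name trace_quotes is only an illustrative helper and not part of the original answer.

```shell
# trace_quotes: hypothetical helper that shows the bookkeeping only.
# It reads from the files given as arguments, or from stdin if none.
trace_quotes() {
  awk '{
    n = split($0, a, "\"")   # pieces = quotes + 1 on a non-empty line
    val += n - 1             # running total of quotes seen so far
    printf "quotes=%d total=%d inside=%s\n", n - 1, val, (val % 2 ? "yes" : "no")
  }' "$@"
}

# Feed it the sample from the question.
printf '%s\n' 'abc 123 xyz' '"abc' '123" xyz' | trace_quotes
```

For the three sample lines this reports quotes=0 total=0 inside=no, then quotes=1 total=1 inside=yes, then quotes=1 total=2 inside=no, which is why only the second line is followed by a space.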
Another example:
$ cat a
abc 123 xyz
"abc
hee
123" xyz
and "this
is not everything"
$ awk '{n=split($0,a,"\""); val+=(n-1); gsub("\"",""); printf "%s%s", $0, (val%2?" ":"\n")}' a
abc 123 xyz
abc hee 123 xyz
and this is not everything
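One edge case to be aware of: split() returns 0 for an empty line, so n-1 becomes -1 and a blank line in the input throws off the running count. A possible variant, sketched here as an alternative rather than as the original answer's method, uses the return value of gsub(), which is the number of substitutions made and therefore the number of quotes removed. The function name join_quoted is hypothetical.

```shell
# join_quoted: hypothetical helper; a variant of the one-liner above that
# counts quotes through the return value of gsub() instead of split().
join_quoted() {
  awk '{
    val += gsub(/"/, "")     # remove the quotes and count them in one step
    printf "%s%s", $0, (val % 2 ? " " : "\n")
  }' "$@"
}

# Sample input with a blank line between the two records.
printf '%s\n' 'abc 123 xyz' '' '"abc' '123" xyz' | join_quoted
```

With this input the blank line is passed through unchanged and the quoted record is still joined into abc 123 xyz. Like the original, this sketch assumes every " is a string delimiter and does not handle escaped quotes.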