PowerShell script to replace commas inside double quotes with nothing
I have a comma-separated CSV file in which I want to remove the commas that appear inside double-quoted fields, and also remove the double quotes themselves:
Editor's note: The original form of this question asked to "change [the] delimiter to pipe" (|), which is no longer a requirement; gms0ulman's answer was written when it still was.
$inform = Get-Content C:\test.csv
$inform | % {
$info = $_.ToString().Replace(",","")
$var = $info
$var | Out-file C:\test1.csv -Append
}
Any help would be appreciated.
In:
1,2,"Test,ABC"
Out:
1,2,TestABC
The following should do what you want (tested in PSv5.1):
Import-Csv C:\test.csv | ForEach-Object -Begin { $writeHeader = $True } {
if ($writeHeader) { $writeHeader = $False; $_.psobject.properties.Name -join ',' }
$_.psobject.properties.Value -replace ',', '' -join ','
} | Set-Content -Encoding UTF8 test1.csv
- Import-Csv reads your CSV file into custom objects ([pscustomobject] instances) whose properties contain the column values with the enclosing double quotes removed.
  - Since the column values are then stored in distinct properties, column-internal , instances can be replaced blindly, without worrying about the column-separating , instances.
  - That the enclosing double quotes are stripped automatically is a beneficial side effect, though care must be taken not to reintroduce them on output - read on.
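To make the parsing step concrete, here is a minimal sketch that uses ConvertFrom-Csv, which parses CSV text the same way Import-Csv parses a file. The header names A, B, C and the variable names are assumptions for illustration only; the question does not show a header row.

```powershell
# Hypothetical in-memory sample standing in for C:\test.csv
# (the header row A,B,C is an assumption; it is not shown in the question).
$csvLines = 'A,B,C', '1,2,"Test,ABC"'

# Parse the lines into a [pscustomobject] with properties A, B and C.
$row = $csvLines | ConvertFrom-Csv

# The quoted field arrives as one value, without its enclosing double quotes.
$row.C                              # Test,ABC

# The embedded comma can now be removed without touching the column separators.
$cleaned = $row.C -replace ',', ''
$cleaned                            # TestABC
```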
The problem is that you cannot use Export-Csv after modifying the objects, because in Windows PowerShell it invariably adds double quotes back around all output values.
Therefore, a custom mini-script must be executed for each custom object, using ForEach-Object:
- -Begin { $writeHeader = $True } is executed once at the beginning to signal the need to output a header row before the first data row.
- $_.psobject.properties is the collection of all properties defined on the input object, named for the header columns, and containing a given data row's values.
- $_.psobject.properties.Name -join ',' outputs the header row, simply by joining the property names - which are the column headers - with , to yield a single output string.
- $_.psobject.properties.Value -replace ',', '' removes any value-internal , instances (replaces them with the empty string), and -join ',' then joins the resulting values as-is with , to output a data row.
- Set-Content - which is preferable to Out-File here, because the output objects are already strings - is used to write to the output file.
  - Note the -Encoding parameter, which controls the output character encoding; adjust as needed.
  - In Windows PowerShell (versions up to v5.1), omitting -Encoding would make Set-Content default to your system's "ANSI" code page (even though the help topic claims ASCII), whereas Out-File would default to UTF-16LE ("Unicode").
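The steps above can be tried without touching any files. The following is a rough sketch, not a replacement for the answer's command: it runs the same ForEach-Object logic on a hypothetical two-line sample (the header row A,B,C is an assumption) and collects the result in a variable instead of piping it to Set-Content. For comparison, it also runs the objects through ConvertTo-Csv, which shares Export-Csv's default quoting behavior.

```powershell
# Hypothetical in-memory sample; the header row A,B,C is assumed for illustration.
$csvLines = 'A,B,C', '1,2,"Test,ABC"'

# Same logic as the answer's command, but collecting the strings in a variable.
$output = $csvLines | ConvertFrom-Csv |
  ForEach-Object -Begin { $writeHeader = $true } -Process {
    # Emit the header row once, before the first data row.
    if ($writeHeader) { $writeHeader = $false; $_.psobject.Properties.Name -join ',' }
    # Strip value-internal commas, then join the values into one data row.
    $_.psobject.Properties.Value -replace ',', '' -join ','
  }

$output    # A,B,C
           # 1,2,TestABC

# For comparison: by default, ConvertTo-Csv double-quotes every value again.
$quoted = $csvLines | ConvertFrom-Csv | ConvertTo-Csv -NoTypeInformation
$quoted
```

To write the cleaned lines to a file, pipe $output to Set-Content with a suitable -Encoding value, as in the answer's command.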