How to split a large CSV into multiple smaller CSVs using Linux?
Question:
I need to split a large CSV file into multiple files by row count, using PHP and Linux.
The CSV contains:
"id","name","address"
"1","abc","this is test address1 which having multiple newline
separators."
"2","abc","this is test address2
which having multiple newline separators"
"3","abc","this is test address3.
which having multiple
newline separators."
I used the Linux command: split -l 5000 testfile
But it does not split the CSV correctly. The address field contains multiple newline characters, and split counts physical lines, so it cuts the file in the middle of a record.
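To see why a line-based split breaks this file, it helps to compare the number of physical lines with the number of CSV records. The following is a minimal sketch in Ruby using the sample data from the question; the variable names are illustrative only.

```ruby
require "csv"

# Sample data copied from the question: a header plus 3 records whose
# quoted address fields contain embedded newlines.
sample = <<~EOS
  "id","name","address"
  "1","abc","this is test address1 which having multiple newline
  separators."
  "2","abc","this is test address2
  which having multiple newline separators"
  "3","abc","this is test address3.
  which having multiple
  newline separators."
EOS

physical_lines = sample.lines.size                  # what `split -l` counts
record_count = CSV.parse(sample, headers: true).size # what a CSV parser sees

puts "physical lines: #{physical_lines}"  # prints "physical lines: 8"
puts "CSV records:    #{record_count}"    # prints "CSV records:    3"
```

Because the two counts differ, any tool that counts newlines can place a file boundary inside a quoted field. A CSV-aware reader is needed to count records instead.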
I have also tried using PHP:
$inputFile = 'filename.csv';
$outputFile = "outputfile";
$splitSize = 5000;
$in = fopen($inputFile, 'r');
$header = fgetcsv($in);
$rowCount = 0;
$fileCount = 1;
while (!feof($in)) {
    if (($rowCount % $splitSize) == 0) {
        if ($rowCount > 0) {
            fclose($out);
        }
        $filename = $outputFile . $fileCount++;
        $out = fopen($filename .'.csv', 'w');
        chmod($filename,777);
        fputcsv($out, $header);
    }
    $data = fgetcsv($in);
    if ($data) {
        fputcsv($out, $data);
        $rowCount++;
    }
}
fclose($out);
How can I solve this problem?
Answer:
Using Ruby:
ruby -e 'require "csv"
f = ARGV.shift
CSV.foreach(f).with_index{ |e, i|
File.write("#{f}.#{i}", CSV.generate_line(e, force_quotes: true))
}' file.csv
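The one-liner above writes one file per record, including the header row, and does not repeat the header in each file. If chunks of several records with a repeated header are wanted, as in the question, something along these lines might work. This is a sketch only: split_csv and the output naming scheme (file.csv.1.csv, file.csv.2.csv, ...) are hypothetical choices, not part of the original answer.

```ruby
require "csv"

# Hypothetical helper: split `path` into files holding at most
# `rows_per_file` records each, repeating the header in every file.
# Returns the list of output file names.
def split_csv(path, rows_per_file)
  outputs = []
  CSV.open(path, "r") do |csv|
    header = csv.shift # first parsed record is the header row
    # CSV#each continues from the current position, so the header is skipped.
    csv.each_slice(rows_per_file).with_index(1) do |rows, n|
      out_name = "#{path}.#{n}.csv" # assumed naming scheme
      CSV.open(out_name, "w", force_quotes: true) do |out|
        out << header
        rows.each { |row| out << row }
      end
      outputs << out_name
    end
  end
  outputs
end

# Example call (commented out so the sketch has no side effects):
# split_csv("file.csv", 5000)
```

Since the parser yields whole records, a quoted field with embedded newlines always stays inside a single output file.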
PHP:
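<?php
$inputFile = 'file.csv';
$outputFile = 'file.out';
$splitSize = 1; // records per output file (the question uses 5000)

if (($in = fopen($inputFile, 'r'))) {
    // fgetcsv() parses whole records, so quoted newlines stay inside a field
    $header = fgetcsv($in);
    $rowCount = 0;
    $fileCount = 0;
    $out = null;
    while (($data = fgetcsv($in)) !== false) {
        if (($rowCount % $splitSize) == 0) {
            // start a new output file, closing the previous one first
            if ($out) {
                fclose($out);
            }
            $filename = $outputFile . ++$fileCount . '.csv';
            $out = fopen($filename, 'w');
            chmod($filename, 0755); // mode must be an octal literal
            fputcsv($out, $header);
        }
        fputcsv($out, $data);
        $rowCount++;
    }
    // $out stays null when the input has no data rows
    if ($out) {
        fclose($out);
    }
    fclose($in);
}
?>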