Merge CSV files into one file

Problem description:

I have a set of CSV files and I want to merge them into a single CSV file. The command below runs for some time, but I can't find the resulting file at the destination path:

hdfs dfs -getmerge /DATA /data1/result.csv

Any help is appreciated. Thanks.

getmerge

Usage: hadoop fs -getmerge [-nl] <src> <localdst>

Takes a source directory and a destination file as input and concatenates the files in src into the destination local file. Optionally, -nl can be set to add a newline character (LF) at the end of each file. --skip-empty-file can be used to avoid unwanted newline characters in the case of empty files.

Examples:

 hadoop fs -getmerge -nl /src /opt/output.txt

 hadoop fs -getmerge -nl /src/file1.txt /src/file2.txt /output.txt

Exit code:

Returns 0 on success and non-zero on error.
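Since getmerge reports success or failure only through its exit code, a script can branch on it. Here is a minimal sketch of that pattern, where `merge_parts` is a hypothetical stand-in for the real `hadoop fs -getmerge /DATA /data1/result.csv` call (plain `cat` on local files is used so the pattern can be shown without a Hadoop cluster):

```shell
#!/bin/sh
# Create two small local "part" files standing in for HDFS parts.
printf 'a,1\n' > part-00000.csv
printf 'b,2\n' > part-00001.csv

# Hypothetical stand-in for: hadoop fs -getmerge /DATA /data1/result.csv
merge_parts() { cat part-00000.csv part-00001.csv > result.csv; }

# Branch on the command's exit code: 0 means success, non-zero means error.
if merge_parts; then
    echo "merge succeeded"
else
    echo "merge failed with exit code $?"
fi
```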

If getmerge somehow does not work for you, you can try the cat command instead (provided your data is not too large):

 hadoop fs -cat /DATA/* > /<local_fs_dir>/result.csv

 hadoop fs -copyFromLocal /<local_fs_dir>/result.csv /data1/result.csv
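What these two commands do can be simulated entirely on the local filesystem: concatenate every part file into one CSV, then place the result at the destination. A sketch, assuming local stand-in directories `DATA/` and `data1/` in place of the HDFS paths `/DATA` and `/data1` (no Hadoop cluster involved):

```shell
#!/bin/sh
# DATA/ and data1/ are assumed local stand-ins for the HDFS paths
# /DATA and /data1 used in the answer above.
mkdir -p DATA data1
printf 'x,1\n' > DATA/part-00000.csv
printf 'y,2\n' > DATA/part-00001.csv

# Step 1: concatenate all part files into one merged CSV
# (mirrors: hadoop fs -cat /DATA/* > /<local_fs_dir>/result.csv)
cat DATA/* > result.csv

# Step 2: place the merged file at the destination
# (mirrors: hadoop fs -copyFromLocal ... /data1/result.csv)
cp result.csv data1/result.csv
```

The shell expands `DATA/*` in lexicographic order, so part files are concatenated in the same predictable order that `-cat /DATA/*` would use.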