Copy specific files from multiple sub-directories into a single folder in R
Assuming I have 3 folders with a large number of files in each, I want to select only a few files from each sub-directory and paste only those files into a new folder. Let's call the 3 folders:
- desktop/dir/sub_11s_gi01_ab
- desktop/dir/sub_11f_gi01_b
- desktop/dir/sub_12s_gi02_ms
The files that need to be copied have the extension ".wang.tax.sum"
Copying everything and then deleting the unwanted files is not an option, because it would take days.
From other questions, I know how to combine all the files into a list and copy all of them, but I don't know how to copy only the files that end with .wang.tax.sum
I can use the grep function to get a list of the files that I want to transfer, but I'm not sure how to copy that list of files from their sub-directories to a new folder.
Here's what I have so far, which does not work.
parent.folder <- "C:/Desktop/dir"
my_dirs <- list.files(path = parent.folder, full.names = T, recursive = T, include.dirs = T)
##this does not work##
a <- grep("wang.tax.sum",my_dirs)
my_dirs <- my_dirs[a]
files <- sapply(my_dirs, list.files, full.names = T)
dir.create("taxsum", recursive = T)
for(file in files) {
file.copy(file, "taxsum")
}
I know that the grep is not working here, but I'm not sure how to write a function that selects only the files I want and copies them to a single folder. I have roughly 50 sub-folders in total, each holding about 1 GB of data, so again, copying all the data and then deleting what I don't want is not an option. Any help is greatly appreciated.
parent.folder <- "C:/Desktop/dir"
files <- list.files(path = parent.folder, full.names = TRUE, recursive = TRUE, include.dirs = TRUE)
After this you need to select the relevant files:
files <- files[grep("wang\\.tax\\.sum", files)]
(Note the double backslash before each dot, written \\. in the R string: on its own, a dot is a regular-expression metacharacter that matches any single character.)
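To make the effect of the escaping concrete, here is a minimal sketch. The file names are made up for illustration, and the trailing "$" anchor and the fixed = TRUE variant are additions that the answer above does not use.

```r
# Hypothetical file names, invented for this illustration only.
candidates <- c("sample1.wang.tax.sum",
                "sample1_wangXtaxXsum.txt",
                "sample2.wang.tax.sum.bak")

# Unescaped dots match any character, so every name here matches.
loose <- grepl("wang.tax.sum", candidates)

# Escaped dots match literal dots only; "$" ties the match to the end of the name.
strict <- grepl("\\.wang\\.tax\\.sum$", candidates)

# fixed = TRUE treats the pattern as a plain string, so no escaping is needed,
# but it cannot express "ends with".
literal <- grepl(".wang.tax.sum", candidates, fixed = TRUE)

print(data.frame(candidates, loose, strict, literal))
```

Whether the end anchor is wanted depends on whether files such as backups with a further suffix should be skipped.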
Or you could do this in one step with the pattern argument to list.files:
files <- list.files(path = parent.folder, full.names = TRUE, recursive = TRUE, include.dirs = TRUE, pattern = "wang\\.tax\\.sum")
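The one-step approach can be tried safely on a throwaway directory tree before running it on the real data. The sketch below builds such a tree under tempdir(). All directory and file names in it are hypothetical, and the "$" anchor in the pattern is an assumption about wanting only names that end with the extension.

```r
# Build a small throwaway tree; every name below is hypothetical.
root <- file.path(tempdir(), "dir_demo")
unlink(root, recursive = TRUE)
subs <- file.path(root, c("sub_a", "sub_b"))
for (s in subs) dir.create(s, recursive = TRUE)
file.create(file.path(subs[1], c("x.wang.tax.sum", "x.fasta")))
file.create(file.path(subs[2], c("y.wang.tax.sum", "y.wang.tax.sum.tmp")))

# pattern is matched against file names, so anchoring with "$" keeps only
# names that end with the wanted extension.
found <- list.files(path = root, pattern = "\\.wang\\.tax\\.sum$",
                    full.names = TRUE, recursive = TRUE)
print(basename(found))
```

Only the file names are listed at this stage, so nothing large is read or copied.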
Create the new directory:
dir.create("taxsum", recursive = TRUE)
Now you need to create new filenames:
newnames <- paste0("taxsum/", gsub("/|:", "_", files))
# replace "special" characters with underscore
# so that your file names will be different and contain the
# original path
# alternatively, if you know that file names will be different:
newnames <- paste0("taxsum/", basename(files))
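Which of the two naming schemes is safe depends on whether any base names repeat across sub-folders. A minimal sketch of checking this first is shown below. The paths are invented for illustration, and choosing the scheme automatically is an assumption, not part of the answer above.

```r
# Hypothetical paths; the first two share the same base name.
files <- c("C:/Desktop/dir/sub_a/run.wang.tax.sum",
           "C:/Desktop/dir/sub_b/run.wang.tax.sum",
           "C:/Desktop/dir/sub_c/other.wang.tax.sum")

# anyDuplicated returns 0 when all base names are unique.
has_clash <- anyDuplicated(basename(files)) > 0

newnames <- if (has_clash) {
  # Keep the original path in the name so the copies cannot overwrite each other.
  file.path("taxsum", gsub("/|:", "_", files))
} else {
  file.path("taxsum", basename(files))
}
print(newnames)
```

By default file.copy does not overwrite an existing destination, so with clashing base names the later files would simply not be copied.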
Now you can copy with mapply (the same can be done with a for loop, with a little extra effort):
mapply(file.copy, from=files, to=newnames)
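For reference, here is a self-contained sketch of the whole procedure run against a throwaway tree under tempdir(), including the for-loop variant and a check of the logical values that file.copy returns. All directory and file names are hypothetical. It also relies on file.copy accepting vectors of equal length for from and to, which makes mapply optional.

```r
# Throwaway source tree and destination; every name here is hypothetical.
root <- file.path(tempdir(), "copy_demo")
unlink(root, recursive = TRUE)
src <- file.path(root, "dir", c("sub_a", "sub_b"))
for (s in src) dir.create(s, recursive = TRUE)
writeLines("a", file.path(src[1], "a.wang.tax.sum"))
writeLines("b", file.path(src[2], "b.wang.tax.sum"))
writeLines("skip", file.path(src[2], "b.fasta"))

dest <- file.path(root, "taxsum")
dir.create(dest, recursive = TRUE)

# Select only the wanted files, then build one destination path per file.
files <- list.files(file.path(root, "dir"), pattern = "\\.wang\\.tax\\.sum$",
                    full.names = TRUE, recursive = TRUE)
newnames <- file.path(dest, basename(files))

# Vectorised copy: one TRUE/FALSE result per file.
ok <- file.copy(from = files, to = newnames)

# Equivalent explicit loop; overwrite = TRUE because the files now exist.
ok_loop <- logical(length(files))
for (i in seq_along(files)) {
  ok_loop[i] <- file.copy(files[i], newnames[i], overwrite = TRUE)
}

# Report anything that failed to copy instead of failing silently.
if (!all(ok)) warning("Not copied: ", paste(files[!ok], collapse = ", "))
print(list.files(dest))
```

Checking the returned values matters on large transfers, since file.copy reports a failure with FALSE rather than an error.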