Using R, how can one count the number of pages in a PDF file?
I have about a hundred long PDF files in a directory and would like to know whether R can count how many pages are in each file. My operating system is Windows 8.
Here is a link to a 10-page PDF file, in case it helps you test your solution: MWE pdf file
It appears to be possible to count PDF pages with Python (python solution), but I don't know that language. Other solutions have been discussed on SO using, e.g., ImageMagick and C#.
I'm working on a Windows 7 machine, but my experience with Windows 8 makes me think this should work just as well for you.
I wasn't able to compile the Rpoppler package, and as hrbrmstr points out, it's probably not worth fighting. If you have 7-Zip, you can extract the poppler tools for Windows. I've extracted them to C:\poppler. Once they are there, I can do the following:
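file_name <- "C:/[file_path]/whitepaper-pdfprimer.pdf"

# Return the page count of one PDF by parsing the output of poppler's pdfinfo
pdf_pages <- function(file_name){
  require(magrittr)  # provides the %>% pipe
  # quote the path so that file names containing spaces survive the call
  pages <- system2("C:/poppler/bin/pdfinfo.exe",
                   args = shQuote(file_name, type = "cmd"),
                   stdout = TRUE)
  # keep only the "Pages:" line and strip the label, leaving the number
  pages[grepl("^Pages:", pages)] %>%
    gsub("Pages:", "", .) %>%
    as.numeric()
}

pdf_pages(file_name)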
And if you have a vector of file names you want to pass:
vapply(file_names, pdf_pages, numeric(1))
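For the case in the question, a whole directory of PDFs, the vector of file names can be built with list.files(). The following is only a minimal sketch: the directory "C:/my_pdfs" is a placeholder, and the names parse_pdfinfo_pages() and count_pdf_pages() are introduced here for illustration. It assumes pdfinfo.exe sits in the C:/poppler/bin location used above. The parsing step is split into its own function so it can be checked without running pdfinfo.

```r
# Hypothetical helper: pull the page count out of pdfinfo's text output.
# Returns NA if no "Pages:" line is present (e.g. pdfinfo failed on the file).
parse_pdfinfo_pages <- function(info_lines) {
  page_line <- grep("^Pages:", info_lines, value = TRUE)
  if (length(page_line) == 0) return(NA_real_)
  as.numeric(sub("^Pages:\\s*", "", page_line[1]))
}

# Hypothetical wrapper: count pages for every PDF in a directory.
# pdfinfo_path is an assumption; adjust it to wherever poppler was extracted.
count_pdf_pages <- function(pdf_dir,
                            pdfinfo_path = "C:/poppler/bin/pdfinfo.exe") {
  file_names <- list.files(pdf_dir, pattern = "\\.pdf$",
                           full.names = TRUE, ignore.case = TRUE)
  vapply(file_names, function(f) {
    info <- system2(pdfinfo_path,
                    args = shQuote(f, type = "cmd"),  # protect paths with spaces
                    stdout = TRUE)
    parse_pdfinfo_pages(info)
  }, numeric(1))
}

# Example call; "C:/my_pdfs" is a placeholder directory.
# page_counts <- count_pdf_pages("C:/my_pdfs")
```

The result is a numeric vector named by file path, so files that pdfinfo could not read show up as NA rather than stopping the whole run.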
Credit to @hrbrmstr for pointing out the poppler tools (I'd never heard of them until today).