How can I make Perl's File::Find faster?
I have a folder named Lib, and I am using the File::Find module to search for it across a whole drive, say D:\. The search takes a long time, as much as 5 minutes if the drive has a lot of subdirectories. How can I find Lib faster, so that the search finishes in seconds?
My code looks like this:
find( \&Lib_files, $dir );

sub Lib_files
{
    return unless -d;
    if ($_ =~ m/^([L|l]ib(.*))/)
    {
        print "$_";
    }
    return;
}
Searching the file system without a preexisting index is IO bound. Otherwise, products ranging from locate to Windows Desktop Search would not exist.
Type D:\> dir /b/s > directory.lst and observe how long it takes for that command to run. You should not expect to beat that without indexing files first.
One major improvement you can make is to print less often. A minor improvement is not to use capturing parentheses if you are not going to capture:
my @dirs;

sub Lib_files {
    return unless -d $File::Find::name;
    if ( /^[Ll]ib/ ) {
        push @dirs, $File::Find::name;
    }
    return;
}
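For reference, here is a minimal self-contained sketch of the same idea: collect the matches during the traversal and print them in a single call at the end. The helper name find_lib_dirs and the choice of the first command-line argument (or the current directory) as the starting point are assumptions made for this example. It tests `-d` against `$_`, as the question's code does, so that a relative starting directory also works.

```perl
use strict;
use warnings;
use File::Find;

# find_lib_dirs is a hypothetical helper name used only in this sketch.
# It returns the full paths of all directories under $root whose own
# name starts with "Lib" or "lib".
sub find_lib_dirs {
    my ($root) = @_;
    my @dirs;
    find(
        sub {
            # $_ holds the entry name relative to the directory that
            # File::Find has changed into, so -d can test it directly.
            return unless -d;
            push @dirs, $File::Find::name if /^[Ll]ib/;
        },
        $root,
    );
    return @dirs;
}

# Assumption: start at the first argument, else the current directory.
my $root = @ARGV ? $ARGV[0] : '.';

# One print call after the traversal instead of one per match.
print map { "$_\n" } find_lib_dirs($root);
```

This still walks every directory, so it remains bound by the same IO cost; it only removes the per-match printing and the unused captures.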
On my system, a simple script using File::Find to print the names of all subdirectories under my home directory with about 150,000 files takes a few minutes to run compared to dir %HOME% /ad/b/s > dir.lst which completes in about 20 seconds.
I would be inclined to use:
use File::Basename;
my @dirs = grep { fileparse($_) =~ /^[Ll]ib/ }
split /\n/, `dir %HOME% /ad/b/s`;
which completed in under 15 seconds on my system.
If there is a chance that some other dir.exe is in %PATH%, cmd.exe's built-in dir will not be invoked. You can use qx! cmd.exe /c dir %HOME% /ad/b/s ! to make sure that the right dir is invoked.
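Putting the pieces together, a rough sketch might look like the following. The helper name lib_dirs is made up for this example, and the script assumes that the HOME environment variable is set on the Windows machine. The call to fileparse_set_fstype only makes File::Basename split on backslashes regardless of the host OS, so the filter can also be tried on non-Windows systems.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse fileparse_set_fstype);

# Parse paths using Windows rules (backslash and colon separators),
# whatever operating system the script happens to run on.
fileparse_set_fstype('MSWin32');

# lib_dirs is a hypothetical helper: it keeps only the paths whose
# last component starts with "Lib" or "lib".
sub lib_dirs {
    return grep {
        my ($name) = fileparse($_);
        $name =~ /^[Ll]ib/;
    } @_;
}

if ( $^O eq 'MSWin32' ) {
    # Explicitly ask cmd.exe for its built-in dir; %HOME% is expanded
    # by cmd.exe, not by Perl (assumes HOME is set).
    my @all = split /\r?\n/, qx! cmd.exe /c dir %HOME% /ad/b/s !;
    print map { "$_\n" } lib_dirs(@all);
}
```

Keeping the filter in its own sub separates the Windows-only part (running dir) from the part that decides which directories match.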