How can I dynamically build a Perl regular expression?
I have a Perl script that traverses a directory hierarchy using File::Next::files. It returns only files whose names end in ".avi", ".flv", ".mp3", ".mp4", or ".wmv". It also skips the ".svn" subdirectory and any subdirectory whose name ends in ".frames". This is specified in the file_filter and descend_filter subroutines below.
my $iter = File::Next::files(
{ file_filter => \&file_filter, descend_filter => \&descend_filter },
$directory );
sub file_filter {
# Called from File::Next::files.
# Only select media files that end with one of the following extensions.
/\.(avi|flv|mp3|mp4|wmv)$/
}
sub descend_filter {
# Called from File::Next::files.
# Skip subfolders that either end in ".frames" or are named ".svn".
$File::Next::dir !~ /\.frames$|^\.svn$/
}
What I want to do is place the allowed file extensions and disallowed sub directory names in a configuration file so they can be updated on the fly.
What I want to know is how to code the subroutines so that they build these regex constructs from the parameters in the configuration file:
/\.(avi|flv|mp3|mp4|wmv)$/
$File::Next::dir !~ /\.frames$|^\.svn$/
Assuming that you've parsed the configuration file to get a list of extensions and ignored directories, you can build the pattern as a string and then use the qr operator to compile it into a regular expression:
my @extensions = qw(avi flv mp3 mp4 wmv); # parsed from file
my $pattern = '\.(' . join('|', @extensions) . ')$';
my $regex = qr/$pattern/;
if ($file =~ $regex) {
# do something
}
The compilation isn't strictly necessary; you can use the string pattern directly:
if ($file =~ /$pattern/) {
# do something
}
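The answer assumes the configuration file has already been parsed. As a rough sketch of that step, the following reads a made-up "key = value" format and builds the extension regex from it. The format, the key name ext, and the helper parse_config are all assumptions for illustration; a real script might use an existing config module instead.

```perl
use strict;
use warnings;

# Hypothetical config format (an assumption, not from the question):
# one "key = value" pair per line; keys may repeat; '#' starts a comment.
sub parse_config {
    my ($text) = @_;
    my %config;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(?:#|$)/;      # skip comments and blank lines
        my ($key, $value) = $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/
            or next;                          # ignore malformed lines
        next unless length $value;            # ignore keys with no value
        push @{ $config{$key} }, $value;      # collect repeated keys as a list
    }
    return \%config;
}

my $config = parse_config(<<'END');
# allowed extensions
ext = avi
ext = mp4
END

my @extensions = @{ $config->{ext} || [] };

# quotemeta keeps any regex metacharacters in the config values literal.
my $alternation = join '|', map { quotemeta } @extensions;
my $ext_regex   = qr/\.(?:$alternation)$/;

for my $file ('movie.avi', 'notes.txt') {
    print "$file: ", ($file =~ $ext_regex ? "keep" : "skip"), "\n";
}
```

With the sample text above, movie.avi is kept and notes.txt is skipped. Re-running parse_config and rebuilding $ext_regex is all it takes to pick up an edited file.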
Directories are a little harder because you have two different situations: full names and suffixes. Your configuration file will have to use different keys to make it clear which is which, e.g. "dir_name" and "dir_suffix". For full names I'd just build a hash:
my %ignore = ('.svn' => 1);
Suffixed directories can be done the same way as file extensions:
my $dir_pattern = '(?:' . join('|', map { quotemeta } @dir_suffix) . ')$';
my $dir_regex = qr/$dir_pattern/;
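The quotemeta call matters because a suffix such as ".frames" contains a dot, which is a wildcard in a regex. A small sketch of the difference, using an example suffix list as if it had been read from the config (the variable names $loose and $escaped are just for illustration):

```perl
use strict;
use warnings;

my @dir_suffix = ('.frames');   # example value, as if read from the config

# Without quotemeta the dot matches any character; with it, only a literal dot.
my $loose   = '(?:' . join('|', @dir_suffix) . ')$';
my $escaped = '(?:' . join('|', map { quotemeta } @dir_suffix) . ')$';

for my $dir ('clip.frames', 'clipXframes') {
    printf "%-12s loose=%d escaped=%d\n", $dir,
        ($dir =~ /$loose/   ? 1 : 0),
        ($dir =~ /$escaped/ ? 1 : 0);
}
```

The unescaped pattern accepts both names, while the escaped one accepts only "clip.frames".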
You could even build the patterns into anonymous subroutines to avoid referencing global variables:
my $file_filter = sub { $_ =~ $regex };
my $descend_filter = sub {
! $ignore{$File::Next::dir} &&
$File::Next::dir !~ $dir_regex;   # "! $x =~ $re" would parse as "(! $x) =~ $re"
};
my $iter = File::Next::files({
file_filter => $file_filter,
descend_filter => $descend_filter,
}, $directory);
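Since the goal is to update the lists on the fly, one option is to wrap the construction in a function and call it again whenever the configuration changes. The sketch below assumes a hypothetical helper named make_filters (not part of File::Next) that takes the three lists and returns the two closures. It does not load File::Next itself and only relies on the $File::Next::dir and $_ variables already used above.

```perl
use strict;
use warnings;

# make_filters is a hypothetical helper, not part of File::Next.
# It captures the compiled patterns in closures, so no globals are needed.
sub make_filters {
    my (%args) = @_;
    my %ignore = map { $_ => 1 } @{ $args{dir_names} };
    my $ext    = join '|', map { quotemeta } @{ $args{extensions} };
    my $suffix = join '|', map { quotemeta } @{ $args{dir_suffixes} };
    my $file_regex = qr/\.(?:$ext)$/;
    my $dir_regex  = qr/(?:$suffix)$/;

    return (
        # file filter: true when the file name in $_ has an allowed extension
        sub { $_ =~ $file_regex },
        # descend filter: true when the directory is neither named nor suffixed
        sub { !$ignore{$File::Next::dir} && $File::Next::dir !~ $dir_regex },
    );
}

my ($file_filter, $descend_filter) = make_filters(
    extensions   => [qw(avi flv mp3 mp4 wmv)],
    dir_names    => ['.svn'],
    dir_suffixes => ['.frames'],
);
```

The two closures can be passed as file_filter and descend_filter exactly as in the call above. Note that an empty suffix list would produce a pattern that matches every directory name, so a real script would probably want to guard against empty lists.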