An empty regular expression matches any string

Problem description:

I was trying to find a regex that matches any string. After some searching, I found that almost all the answers say that [\s\S] will match any string, as said here, or .*, as said here.

But while playing a bit with PHP's preg_match, I found that an empty regex matches any string!

if (preg_match("//u", "")) echo "empty string matches\n";
else echo "empty string does not match\n";

if (preg_match("//u", "abc")) echo "abc matches\n";
else echo "abc does not match\n";

if (preg_match("//u", "\n")) echo "new line matches\n";
else echo "new line does not match\n";

if (preg_match("//u", "/")) echo "/ matches\n";
else echo "/ does not match\n";

exit;

This will output:

empty string matches
abc matches
new line matches
/ matches

live demo (https://eval.in/845001)

Can I safely use this empty regex to match anything? And what does an empty regex mean?

If you are asking why I would need a regex that matches anything: I'm using a function that requires a regex parameter as part of its string validation functionality, and I want it to accept anything.


An empty regex pattern // matches at the start, at the end, and at every position between characters in a string. See this demo at eval.in: preg_match_all('//', "foo", $out); returns 4 empty matches:

Array
(
    [0] =>
    [1] =>
    [2] =>
    [3] =>
)
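To make the positions of those four empty matches visible, here is a minimal sketch that adds the PREG_OFFSET_CAPTURE flag to the same call. The variable names $count and $offsets are only illustrative and are not part of the original demo.

```php
<?php
// Collect every match of the empty pattern together with its byte offset.
$count = preg_match_all('//', 'foo', $out, PREG_OFFSET_CAPTURE);

// Each entry of $out[0] is [matched text, offset]; keep only the offsets.
$offsets = array_column($out[0], 1);

echo $count, "\n";                 // 4
echo implode(',', $offsets), "\n"; // 0,1,2,3
```

The offsets run from 0 (before "f") to 3 (after the last "o"), which is one more than the length of the string.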

Since preg_match only checks for the first match, the empty pattern should be fine to use. In general, however, I'd probably prefer /^/, which matches the start of the string, something every string has.

[\s\S] (shorthand for whitespace together with non-whitespace in a character class) means just any character. It is usually used to also match newlines where no flag is available for making the dot match line breaks. It was often used with JS regex, which did not support the s flag before ES2018. Similar are [\D\d] (digits and non-digits) and [\w\W] (word characters and non-word characters). Also possible with JS regex is [^], a negated empty character class meaning "not nothing".

Using /[\s\S]/ or one of the others without a quantifier requires at least one character.
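A short sketch of that difference, and of why [\s\S] is used in place of the dot at all. The subjects and variable names are chosen for illustration only.

```php
<?php
// A bare character class has to consume exactly one character.
$bare    = preg_match('/[\s\S]/', '');    // 0: there is nothing to consume
$starred = preg_match('/[\s\S]*/', '');   // 1: zero repetitions are allowed

// Without the s flag the dot does not match a line break.
$dot     = preg_match('/^.*$/', "a\nb");  // 0: .* cannot cross the "\n"
$dotAll  = preg_match('/^.*$/s', "a\nb"); // 1: the s flag lets the dot match "\n"

var_dump($bare, $starred, $dot, $dotAll);
```

So as an "accept anything" pattern, the character class needs a * quantifier, otherwise the empty string is rejected.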

Also note that your patterns use the u flag for Unicode regex. There is probably no reason to use this flag with an empty pattern or when only checking for the start of the string. What might be interesting with PCRE Unicode regex are the following escape sequences.

Well, I don't really see why one would need a pattern to match any string, but I wrote this out of interest :)
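One behavioural difference of the u flag that may matter when the goal is to accept anything: with u, PHP validates the subject as UTF-8 before matching. The sketch below uses a lone 0xFF byte as an example of invalid input; the variable names are illustrative.

```php
<?php
$invalid = "\xFF"; // a lone 0xFF byte is never valid UTF-8

// Without u the subject is treated as raw bytes, so the empty pattern matches.
$withoutU = preg_match('//', $invalid);  // 1

// With u the subject fails UTF-8 validation and preg_match reports an error.
$withU = preg_match('//u', $invalid);    // false
$error = preg_last_error();              // PREG_BAD_UTF8_ERROR

var_dump($withoutU, $withU, $error === PREG_BAD_UTF8_ERROR);
```

Because false is falsy, a check like if (preg_match("//u", $value)) would treat such input as "does not match", so //u accepts every valid UTF-8 string rather than every byte string.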

Yes, you are correct. Another alternative is /.?/. There are many ways to accept all strings.
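As a rough check, the sketch below runs several of the patterns mentioned in this thread against the four subjects from the question. The $patterns, $subjects and $results names are made up for this example, and [\s\S] is given a * quantifier so that it can accept the empty string.

```php
<?php
// Candidate "accept anything" patterns mentioned in this thread.
$patterns = ['//', '/^/', '/.?/', '/[\s\S]*/'];
// The four subjects tested in the question.
$subjects = ['', 'abc', "\n", '/'];

$results = [];
foreach ($patterns as $pattern) {
    $results[$pattern] = true;
    foreach ($subjects as $subject) {
        // preg_match returns 1 on a match, 0 on no match, false on error.
        if (preg_match($pattern, $subject) !== 1) {
            $results[$pattern] = false;
        }
    }
}

foreach ($results as $pattern => $ok) {
    echo $pattern, ' => ', $ok ? 'accepts all' : 'rejects some', "\n";
}
```

All four patterns accept all four subjects here. /.?/ still matches "\n" because the ? lets it fall back to an empty match when the dot cannot consume the line break.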