How can I use a regular expression to check whether a string contains at least one word from a list?

Problem description:

I have a string and need to check if any of the words in my list are in the string.
My list looks like this:

$keywords = array(
    "l.*ion",
    "test",
    'one',
    'two',
    'three'
);
  1. If I have the string "This is my lion", I need to return true.
  2. If I have the string "This is my lotion", I need to return true.
  3. If I have the string "This is my dandelion", return false.
  4. If I have the string "This is my location", return true.
  5. If I have the string "This is my test", return true.
  6. If I have the string "This is my testing", return false.

This is my code:

$keywords = implode($keywords,"|");
$list= "/\b$keywords\b/i";
$my_string= "This is my testing";
preg_match($list, $my_string, $matches, PREG_OFFSET_CAPTURE);
echo $matches[0][1];

But when I try "This is my testing", it returns a value.
What am I doing wrong? I'm expecting a numerical value if it's true and an error if it's false.

In your current regex, \bl.*ion|test|one|two|three\b, the first \b only affects the first alternative and the last \b only affects the last alternative.
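
To make this concrete, here is a minimal sketch that runs the original pattern against the "testing" string from the question. The variable names `$pattern`, `$subject`, and `$found` are illustrative and not part of the original code.

```php
<?php
// Original pattern: the \b anchors bind only to the first and last alternatives.
$pattern = '/\bl.*ion|test|one|two|three\b/i';
$subject = 'This is my testing';

// The bare "test" alternative has no boundary on either side,
// so it matches the first four letters of "testing".
$found = preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);

var_dump($found);          // int(1)
var_dump($matches[0][0]);  // string(4) "test"
var_dump($matches[0][1]);  // int(11)
```

The offset 11 is the value the question reports seeing for "This is my testing".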

Besides, since you want each keyword to match within a single word, you cannot rely on the .* pattern: . matches any character except a line break, so l.*ion can span several words.

You should use either \S* (to match 0+ non-whitespace chars, that also include punctuation) or \w* (to match 0+ letters, digits, and _).
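
The difference between the three quantified classes can be seen in a short sketch. The sample strings below are made up for illustration and do not come from the question.

```php
<?php
$subject = 'my little dandelion';

// .* crosses the space, so the match runs from "little" to the end of "dandelion".
$dotCount = preg_match('/\bl.*ion\b/i', $subject, $dotMatch);
var_dump($dotCount);     // int(1)
var_dump($dotMatch[0]);  // string(16) "little dandelion"

// \w* stays inside one word, and no single word here starts with "l" and ends with "ion".
$wordCount = preg_match('/\bl\w*ion\b/i', $subject);
var_dump($wordCount);    // int(0)

// \S* also accepts punctuation inside the token, \w* does not.
$nonSpaceHyphen = preg_match('/\bl\S*ion\b/i', 'my l-ion');
$wordHyphen     = preg_match('/\bl\w*ion\b/i', 'my l-ion');
var_dump($nonSpaceHyphen); // int(1)
var_dump($wordHyphen);     // int(0)
```

Which class fits depends on whether tokens such as "l-ion" should count as a single word in your data.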

So, you need to do two things: 1) redefine the $keywords array and 2) wrap the alternatives in a grouping construct when imploding, so that the first and last \b apply to every alternative.

$keywords = array(
    "l\w*ion",     // <-- Here, a `\w` is used instead of .
    "test",
    'one',
    'two',
    'three'
);

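// The (?:...) group makes both \b anchors apply to every alternative.
// The separator goes first: the legacy implode(array, separator) order was removed in PHP 8.0.
$list = "/\b(?:" . implode("|", $keywords) . ")\b/i";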
$my_string= "This is my testing";
if (preg_match($list, $my_string, $matches, PREG_OFFSET_CAPTURE)) {
  echo $matches[0][1];
}

See the PHP demo.

Now, the pattern is \b(?:l\w*ion|test|one|two|three)\b and both \b boundaries apply to every alternative. See this regex demo.
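
As a quick check against the six cases in the question, the grouped pattern can be wrapped in a small helper. This is only a sketch: `findKeyword` is a hypothetical name, and it assumes the keywords are trusted regex fragments that contain no `/` delimiter (so they are deliberately not passed through `preg_quote`).

```php
<?php
// Hypothetical helper: returns the offset of the first keyword match,
// or false when none of the keywords occurs as a whole word.
function findKeyword(array $keywords, string $subject)
{
    $pattern = '/\b(?:' . implode('|', $keywords) . ')\b/i';
    if (preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE) === 1) {
        return $matches[0][1];
    }
    return false;
}

$keywords = ['l\w*ion', 'test', 'one', 'two', 'three'];

$subjects = [
    'This is my lion',
    'This is my lotion',
    'This is my dandelion',
    'This is my location',
    'This is my test',
    'This is my testing',
];

foreach ($subjects as $subject) {
    $offset = findKeyword($keywords, $subject);
    echo $subject, ' => ', $offset === false ? 'false' : "true (offset $offset)", "\n";
}
```

Returning false rather than raising an error keeps the caller's check to a single strict comparison (`=== false`), which also distinguishes "no match" from a match at offset 0.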