PHP regex help - use the nearest word from a list as the boundary, rather than the order in which the list is given

Problem description:

I have a problem with a regular expression. The input string looks like this:

hello world and me or you

and I would like to match everything from hello up to the nearest of the noise words: and, or.

So far I have come up with something like this:

preg_match_all("/^hello[A-Z0-9 -]*(or|and)/is",$string,$match);

but the problem is that it returns hello world and me or instead of hello world and. I assume this is because or comes first in the (or|and) list.

I would really appreciate it if anyone could tell me whether there is a way to make the regex engine pick whichever token from the alternation list is nearest to the start of the match, rather than going by the order given in (or|and). In this case and should be used, as it is closer to the initial pattern.

P.S. Changing the order inside (or|and) is not a solution: there are more words, and you never know which one is nearest, so it has to be handled at the algorithmic level.

Many thanks for your advice.

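A question mark after an asterisk (i.e. /.*?/) makes the quantified expression non-greedy (lazy), so it matches as little as possible. Your regex should therefore be something like /^hello[A-Z0-9 -]*?\b(or|and)\b/is. The \b word boundaries matter here: without them the lazy version stops at the or inside world and returns hello wor.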
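To make the difference concrete, here is a minimal sketch that runs the greedy, lazy, and lazy-with-word-boundary variants against the sample string from the question. It uses preg_match rather than preg_match_all only because the ^ anchor allows a single match; the variable names are illustrative.

```php
<?php
$string = 'hello world and me or you';

// Greedy: [A-Z0-9 -]* consumes as much as it can, then backtracks,
// so the match ends at the LAST stop word in the string.
preg_match('/^hello[A-Z0-9 -]*(or|and)/is', $string, $greedy);

// Lazy without boundaries: ends at the first "or", which sits inside "world".
preg_match('/^hello[A-Z0-9 -]*?(or|and)/is', $string, $lazy);

// Lazy with \b: ends at the first stop word that stands alone as a word.
preg_match('/^hello[A-Z0-9 -]*?\b(or|and)\b/is', $string, $bounded);

echo $greedy[0], "\n";   // hello world and me or
echo $lazy[0], "\n";     // hello wor
echo $bounded[0], "\n";  // hello world and
```

So the behaviour in the question comes from greediness, not from the order of the words inside (or|and).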

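Use capturing subpatterns:

preg_match_all('/^(hello[A-Z0-9 -]*?)\b(or|and)\b/is', $string, $match);

With the default PREG_PATTERN_ORDER flag, $match[0][0] holds the whole match, $match[1][0] the text before the stop word, and $match[2][0] the stop word itself. The lazy *? and the \b word boundaries are needed here as well; without them the first group still runs on to the last stop word.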
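Since the question mentions that the real list has more words, the alternation can be built from an array. The following is a rough sketch under that assumption: the $stopWords list (including "but") is hypothetical, and preg_quote is used so that any regex metacharacters in a word are taken literally.

```php
<?php
// Hypothetical stop-word list; the question itself only names "and" and "or".
$stopWords = ['or', 'but', 'and'];
$string = 'hello world and me or you';

// Escape every word for use inside a /.../ pattern, then join with "|".
$quoted = array_map(function ($word) {
    return preg_quote($word, '/');
}, $stopWords);
$pattern = '/^(hello[A-Z0-9 -]*?)\b(' . implode('|', $quoted) . ')\b/is';

$prefix = null;
$stop = null;
if (preg_match($pattern, $string, $m)) {
    $prefix = rtrim($m[1]); // text before the stop word, trailing space removed
    $stop = $m[2];          // the stop word that was reached first
}

echo $prefix, "\n"; // hello world
echo $stop, "\n";   // and
```

The order of the words in the array does not decide the result here: the lazy quantifier extends one character at a time and tries every alternative at each position before moving on, so the stop word nearest to hello is the one that matches.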