Regex to split a string if it contains wordA or wordB [closed]

Question:

Edited again to try to make it clearer.

Which PHP regex pattern will give me a match array that always contains 2 values: the two parts of a string split by "wordA" or "wordB"? If the string does not contain either word, simply return the whole string as the first value and null as the second.

Example:

preg_match("pattern", "foo wordA bar", $match): $match will contain ['foo', 'bar']
preg_match("pattern", "foo wordB bar", $match): $match will contain ['foo', 'bar']
preg_match("pattern", "foo bar test", $match): $match will contain ['foo bar test', null]

I know that the first element of $match is always the full matched string, so I left it out above.
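For reference, here is a minimal sketch of one pattern that behaves as described, assuming the keyword must be surrounded by whitespace. The helper name splitOnWord is made up for this illustration, and the `?? null` fallback is there because preg_match may omit trailing groups that did not take part in the match.

```php
<?php
// Hypothetical helper (not from the original post): splits $subject on the
// first whitespace-delimited "wordA" or "wordB" and always returns two values.
function splitOnWord(string $subject): array
{
    // Group 1 is lazy; the optional tail consumes the keyword and captures the rest.
    $pattern = '/^(.*?)(?:\s+(?:wordA|wordB)\s+(.*))?$/s';
    preg_match($pattern, $subject, $match);

    // When the optional tail did not match, group 2 is missing, so default to null.
    return [$match[1], $match[2] ?? null];
}

var_dump(splitOnWord('foo wordA bar')); // ['foo', 'bar']
var_dump(splitOnWord('foo bar test'));  // ['foo bar test', null]
```

A keyword at the very start or end of the string is not treated as a separator here, which matches the "in the middle" wording of the question.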

Old question:

I need to split a one-line address into parts. I can't find a way to capture the street part without including the word APP or APT when it is present, and, when it is present, to capture the words after it.

For example:

"5847A, rue Principal APP A" should match: (5847, A, rue Principal, A)

"5847A, rue Prince Arthur APT 22" should match: (5847, A, rue Prince Arthur, 22)

"1111, Sherwood street" should match: (1111, , Sherwood street, )

I'm using PHP.

What I have so far is: /^(\d+)(.*), (.*)(?:APP|APT)(?:\s*(.*))?$/i, which works with examples 1 and 2. If I try to make the alternation (APP|APT) optional by adding a ? after it, then the third group includes the word APP or APT...

Any idea how to exclude the optional APP or APT word from the match?

Thank you

EDIT:

I can simplify the problem: how can I apply a regex to a string so that the match returns the same string minus the word APP or APT, if it is present in the middle of it?

Answer

As @MadaraUchiha pointed out, it's a bad idea to run a regex on addresses, since they can be in any format.

If you know you have consistent addresses, then I guess you can use the regex:

^([0-9]+)([A-Z])?,\s(?:(.*?)\s(?:APP|APT)\s(.*)|(.*))$

And the replacement:

$1,$2,$3$5,$4

Here's how it's performing.

It's pretty similar to yours (I changed a few things); I added an or (|) operator to handle the second type of address, the one without APP or APT.
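As a rough illustration of how this first pattern could be used from PHP, the sketch below runs it through preg_match and merges groups 3 and 5, since only one of them takes part in any given match. The helper name parseAddress is made up for this example; the sample lines are the ones from the question.

```php
<?php
// The pattern from this answer, wrapped in PHP delimiters.
$pattern = '/^([0-9]+)([A-Z])?,\s(?:(.*?)\s(?:APP|APT)\s(.*)|(.*))$/';

// Hypothetical helper: returns [number, letter, street, unit],
// or null when the line does not match at all.
function parseAddress(string $pattern, string $line): ?array
{
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    // Group 3 is the street when APP/APT is present, group 5 when it is not.
    // Trailing groups that did not participate may be absent, hence the ?? ''.
    $street = ($m[3] ?? '') . ($m[5] ?? '');

    return [$m[1], $m[2] ?? '', $street, $m[4] ?? ''];
}

var_dump(parseAddress($pattern, '5847A, rue Principal APP A')); // ['5847', 'A', 'rue Principal', 'A']
var_dump(parseAddress($pattern, '1111, Sherwood street'));      // ['1111', '', 'Sherwood street', '']
```

Merging the two street groups in code plays the same role as the `$3$5` part of the replacement string.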

If you want a consistent number of matches, maybe this?

^([0-9]*)([A-Z]?),((?:(?!\sAPP|\sAPT).)*)(?:\sAPP|\sAPT)?(.*)$

Regex101 example.
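A small sketch of the second pattern in PHP, assuming the captured pieces should be trimmed: the street and unit groups keep the space that precedes them in the input, so the example passes every group through trim. The helper name parseAddressFixed is hypothetical.

```php
<?php
// The "consistent number of matches" pattern from this answer.
$pattern = '/^([0-9]*)([A-Z]?),((?:(?!\sAPP|\sAPT).)*)(?:\sAPP|\sAPT)?(.*)$/';

// Hypothetical helper: always returns four strings on a match, null otherwise.
function parseAddressFixed(string $pattern, string $line): ?array
{
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    // Drop the full match at index 0, then strip the captured leading spaces.
    return array_map('trim', array_slice($m, 1));
}

var_dump(parseAddressFixed($pattern, '5847A, rue Prince Arthur APT 22')); // ['5847', 'A', 'rue Prince Arthur', '22']
var_dump(parseAddressFixed($pattern, '1111, Sherwood street'));           // ['1111', '', 'Sherwood street', '']
```

Because every group can match an empty string, all four groups take part in each match, which is what keeps the array shape stable.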

Answer

For the "easy" version:

 // Group the alternation so the leading space and the word boundary apply to both keywords.
 var_dump(preg_replace('/\s(?:apt|app)\b/i', '', '5847A, rue Prince Arthur APT 22'));

covers it.

That outputs:

5847A, rue Prince Arthur 22

For the harder version, you would need more context, such as why the commas appear the way they do.

The hard version:

([0-9]*)([a-z]?),(((?!app|apt).)*)(?:app|apt)?(.*)

This seems to work on your test cases, assuming the case-insensitive i flag is set.
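A hedged sketch of the hard version in PHP follows. It assumes the i flag, and it skips group 4, which is the inner repeated capture and only ever holds the last character of the street. The helper name parseAddressLoose is made up. Note that this pattern has no word boundaries around app/apt, so a street name that contains those letters is split as well, as the last call shows.

```php
<?php
// The "hard version" pattern from this answer, with the i flag added.
$pattern = '/([0-9]*)([a-z]?),(((?!app|apt).)*)(?:app|apt)?(.*)/i';

// Hypothetical helper: returns [number, letter, street, unit] or null.
function parseAddressLoose(string $pattern, string $line): ?array
{
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    // Group 4 is a leftover of the repeated inner group, so it is ignored.
    return [$m[1], $m[2], trim($m[3]), trim($m[5])];
}

var_dump(parseAddressLoose($pattern, '5847A, rue Principal APP A')); // ['5847', 'A', 'rue Principal', 'A']
var_dump(parseAddressLoose($pattern, '1111, Sherwood street'));      // ['1111', '', 'Sherwood street', '']
var_dump(parseAddressLoose($pattern, '12, Happy street'));           // ['12', '', 'H', 'y street']
```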

Answer

I think this should work:

$pattern = '/\b(?:APP|APT)\b/i'; // group the alternation so \b applies on both sides of each keyword
$subject = "1111, Sherwood street";
echo preg_replace($pattern, "", $subject);
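One thing to keep in mind with this approach: removing only the keyword leaves the spaces on both sides of it behind. The sketch below, with a made-up helper name removeUnitKeyword, removes the whole word APP or APT and then collapses the leftover whitespace. It is an illustration under those assumptions rather than a general address cleaner.

```php
<?php
// Hypothetical helper: removes a standalone APP/APT and tidies the spacing.
function removeUnitKeyword(string $subject): string
{
    // Remove the keyword only when it stands alone as a word.
    $withoutKeyword = preg_replace('/\b(?:APP|APT)\b/i', '', $subject);

    // The removal leaves two adjacent spaces; collapse whitespace runs and trim the ends.
    return trim(preg_replace('/\s{2,}/', ' ', $withoutKeyword));
}

echo removeUnitKeyword('5847A, rue Prince Arthur APT 22'), "\n"; // 5847A, rue Prince Arthur 22
echo removeUnitKeyword('1111, Sherwood street'), "\n";           // 1111, Sherwood street
```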
