Regex to split a string if it contains wordA or wordB [closed]
Edited again to try to make it clearer.
Which PHP regex pattern will give me a match array that always contains 2 values: the two parts of a string split by "wordA" or "wordB"? If the string does not contain either word, simply return the whole string as the first value and null as the second.
Example:
preg_match("pattern", "foo wordA bar", $match): $match will contain ['foo', 'bar']
preg_match("pattern", "foo wordB bar", $match): $match will contain ['foo', 'bar']
preg_match("pattern", "foo bar test", $match): $match will contain ['foo bar test', null]
I know that the first value of $match is always the full matched string, so I just left it out.
OLD question:
I need to split a one-line address into parts. I can't find a way to capture the street part without including the word APP or APT when it is present and, when it is present, to also capture the words after it.
For example:
"5847A, rue Principal APP A" should match: (5847, A, rue Principal, A)
"5847A, rue Prince Arthur APT 22" should match: (5847, A, rue Prince Arthur, 22)
"1111, Sherwood street" should match: (1111, , Sherwood street, )
I'm using PHP.
What I have so far is: /^(\d+)(.*), (.*)(?:APP|APT)(?:\s*(.*))?$/i
which works with examples 1 and 2. If I try to make the alternation (APP|APT) optional by adding a ? after it, then the third group includes the word APP or APT...
Any idea how to exclude the optional APP or APT word from the match?
Thank you
EDIT:
I can simplify the problem: how can I write a regex so that the match returns the same string minus the word APP or APT, if it is present in the middle of it?
As @MadaraUchiha pointed out, it's a bad idea to run a regex on an address since they can be in any format.
If you know you have consistent addresses, then I guess you can use the regex:
^([0-9]+)([A-Z])?,\s(?:(.*?)\s(?:APP|APT)\s(.*)|(.*))$
And the replace:
$1,$2,$3$5,$4
Here's how it's performing.
It's pretty similar to yours (I changed a few things), with an or (|) operator added to handle the second type of address, the one without APP or APT.
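As a rough illustration, the pattern and replacement above can be applied with preg_replace. The function name splitAddress is hypothetical, and the sketch assumes the addresses follow the format of the three examples in the question.

```php
<?php
// Hypothetical helper: rewrites an address as "number,suffix,street,unit".
function splitAddress(string $address): string
{
    // First alternative: street, then APP or APT, then the unit.
    // Second alternative: street only (no APP/APT present).
    $pattern = '/^([0-9]+)([A-Z])?,\s(?:(.*?)\s(?:APP|APT)\s(.*)|(.*))$/';
    // Groups that did not take part in the match are replaced by an empty string.
    return preg_replace($pattern, '$1,$2,$3$5,$4', $address);
}

echo splitAddress("5847A, rue Principal APP A"), "\n";      // 5847,A,rue Principal,A
echo splitAddress("5847A, rue Prince Arthur APT 22"), "\n"; // 5847,A,rue Prince Arthur,22
echo splitAddress("1111, Sherwood street"), "\n";           // 1111,,Sherwood street,
```

The pattern has no i flag, so a lowercase "app" would fall through to the second alternative and stay in the street part.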
If you want a consistent number of captured groups, maybe this?
^([0-9]*)([A-Z]?),((?:(?!\sAPP|\sAPT).)*)(?:\sAPP|\sAPT)?(.*)$
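A minimal sketch of how this second pattern might be used with preg_match. The function name parseAddress is hypothetical. Because the pattern captures the whitespace before the street and before the unit, the sketch trims each group; that trimming is an addition, not part of the pattern itself.

```php
<?php
// Hypothetical helper: returns [number, suffix, street, unit], or null if no match.
function parseAddress(string $address): ?array
{
    // The tempered dot ((?!\sAPP|\sAPT).)* consumes the street up to " APP"/" APT".
    $pattern = '/^([0-9]*)([A-Z]?),((?:(?!\sAPP|\sAPT).)*)(?:\sAPP|\sAPT)?(.*)$/';
    if (preg_match($pattern, $address, $m) !== 1) {
        return null;
    }
    // Drop the full match at index 0 and strip the captured surrounding spaces.
    return array_map('trim', array_slice($m, 1));
}

var_dump(parseAddress("5847A, rue Principal APP A"));
var_dump(parseAddress("1111, Sherwood street"));
```

Every group in this pattern can match an empty string, so a successful match always yields four values, with empty strings where the suffix or unit is missing.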
For the "easy" version, this covers it:
var_dump(preg_replace("/ apt|app /i", "", "5847A, rue Prince Arthur APT 22"));
which outputs:
5847A, rue Prince Arthur 22
For the harder version you would need more context, such as why the commas appear where they do.
The hard version:
([0-9]*)([a-z]?),(((?!app|apt).)*)(?:app|apt)?(.*)
seems to work on your test cases when used with the i (case-insensitive) flag.
I think this should work:
// Group the alternation so that both \b anchors apply to APP and to APT
$pattern = "/\b(?:APP|APT)\b/i";
$subject = "1111, Sherwood street";
echo preg_replace($pattern, "", $subject);
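For the simplified question (always two values, split on "wordA" or "wordB"), one alternative is preg_split rather than preg_match. This is a sketch of a different technique from the ones proposed above, and the function name splitOnWords is hypothetical.

```php
<?php
// Hypothetical helper: splits $subject at the first occurrence of any of $words.
// Always returns exactly two values; the second is null when no word is found.
function splitOnWords(string $subject, array $words): array
{
    // Escape each word so it is matched literally inside the pattern.
    $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
    $pattern = '/\s*\b(?:' . implode('|', $quoted) . ')\b\s*/';
    // A limit of 2 stops after the first separator.
    $parts = preg_split($pattern, $subject, 2);
    return array_pad($parts, 2, null);
}

var_dump(splitOnWords("foo wordA bar", ["wordA", "wordB"])); // ["foo", "bar"]
var_dump(splitOnWords("foo bar test", ["wordA", "wordB"]));  // ["foo bar test", null]
```

Calling it with ["APP", "APT"] gives the street and unit parts of the address tail in the same way.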