Regex noob needs a pattern to match a specific string pattern

Problem description:

Hi, I'm trying to match a search string so that if there is a US zip code at the beginning of the string, the match returns the zip code as well as the rest of the string. This is what I have so far:

$str = "90210 Beverly Hills, CA";
$res = preg_match('/^((\d{5})(-\d{4})?)\s+(.+?)$/', $str, $matches);

But when I print_r the matches, the result has an extra key for the space:

Array
(
    [0] => 90210 Beverly Hills, CA
    [1] => 90210
    [2] => 90210
    [3] =>
    [4] => Beverly Hills, CA
)

Is there any way I can improve the pattern so that it only returns matches that are not empty strings? Is there a better pattern for this case? Also, if just a zip code or just a text string is given, it should return false.

The answer is no. Looking at the parentheses in your pattern, the entry at index 3 (the fourth element of the array) corresponds to the -1234 part (the four extra zip digits; no idea what they're formally called), and in your example that part is absent, so the entry is empty. It is a bit unclear from your question what you actually want to capture. I will leave you with a construct you may find useful:

(?: ...)

If you put ?: right after an opening paren, it indicates that the group should not be captured. You can add it to the "dash-four-digit" subpattern, giving (?:-\d{4})?, and that group will no longer be added to the matches.
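
As a rough sketch of that suggestion in PHP: the pattern below is a simplified variant, not the asker's original. It assumes only the full zip code and the remaining text are wanted, so the inner (\d{5}) group is dropped as well and the optional suffix is made non-capturing.

```php
<?php
// Sketch: the optional "-1234" suffix is wrapped in (?: ... ), so it is
// matched but does not get its own entry in $matches.
$pattern = '/^(\d{5}(?:-\d{4})?)\s+(.+)$/';

$str = "90210 Beverly Hills, CA";
if (preg_match($pattern, $str, $matches) === 1) {
    // $matches[1] is the zip code (with the suffix when present),
    // $matches[2] is the rest of the string.
    print_r($matches);
}

// A zip code alone or text alone does not match; preg_match() returns 0.
var_dump(preg_match($pattern, "90210"));
var_dump(preg_match($pattern, "Beverly Hills, CA"));
```

Note that preg_match() returns 0 rather than false when the pattern does not match, so comparing the result with === 1 is the safer check.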

Your match group 3 is not a space. It is empty (or perhaps undefined or something - I'm not deeply involved with PHP). The empty value means that the optional group (-\d{4}) (US ZIP+4) did not match anything. I imagine you will want to keep your expression as it is so that you can easily detect the case where no ZIP+4 is present.
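
A minimal sketch of that detection, keeping the original pattern unchanged. The function name parse_zip_prefix and the array keys are made up for illustration; they are not from the question.

```php
<?php
// Hypothetical helper: applies the asker's original pattern and reports
// whether the optional ZIP+4 suffix was present.
function parse_zip_prefix(string $str): ?array
{
    if (preg_match('/^((\d{5})(-\d{4})?)\s+(.+?)$/', $str, $m) !== 1) {
        return null; // only a zip code, only text, or no match at all
    }
    return [
        'zip'       => $m[1],         // full zip code, including the suffix if any
        'zip5'      => $m[2],         // the first five digits
        'has_plus4' => $m[3] !== '',  // group 3 is '' when the optional group did not match
        'rest'      => $m[4],         // everything after the whitespace
    ];
}

var_dump(parse_zip_prefix("90210 Beverly Hills, CA"));
var_dump(parse_zip_prefix("90210-1234 Beverly Hills, CA"));
var_dump(parse_zip_prefix("90210"));
```

Group 3 shows up as an empty string here because a later group (group 4) did match; PHP fills the unmatched groups before it with ''.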