PHP regex: explode a string on ',' except inside single and double quotes


Problem description:

How can I explode a string on ',' but not when the comma is inside single or double quotes?

This is the string I want to explode:

`ot_request_id` int(11) NOT NULL,`ot_hours` int(11) NOT NULL,`ot_timelog_id` int(11) NOT NULL,`ot_user_id` int(11) NOT NULL,`ot_filing_date` datetime NOT NULL,`ot_approveby_id` int(11) NOT NULL,`final_approved` int(11) NOT NULL DEFAULT '0' , `ot_token` varchar(100) DEFAULT 'Your, name, here' NOT NULL,`startTime` datetime NOT NULL,`endTime` datetime NOT NULL

I want the output to become:

`ot_request_id` int(11) NOT NULL,
`ot_hours` int(11) NOT NULL,
`ot_timelog_id` int(11) NOT NULL,
`ot_user_id` int(11) NOT NULL,
`ot_filing_date` datetime NOT NULL,
`ot_approveby_id` int(11) NOT NULL,
`final_approved` int(11) NOT NULL DEFAULT '0' ,
`ot_token` varchar(100) DEFAULT 'Your, name, here' NOT NULL,
`startTime` datetime NOT NULL,
`endTime` datetime NOT NULL

Here is my code:

$array = explode(",",$str);// change to Regex
foreach($array as $m){
    echo "<br>$m";
}

Use this regex:

(?<=,)(?=(?:(?:[^`'"]*[`'"]){2})*[^`'"]*$)

It treats ', " and ` as quote characters.


Sample source:

$re = '/(?<=,)(?=(?:(?:[^`\'"]*[`\'"]){2})*[^`\'"]*$)/m';

$str = '`ot_request_id` int(11) NOT NULL `ot_request_id` int(11) NOT NULL,`ot_hours` int(11) NOT NULL,`ot_timelog_id` int(11) NOT NULL,`ot_user_id` int(11) NOT NULL,`ot_filing_date` datetime NOT NULL,`ot_approveby_id` int(11) NOT NULL,`final_approved` int(11) NOT NULL DEFAULT \'0\' ,
`ot_token` varchar(100) DEFAULT \'Your, name, here\' NOT NULL,`startTime` datetime NOT NULL,`endTime` datetime NOT NULL';


$keywords = preg_split($re,$str);
print_r($keywords);

It will output:

Array ( [0] => `ot_request_id` int(11) NOT NULL `ot_request_id` int(11) NOT NULL, [1] => `ot_hours` int(11) NOT NULL, [2] => `ot_timelog_id` int(11) NOT NULL, [3] => `ot_user_id` int(11) NOT NULL, [4] => `ot_filing_date` datetime NOT NULL, [5] => `ot_approveby_id` int(11) NOT NULL, [6] => `final_approved` int(11) NOT NULL DEFAULT '0' , [7] => `ot_token` varchar(100) DEFAULT 'Your, name, here' NOT NULL, [8] => `startTime` datetime NOT NULL, [9] => `endTime` datetime NOT NULL )

I think you're safe assuming ",`" as the delimiter, without too much hassle:

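$array = explode(",`", $str);
foreach ($array as $i => $m) {
    // explode() removes the delimiter, so restore the backtick on every piece except the first
    echo "<br>" . ($i > 0 ? "`" : "") . $m;
}

Note that I add the ` back in the echo, since explode() strips it along with the comma. This will not split where a space sits between the comma and the backtick, as in '0' , `ot_token` in your string.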

It's a matter of identifying the correct pattern-matching logic to achieve the desired result.

Using a regex (explode() does not use regex; it splits on a literal string), we can target only those commas which are followed by `.

$array = preg_split('/,(?= ?`)/', $str);

Note we also allow for an optional space between , and `.
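As a rough, self-contained sketch of the lookahead split above, the following runs it on a shortened version of the string from the question. The variable names and the trim() step are my own additions, and the approach assumes that every column definition starts with a backtick and that no quoted value contains a comma directly followed by a backtick.

```php
<?php
// Shortened sample input based on the string in the question.
$str = "`ot_request_id` int(11) NOT NULL,`ot_hours` int(11) NOT NULL,"
     . "`final_approved` int(11) NOT NULL DEFAULT '0' , "
     . "`ot_token` varchar(100) DEFAULT 'Your, name, here' NOT NULL,"
     . "`endTime` datetime NOT NULL";

// Split on a comma only when an optional space and a backtick follow it.
// The lookahead does not consume the backtick, so it stays in the next piece.
$parts = preg_split('/,(?= ?`)/', $str);

// Pieces around ", `" keep a stray space, so trim them (my addition).
$parts = array_map('trim', $parts);

print_r($parts);
```

Unlike the look-behind pattern in the accepted answer, this consumes the comma itself, so the pieces do not end in a comma. The commas inside 'Your, name, here' are left alone because they are followed by a space and a letter, not by a backtick.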

A bit different from your question about "between quotes": it seems the comma to break on is always before a backtick (or whitespace followed by a backtick).

<?php
$s = "`ot_request_id` int(11) NOT NULL,`ot_hours` int(11) NOT NULL,`ot_timelog_id` int(11) NOT NULL,`ot_user_id` int(11) NOT NULL,`ot_filing_date` datetime NOT NULL,`ot_approveby_id` int(11) NOT NULL,`final_approved` int(11) NOT NULL DEFAULT '0' , `ot_token` varchar(100) DEFAULT 'Your, name, here' NOT NULL,`startTime` datetime NOT NULL,`endTime` datetime NOT NULL";

$b = preg_replace('#,\s*`#', '[BR]`', $s);

$answer = explode('[BR]', $b);
print_r($answer);

It would be easier if the whitespace were not there (before `ot_token`).

Note: preg_split() would avoid the extra step of inserting [BR] only to explode on it in the next line, so it is better to use preg_split() as already answered.

Although this is already answered and accepted, I'll submit my answer anyway (because I worked on it for some time), since it handles some things the accepted answer doesn't.

Using

(?<=,)(?!\R)(?=(?:[^'"]*(['"])(?:\\.|(?!\1).)*\1)*[^'"]*$)

to split your string should do it.

It matches the position after a comma that isn't followed by a line break (the negative look-ahead (?!\R)) and is followed by

  1. a sequence of zero or more characters that doesn't contain a quote

    [^'"]*

  2. a quote character that is captured

    (['"])

  3. zero or more of either an escaped character

    \\.

    or a character that doesn't match the captured quote, tested with a negative look-ahead,

    (?!\1).

    followed by the captured quote itself (\1), which closes the string.

    Steps 1 to 3 are in a non-capturing group that can repeat any number of times (including zero).

  4. finally, a sequence of zero or more characters that doesn't contain a quote, anchored with $ (which matches at the end of the string, or at the end of a line when the m flag is set) to make sure the whole remainder is matched.

    [^'"]*$


Note that this solution assumes opening/closing quotes are balanced in the string (unless in a quoted string).
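To make the walkthrough concrete, here is a minimal sketch that applies the pattern with preg_split() to a shortened version of the string from the question. The nowdoc is only there to avoid PHP-level escaping of the quotes and backslashes; the variable names and the clean-up step are my own assumptions, not part of the pattern.

```php
<?php
// The pattern from this answer, written as a nowdoc so that the quotes
// and backslashes reach PCRE exactly as typed.
$re = <<<'RE'
/(?<=,)(?!\R)(?=(?:[^'"]*(['"])(?:\\.|(?!\1).)*\1)*[^'"]*$)/
RE;

// Shortened sample input based on the string in the question.
$str = "`ot_request_id` int(11) NOT NULL,`ot_hours` int(11) NOT NULL,"
     . "`final_approved` int(11) NOT NULL DEFAULT '0' , "
     . "`ot_token` varchar(100) DEFAULT 'Your, name, here' NOT NULL,"
     . "`endTime` datetime NOT NULL";

// The match is zero-width (just after a comma), so every piece except
// the last one still ends with its comma.
$pieces = preg_split($re, $str);

// Optional clean-up (my addition): strip surrounding spaces and commas.
$columns = array_map(function ($p) {
    return trim($p, " ,");
}, $pieces);

print_r($columns);
```

Note that this pattern ignores backticks entirely: it splits on any comma that is outside single and double quotes, which is enough for this input because no comma appears inside the backticked names.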

The things this handles that the accepted answer doesn't are:

  1. single quoted strings inside a double quoted string, and vice versa.

    e.g. "It's a solution"

  2. escaped characters in a string

    e.g. 'It\'s a solution'
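A small comparison sketch for these two cases follows. The two input strings are made up for illustration (they are not from the question), and the variable names are my own; both patterns are copied from the answers in this thread.

```php
<?php
// Pattern from the accepted answer: requires an even number of quote
// characters between the split position and the end.
$accepted = <<<'RE'
/(?<=,)(?=(?:(?:[^`'"]*[`'"]){2})*[^`'"]*$)/
RE;

// Pattern from this answer: consumes each quoted string as a whole unit.
$robust = <<<'RE'
/(?<=,)(?!\R)(?=(?:[^'"]*(['"])(?:\\.|(?!\1).)*\1)*[^'"]*$)/
RE;

// Hypothetical test inputs, not taken from the question.
$nested  = "x,\"It's a, b\",next";   // x,"It's a, b",next
$escaped = "x,'It\\'s a, b',next";   // x,'It\'s a, b',next

foreach ([$nested, $escaped] as $input) {
    echo $input, "\n";
    echo '  accepted: ', count(preg_split($accepted, $input)), " pieces\n";
    echo '  robust:   ', count(preg_split($robust, $input)), " pieces\n";
}
```

For both inputs the pattern from this answer should give three pieces (splitting after x, and after the closing quote's comma), while the accepted pattern gives two: the lone apostrophe makes the quote count after x, odd, so that split is missed.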

This may be completely unnecessary, but it was in my mindset when I started fiddling with this, so :)