PHP regular expression to match shorttags

Question:

This is close, but is failing to match successive "attributes":

$string = "single attribute [include file=\"bob.txt\"] multiple attributes [another prop=\"val\" attr=\"one\"] no attributes [tag] etc";
preg_match_all('/\[((\w+)((\s(\w+)="([^"]+)"))*)\]/', $string, $matches, PREG_SET_ORDER);
print '<pre>' . print_r($matches, TRUE) . '</pre>';

Gives back the following:

Array
(
    [0] => Array
        (
            [0] => [include file="bob.txt"]
            [1] => include file="bob.txt"
            [2] => include
            [3] =>  file="bob.txt"
            [4] =>  file="bob.txt"
            [5] => file
            [6] => bob.txt
        )

    [1] => Array
        (
            [0] => [another prop="val" attr="one"]
            [1] => another prop="val" attr="one"
            [2] => another
            [3] =>  attr="one"
            [4] =>  attr="one"
            [5] => attr
            [6] => one
        )

    [2] => Array
        (
            [0] => [tag]
            [1] => tag
            [2] => tag
        )

)

Where [2] is the tag name, [5] is the attribute name and [6] is the attribute value.

The failure is on the second node: it catches attr="one" but not prop="val".

TYIA.

(this is only meant for limited, controlled use - not broad distribution - so I don't need to worry about single quotes or escaped double quotes)


Answer:

Unfortunately, a repeated capture group does not collect every repetition: it keeps only the last one it matched, which is why you get attr="one" but not prop="val". Personally, I would use preg_match_all to match the tags themselves (i.e. drop the extra capturing parentheses inside the regex) and then extract the attributes from each match in a loop. Something like this:

$string = "single attribute [include file=\"bob.txt\"] multiple attributes [another prop=\"val\" attr=\"one\"] no attributes [tag] etc";
preg_match_all('/\[\w+(?:\s\w+="[^"]+")*\]/', $string, $matches);
foreach($matches[0] as $m) {
    // $m starts with "[", so match past it and capture the tag name
    preg_match('/^\[(\w+)/', $m, $tagname); $tagname = $tagname[1];
    preg_match_all('/\s(\w+)="([^"]+)"/', $m, $attrs, PREG_SET_ORDER);
    // do something with $tagname and $attrs
}
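
To make the two-pass idea concrete, here is a minimal sketch that collects each tag's attributes into an associative array. The function name parse_shorttags() and the returned array shape are my own assumptions for illustration, not part of the original code. It also captures the raw attribute string in the outer pattern so the inner pass only scans that part.

```php
<?php
// Hypothetical helper (name and return shape are assumptions for illustration).
function parse_shorttags(string $text): array
{
    $tags = [];
    // Outer pass: group 1 = tag name, group 2 = raw attribute string (may be empty).
    preg_match_all('/\[(\w+)((?:\s\w+="[^"]+")*)\]/', $text, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $attrs = [];
        // Inner pass: pull every name="value" pair out of the attribute string.
        preg_match_all('/\s(\w+)="([^"]+)"/', $m[2], $pairs, PREG_SET_ORDER);
        foreach ($pairs as $p) {
            $attrs[$p[1]] = $p[2];
        }
        $tags[] = ['name' => $m[1], 'attrs' => $attrs];
    }
    return $tags;
}

$string = 'single attribute [include file="bob.txt"] multiple attributes '
        . '[another prop="val" attr="one"] no attributes [tag] etc';
$tags = parse_shorttags($string);
print_r($tags);
```

With the sample string, the second entry should carry both prop and attr, and the [tag] entry should have an empty attribute array.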

Note that if you intend to replace the tag with some content, you should use preg_replace_callback like so:

$string = "single attribute [include file=\"bob.txt\"] multiple attributes [another prop=\"val\" attr=\"one\"] no attributes [tag] etc";
$output = preg_replace_callback('/\[\w+(?:\s\w+="[^"]+")*\]/', function($match) {
    $m = $match[0]; // the full tag, e.g. [include file="bob.txt"]
    preg_match('/^\[(\w+)/', $m, $tagname); $tagname = $tagname[1];
    preg_match_all('/\s(\w+)="([^"]+)"/', $m, $attrs, PREG_SET_ORDER);
    $result = ''; // build the replacement from $tagname and $attrs
    return $result;
}, $string); // argument order is pattern, callback, subject
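
As a rough end-to-end illustration of the replacement approach, the sketch below rewrites each tag into a placeholder of the form {name key=value ...}. The function name render_shorttags() and the output format are assumptions chosen only to make the result visible; a real handler would substitute whatever content the tag stands for.

```php
<?php
// Hypothetical renderer (name and output format are assumptions for illustration).
function render_shorttags(string $text): string
{
    $result = preg_replace_callback(
        // Group 1 = tag name, group 2 = raw attribute string (may be empty).
        '/\[(\w+)((?:\s\w+="[^"]+")*)\]/',
        function (array $match): string {
            preg_match_all('/\s(\w+)="([^"]+)"/', $match[2], $pairs, PREG_SET_ORDER);
            $out = $match[1];
            foreach ($pairs as $p) {
                $out .= ' ' . $p[1] . '=' . $p[2];
            }
            return '{' . $out . '}';
        },
        $text
    );
    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo render_shorttags('a [include file="bob.txt"] b [another prop="val" attr="one"] c [tag]');
// prints: a {include file=bob.txt} b {another prop=val attr=one} c {tag}
```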
