Most reliable way to extract a string between two delimiters

Problem description:

I've tried multiple functions to extract whatever is between two strings. The delimiters might contain special characters, which I guess is why none of them worked for me.

My current function:

function between($str, $startTag, $endTag){
    $delimiter = '#';
    $regex = $delimiter . preg_quote($startTag, $delimiter) 
                        . '(.*?)' 
                        . preg_quote($endTag, $delimiter) 
                        . $delimiter 
                        . 's';
    preg_match($regex, $str, $matches);
    return $matches;
}

Example of string:

#{ST@RT}#
Text i want
#{END}#

#{ST@RT}#
Second text i want
#{END}#

How can I improve this function, or is there another solution that would:

  • Support any kind of character, including newlines
  • Extract multiple strings if found

Current behavior: it only returns the first match, and it also returns the match plus the surrounding tags, which is unwanted.

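Use the s modifier (PCRE_DOTALL) so that the . character also matches newlines. Your function already appends it. Note that the m modifier is something else: it only changes where ^ and $ match, and does not affect the . character.

preg_match('/foo.+bar/s', $str);
//                    ^--- this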
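
To make the difference between the modifiers concrete, here is a small sketch. The sample string is made up for illustration and is not from the question.

```php
<?php
// Hypothetical sample input with newlines between "foo" and "bar".
$str = "foo\nmiddle\nbar";

// No modifier: "." stops at a newline, so the pattern cannot match.
$plain = preg_match('/foo.+bar/', $str);

// "m" only changes how ^ and $ behave; "." still stops at a newline.
$multiline = preg_match('/foo.+bar/m', $str);

// "s" (dot-all) lets "." match newlines too, so the pattern matches.
$dotall = preg_match('/foo.+bar/s', $str);

var_dump($plain, $multiline, $dotall); // int(0) int(0) int(1)
```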

Use preg_match_all() to get your multiple strings:

preg_match_all($regex, $str, $matches);
return $matches[1]; // an array of the strings
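
Putting those two lines back into the function from the question might look like the sketch below. The name between_all and the trim step are my own additions, not part of the original code; drop the trim if you need the surrounding whitespace preserved.

```php
<?php
// Sketch: return every substring found between $startTag and $endTag.
// "between_all" is a hypothetical name chosen for this example.
function between_all(string $str, string $startTag, string $endTag): array
{
    $delimiter = '#';
    // preg_quote() escapes special characters in the tags, including
    // the regex delimiter passed as its second argument.
    $regex = $delimiter
        . preg_quote($startTag, $delimiter)
        . '(.*?)'
        . preg_quote($endTag, $delimiter)
        . $delimiter
        . 's'; // dot-all, so "." also matches newlines

    preg_match_all($regex, $str, $matches);

    // $matches[1] holds the first capture group of every match.
    // Trimming is an assumption: it removes the newlines around each text.
    return array_map('trim', $matches[1]);
}

$str = "#{ST@RT}#\nText i want\n#{END}#\n\n#{ST@RT}#\nSecond text i want\n#{END}#";
$result = between_all($str, '#{ST@RT}#', '#{END}#');
print_r($result); // "Text i want" and "Second text i want"
```

If nothing matches, $matches[1] is an empty array, so the function returns an empty array rather than failing.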

Edit:

The reason your current code returns the match plus the surrounding tags is that you're using return $matches. The $matches array has several elements in it. Index 0 is always the entire string that matched the expression, tags included. Indexes 1 and higher are your capture groups. Your expression has only one capture group (the text between the tags), so you want return $matches[1] instead of return $matches.
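
A short illustration of how $matches is laid out, using a simplified pattern and input that I made up for the example. With preg_match_all() in its default ordering, each index holds an array with one entry per match.

```php
<?php
// Hypothetical input: two bracketed words.
$str = '[one] and [two]';

// preg_match(): index 0 is the whole match, index 1 the capture group.
preg_match('/\[(.*?)\]/', $str, $m);
var_dump($m[0]); // string(5) "[one]"
var_dump($m[1]); // string(3) "one"

// preg_match_all(): same indexes, but each one is an array of all matches.
preg_match_all('/\[(.*?)\]/', $str, $all);
print_r($all[0]); // full matches: "[one]", "[two]"
print_r($all[1]); // capture group 1: "one", "two"
```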

You can use preg_match_all to extract multiple strings. Besides that, your code seems simple enough, and simpler is normally faster.