How to truncate part of a string in PHP

Question:

For example, I have text like this:

<p>
Quis vel accusantium libero. Suscipit officiis culpa
<img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7">
libero quia ad.
</p>

I want to check whether the string contains any data:image URI and, if it does, truncate only that part to a maximum of 50 characters, so that the result becomes:

<p>
Quis vel accusantium libero. Suscipit officiis culpa
<img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH...">
libero quia ad.
</p>

I am not sure exactly how to achieve that with preg_replace and a "data:image.+?" pattern.

You can do that in different ways, with preg_match(_all), preg_split, etc.

With preg_replace, it works like this (run it to see the result):

<?php
$text='data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7';
$result=preg_replace('/(?<=data:image.{50}).*/', '', $text);

echo $result;
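
The snippet above is run on a bare data URI. Inside the question's HTML, the greedy `.*` would also consume the closing `">` on the same line. A minimal sketch of one way around that, assuming attribute values are always double-quoted, is to match only non-quote characters:

```php
<?php
// Assumption: attribute values are double-quoted, so '"' marks the end of the data URI.
$html = '<img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7">';

// The lookbehind requires "data:image" plus 50 non-quote characters just before the match;
// [^"]+ then removes only what is left of the attribute value, not the rest of the line.
$result = preg_replace('/(?<=data:image[^"]{50})[^"]+/', '', $html);

echo $result, PHP_EOL;
```

With this variant the closing quote and the `>` of the tag are left in place.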

Problem: parsing a string in PHP

  • Thanks for clarifying your question in the comments. What you seem to want is a general-purpose HTML parser that can make special-case modifications to the HTML markup.
  • Generally speaking, it is not advisable to use regex to parse HTML.
  • If you want a general-purpose tool (and not a quick-and-dirty approach), SO already has a question about Modifying html attributes with PHP that may be closer to what you need.
  • If all you want is a quick-and-dirty approach that removes long base64-encoded data from the src attribute of img tags, you can tokenize the raw HTML string and then run regex replacements on the tokens. That approach becomes painful as soon as you want other modifications, and you may end up re-inventing the wheel when you could have used a real HTML parser to begin with.
  • Nevertheless, the preg_replace solution below does just that: it tokenizes the string, performs the replacements, and returns the entire modified string.
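
For comparison, the parser-based route mentioned in the list above might look roughly like the following sketch. It is not part of the original answer: the function name `truncate_data_uris` and the `$limit` parameter are hypothetical, the input is assumed to be an ASCII fragment with a single root element, and PHP's DOM extension is assumed to be available.

```php
<?php
// Hypothetical helper: shorten data URIs in <img src="..."> using DOMDocument.
function truncate_data_uris(string $html, int $limit = 50): string
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // The flags keep libxml from adding <html>/<body> wrappers and a doctype.
    // Assumption: $html is ASCII and has exactly one root element.
    $doc->loadHTML(trim($html), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        // Only touch data URIs that are longer than the limit.
        if (strpos($src, 'data:image') === 0 && strlen($src) > $limit) {
            $img->setAttribute('src', substr($src, 0, $limit) . '...');
        }
    }

    return $doc->saveHTML();
}

$html = '<p>Suscipit officiis culpa <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"> libero quia ad.</p>';
echo truncate_data_uris($html);
```

Here the limit counts the whole attribute value, including the `data:image` prefix, and other attributes or tags are left untouched because only `src` on `img` elements is rewritten.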

Solution using preg_replace (quick-and-dirty)

<?php

$demostring = '
<p>
Quis vel accusantium libero. Suscipit officiis culpa
<img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7">
libero quia ad.
</p>
';

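// Truncate one token: keep "data:image" plus the next $chars characters.
function ctf0_truncate($vinput, $chars = 50) {
  return preg_replace('/(data:image.{' . (int) $chars . '})(.*)/', '$1', $vinput);
}

// Split on double quotes so that each attribute value becomes its own token,
// truncate every token, then glue the string back together.
function ctf0_parse($text, $chars = 50) {
  if (strpos($text, 'data:image') === false) {
    return $text; // nothing to truncate
  }
  $tokens = explode('"', $text);
  foreach ($tokens as $i => $token) {
    $tokens[$i] = ctf0_truncate($token, $chars);
  }
  return implode('"', $tokens);
}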

$myresult = ctf0_parse($demostring);
print($myresult);

Output result

<p>
Quis vel accusantium libero. Suscipit officiis culpa
<img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALA">
libero quia ad.
</p>

Notes

  • The preg_replace solution above omits one requested element of the question: how to add the '...' ellipsis. For that part, please see other answers on SO, such as here and here.
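
One possible way to fill that gap, sketched under assumptions: the helper name `truncate_with_ellipsis` is hypothetical, attribute values are assumed to be quoted, and the limit is counted after the `data:image` prefix, as in the solution above. The ellipsis is appended only when something was actually cut off.

```php
<?php
// Hypothetical helper (not from the original answer): keep "data:image" plus
// $chars characters of each data URI and append '...' when the rest is dropped.
function truncate_with_ellipsis(string $html, int $chars = 50): string
{
    // [^"\'] stops at the closing quote of the attribute value.
    // The trailing [^"\']+ requires at least one extra character, so URIs that
    // already fit within the limit do not match and stay unchanged.
    $pattern = '/(data:image[^"\']{' . $chars . '})[^"\']+/';
    return preg_replace($pattern, '$1...', $html);
}

$html = '<img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7">';
echo truncate_with_ellipsis($html), PHP_EOL;
```

Because the pattern works on the whole string, no tokenizing step is needed for this variant.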