Handling huge strings in PHP

Problem description:

I have to replace xmlns with ns in my incoming XML in order to fix SimpleXMLElement's xpath() function. Most functions do not have a performance problem, but there always seems to be an overhead as the string grows.

For example, preg_replace on a 2 MB string takes 50 ms to process, even if I limit the number of replacements to 1 and the replacement happens at the very beginning.

If I substr the first few characters and replace only within that part, it is slightly faster, but that is not really what I want.

Is there any PHP method that would perform better for my problem? And if there is no such option, could a simple PHP extension help, one that just does replace => SimpleXMLElement in C?

If you know exactly where the offending "x", "m" and "l" are, you can overwrite them in place with something like $xml[$x_pos] = ' '; $xml[$m_pos] = ' '; $xml[$l_pos] = ' '; to turn them into spaces. Alternatively, overwrite all five characters so that xmlns becomes ns___ (where _ = space). Either way the string length stays the same.
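A minimal sketch of the in-place overwrite, assuming the attribute appears literally as ' xmlns=' and only its first occurrence matters. The sample document and the strpos lookup are assumptions for illustration; the question does not say how the positions are known.

```php
<?php
// Hypothetical sample; in practice $xml would be the incoming 2 MB string.
$xml = '<root xmlns="http://example.com/ns"><item>1</item></root>';

// Locate the first ' xmlns=' (assumed spelling of the attribute).
$pos = strpos($xml, ' xmlns=');
if ($pos !== false) {
    // Overwrite "x", "m" and "l" with spaces via string offsets.
    // The length of the string does not change.
    $xml[$pos + 1] = ' ';
    $xml[$pos + 2] = ' ';
    $xml[$pos + 3] = ' ';
}

echo $xml, "\n";
```

The extra spaces before ns are harmless, since XML allows any amount of whitespace between a tag name and its attributes.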

You're always going to get some overhead when doing this: you're dealing with a char array and trying to replace multiple matching elements of that array (i.e. words).

50ms is not much of an overhead, unless (as I suspect) you're trying to do this in a loop?

50ms sounds pretty reasonable to me, for something like this. The requirement itself smells of something being wrong.

Is there any particular reason that you're using regular expressions? Why do people keep jumping to the overkill regex solution?

There is a bog-standard string replace function called str_replace that may do what you want in a fraction of the time (though whether this is right for you depends on how complex your search/replace is).
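One caveat: str_replace has no limit argument, so it always replaces every occurrence. If only the first match should change, a plain strpos plus substr_replace does it without a regex. This is a rough sketch; replaceFirst is a hypothetical helper name and the sample XML is made up.

```php
<?php
// Hypothetical helper: replace only the first occurrence of $search.
function replaceFirst(string $haystack, string $search, string $replace): string
{
    $pos = strpos($haystack, $search);
    if ($pos === false) {
        return $haystack; // nothing to replace
    }
    // Swap strlen($search) bytes starting at $pos for $replace.
    return substr_replace($haystack, $replace, $pos, strlen($search));
}

// Hypothetical sample; the real input would be the large XML string.
$xml = '<root xmlns="http://example.com/ns"><a xmlns="urn:other"/></root>';
$fixed = replaceFirst($xml, ' xmlns=', ' ns=');

echo $fixed, "\n";
```

The result can then be passed to new SimpleXMLElement($fixed) as before. Whether this is measurably faster than preg_replace on a 2 MB string would need to be timed on the real data.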

Look at the PHP source, for example here: http://svn.php.net/repository/php/php-src/branches/PHP_5_2/ext/standard/string.c

I don't see any copies, but I'm no expert in C. On the other hand, there are many convert-to-string calls in there, which at first sight could copy values. If they do copy values, then we are in trouble here.

Only if we are in trouble: try to reinvent the str_replace wheel using character-by-character processing. For example, take the string $somestring = "somevalue". In PHP we can access its characters by index: echo $somestring[0] gives "s" and echo $somestring[2] gives "m" (the older curly-brace form $somestring{0} was deprecated in PHP 7.4 and removed in PHP 8.0). I'm not sure about this approach, but it is an option if the official implementations don't use references, as they should.