Remove all punctuation from a PHP string for a friendly SEO URL

Problem description:

So, I've seen a ton of "solutions" on this site, but none of them seem to work fully for me. I would like to strip all punctuation from a post name so that the system can dynamically create urls for each post. I found an article by David Walsh that provides a step by step tutorial on how this can be achieved. However, not everything gets stripped. Here is a link to the article (just in case): http://davidwalsh.name/php-seo.

Here's the code I've altered to remove all punctuation:

$return = trim(preg_replace('/[^a-z0-9]+/i'," ", strtolower($post_name)));

Here's an example post name: Testing's, this & more!

Results when I echo the url: testing-039-s-this-amp-more.php

I'm not sure why it's keeping the html code for the ampersand and the single quote. Any ideas?!?

Looks like the data is run through htmlspecialchars() or htmlentities() somewhere. Undo that with htmlspecialchars_decode() or html_entity_decode() first:

$return = trim(preg_replace('/[^a-z0-9]+/i'," ", strtolower(htmlspecialchars_decode($post_name))));
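
One caveat that may matter here: before PHP 8.1, htmlspecialchars_decode() and html_entity_decode() default to ENT_COMPAT, which leaves &#039; untouched, so the "039" could still show up unless ENT_QUOTES is passed. Below is a minimal sketch of the whole step under a few assumptions: the helper name slugify() is hypothetical, the input is UTF-8, and a hyphen is used as the separator to match the hyphenated URL shown in the question.

```php
<?php
// Hypothetical helper; the name slugify() does not come from the original post.
function slugify(string $postName): string
{
    // Undo an earlier htmlspecialchars()/htmlentities() call. ENT_QUOTES is
    // passed explicitly so &#039; is also decoded on PHP versions before 8.1.
    $decoded = html_entity_decode($postName, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse every run of non-alphanumeric characters into a single hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($decoded));

    // Drop hyphens left over at the start or end of the string.
    return trim($slug, '-');
}

echo slugify('Testing&#039;s, this &amp; more!'), ".php\n";
// prints "testing-s-this-more.php"
```

If the post name is only ever escaped for output, it may be simpler to build the URL from the raw value before it is escaped, in which case the decode step is not needed at all.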