Is Sphinx (or another third-party search engine) suitable for my situation, or should I build my own search engine?

Question:

I am building a search function for the classifieds on my website. Here are some of the criteria I need to meet:

  • When searching for 'bmw 520', only matches where these two words appear in exactly this order are returned, not matches for only 'bmw' or only '520'.

  • When searching for 'bmw 330ci', results as above are returned, but both with and without the 'ci' extension. Car models have a number of such extensions (i, ci, si, fi, etc.).

  • The minus sign should exclude all results containing the word after the sign, e.g. 'bmw -330' returns all 'bmw' results except those containing '330'. (A NOT operator instead of the minus sign is also fine.)

  • All accented special characters like 'é' are converted to their simple equivalents, in this case 'e'.

  • A list of words to ignore completely in the search string.

Would I need Sphinx, or should I write this in a PHP file?

What do you suggest I do?

Thanks

I think that Sphinx matches all of your criteria.

I think Sphinx is a pretty good match for what you want to do, but some things won't happen automatically:

  • To match on two words together exactly, you either need to use the phrase match mode, or group the words in double-quotes while using the extended match mode.

  • This is the tricky one: unless you define explicit exceptions, I don't think you can index 330ci as both '330 ci' and '330ci'.

  • As long as you're using boolean or extended match modes, then the minus sign works as you'd like.

  • 'Special' characters can be converted to standard ASCII, but this doesn't happen by default. You need to set up your charset_table value. This blog post is aimed at Thinking Sphinx (a Ruby plugin for Sphinx), but the setting value is just passed straight through to Sphinx.

  • You can only ignore specific words on a per-query basis if you've got at least one other word in the query (that is: "-foo" will fail for Sphinx, but "foo -bar" is fine). It's worth noting that you can choose to not index specific words.
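To make the points above concrete, here is a minimal sketch of how an application could turn the raw search string into a Sphinx extended-syntax query before sending it. It assumes the extended match mode is in use, and it handles the model suffixes on the query side (by OR-ing two phrases) rather than through index-time exceptions, which is a swapped-in approach and not something the answer prescribes. The names `build_query`, `MODEL_SUFFIXES` and `STOPWORDS` are hypothetical, and the accent folding is done in application code here instead of via `charset_table`.

```python
import re
import unicodedata

# Hypothetical values for illustration; adjust to your own data.
MODEL_SUFFIXES = ("ci", "si", "fi", "i")
STOPWORDS = {"the", "a", "for"}


def fold_accents(text):
    """Replace accented letters with their base letters (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_suffix(word):
    """Return '330' for '330ci'; return the word unchanged otherwise."""
    match = re.fullmatch(r"(\d+)([a-z]+)", word)
    if match and match.group(2) in MODEL_SUFFIXES:
        return match.group(1)
    return word


def build_query(user_input):
    """Translate a raw search string into Sphinx extended query syntax."""
    required, excluded = [], []
    for token in fold_accents(user_input).lower().split():
        negated = token.startswith("-")
        # Drop punctuation so user input cannot inject query operators.
        word = re.sub(r"\W+", "", token)
        if not word:
            continue
        if negated:
            excluded.append(word)
        elif word not in STOPWORDS:
            required.append(word)
    if not required:
        # A query made only of exclusions ("-foo") fails in Sphinx.
        raise ValueError("need at least one non-excluded word")

    # Double quotes request an exact phrase match.
    phrases = ['"' + " ".join(required) + '"']
    stripped = [strip_suffix(w) for w in required]
    if stripped != required:
        # Also accept the same phrase without the model suffix.
        phrases.append('"' + " ".join(stripped) + '"')

    query = phrases[0] if len(phrases) == 1 else "(" + " | ".join(phrases) + ")"
    return " ".join([query] + ["-" + w for w in excluded])
```

For example, `build_query("bmw 330ci")` produces `("bmw 330ci" | "bmw 330")` and `build_query("bmw -330")` produces `"bmw" -330`. If you would rather not preprocess queries, the index-side settings mentioned above (`charset_table`, plus not indexing specific words) cover the accent and ignored-word cases inside Sphinx itself.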