How to treat 50 and Fifty as equal in a Lucene search in C#.NET

Problem description:

Hi Guys!!

I'm facing a problem: when a user searches for a number, whether typed as digits or spelled out as a word, the search should treat both forms the same.

Suppose you want to search for a book named "10 things to improve your skills". Whether the user enters 10 or Ten, the same results should be shown in both cases, and other books whose names contain 10 or Ten should be shown as well.

So please give me your valuable response ASAP.

Thanks in Advance!!

What I have tried:

I wrote a method that converts numbers into words.
If you enter 50, it returns Fifty, but I don't have any logic that takes Fifty and returns 50.

So I'm stuck at this point.

Surely all you need is the reverse of the function you already have. You need to store the words for

"one" through "nine"
"eleven" through "nineteen"
"ten", "twenty", ... "ninety"
(and possibly "hundred", "thousand", "million")

possibly in a dictionary, with their integer values. Then it becomes a matter of matching the input text and adding up the numeric values of the words, i.e.

'fifty one' ==> fifty -> 50 + one -> 1 = return int 51

I can think of variants on this, but this is simple enough.
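
The dictionary idea above could be sketched roughly as follows. This is only a minimal illustration, not a complete parser: the class name `NumberWords` and the method `TryParse` are made up for this example, and it assumes English words separated by spaces or hyphens, with "and" ignored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Lucene): turns spelled-out numbers into an int.
public static class NumberWords
{
    // Values for "one".."nineteen" and the tens, as listed in the answer.
    private static readonly Dictionary<string, int> Small =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
        {
            ["one"] = 1, ["two"] = 2, ["three"] = 3, ["four"] = 4, ["five"] = 5,
            ["six"] = 6, ["seven"] = 7, ["eight"] = 8, ["nine"] = 9, ["ten"] = 10,
            ["eleven"] = 11, ["twelve"] = 12, ["thirteen"] = 13, ["fourteen"] = 14,
            ["fifteen"] = 15, ["sixteen"] = 16, ["seventeen"] = 17,
            ["eighteen"] = 18, ["nineteen"] = 19,
            ["twenty"] = 20, ["thirty"] = 30, ["forty"] = 40, ["fifty"] = 50,
            ["sixty"] = 60, ["seventy"] = 70, ["eighty"] = 80, ["ninety"] = 90
        };

    // Words that multiply the group collected so far.
    private static readonly Dictionary<string, int> Scales =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
        {
            ["thousand"] = 1000, ["million"] = 1000000
        };

    public static bool TryParse(string text, out int value)
    {
        value = 0;
        if (text == null) return false;

        int total = 0;      // completed thousand/million groups
        int current = 0;    // group currently being added up
        bool sawNumberWord = false;

        var words = text.Split(new[] { ' ', '-' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            if (word.Equals("and", StringComparison.OrdinalIgnoreCase)) continue;

            if (Small.TryGetValue(word, out int n))
            {
                current += n;                               // fifty -> 50, one -> +1
            }
            else if (word.Equals("hundred", StringComparison.OrdinalIgnoreCase))
            {
                current = (current == 0 ? 1 : current) * 100;
            }
            else if (Scales.TryGetValue(word, out int scale))
            {
                total += (current == 0 ? 1 : current) * scale;
                current = 0;
            }
            else
            {
                return false;                               // not a number word
            }
            sawNumberWord = true;
        }

        if (!sawNumberWord) return false;
        value = total + current;
        return true;
    }
}
```

With this sketch, `NumberWords.TryParse("fifty one", out int n)` returns true with `n == 51`, and "one hundred and one" gives 101. Anything containing a non-number word is rejected, so a caller would have to decide how to handle mixed text such as a full book title.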


You would have to write a custom analyzer that converts all numbers (50) in a title into their spelled-out form (fifty), or the other way around (though I think that would be a bit more difficult). You would have to use this analyzer both for indexing and for querying, so that the converted text is the same in the index and in the search term.

Google for "lucene custom analyzer" and you will find plenty of samples showing how to do this.

However, I don't think it would give a 100% satisfactory solution, because a title may already contain a spelled-out number in a form slightly different from what your analyzer produces when someone enters the number as digits in a query, e.g. 101 could be "one hundred and one" or "onehundredandone" or "hundredone".

That's why I would suggest educating your users to simply leave numbers out of the search term altogether. A book titled "10 things to improve your skills" will (should) also be found by searching for "things to improve your skills", and then a book titled "101 things to improve your skills" will show up as well, which is probably in the user's best interest.
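
To make the "same conversion on both sides" point concrete, here is a library-independent sketch of the normalization step. It deliberately does not use the Lucene API; in a real setup this logic would presumably sit inside a token filter of the custom analyzer. `TitleNormalizer`, `ToWords` and `Normalize` are hypothetical names, `ToWords` is a stand-in for the number-to-words method the asker already has (here limited to 0..999), and dropping "and" is an assumption made so that "one hundred and one" and "101" end up identical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical sketch: the same Normalize call is applied to titles at
// index time and to the user's search text at query time.
public static class TitleNormalizer
{
    private static readonly string[] Units =
    {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"
    };

    private static readonly string[] Tens =
    {
        "", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"
    };

    // Stand-in for an existing number-to-words method; supports 0..999 only.
    public static IEnumerable<string> ToWords(int n)
    {
        if (n >= 100)
        {
            yield return Units[n / 100];
            yield return "hundred";
            n %= 100;
            if (n == 0) yield break;
        }
        if (n >= 20)
        {
            yield return Tens[n / 10];
            n %= 10;
            if (n == 0) yield break;
        }
        yield return Units[n];
    }

    // Lower-cases the text, splits it into tokens, and replaces digit-only
    // tokens (0..999) with their spelled-out words.
    public static List<string> Normalize(string text)
    {
        var tokens = new List<string>();
        var separators = new[] { ' ', '-', ',', '.', ':', ';', '!', '?', '"' };
        var parts = text.ToLowerInvariant()
                        .Split(separators, StringSplitOptions.RemoveEmptyEntries);

        foreach (var raw in parts)
        {
            // NumberStyles.None accepts plain digits only (no sign, no spaces).
            if (int.TryParse(raw, NumberStyles.None, CultureInfo.InvariantCulture, out int n)
                && n <= 999)
            {
                tokens.AddRange(ToWords(n));
            }
            else if (raw != "and")   // assumption: "and" is dropped everywhere
            {
                tokens.Add(raw);
            }
        }
        return tokens;
    }
}
```

Under these assumptions, `Normalize("10 things to improve your skills")` and `Normalize("Ten things to improve your skills")` produce the same token list, so a query for either form would match the same indexed title. A title containing a run-together spelling such as "onehundredandone" would still not match.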