What is the best way to do approximate search matching in MySQL / PHP?

Question:

I'm looking for an effective way to do the following search in MySQL/PHP...

Imagine I have a number of fields in my DB I wish to search on:

User.username
Name.first_name
Address.line1
Phone.number
Email.email_address

I also have the following variables (with example data) in PHP to search with:

$username = "john123";
$name = "john";
$address = "10 fake street";
$phone = "23456789";
$email = "john@johnsemail.com";

Assuming there are 0 complete matches, how would I write a query which could see partial matches and then return results ordered by the number of matches?

For example, using my example data I'd expect to see a result that looks something like this:

username | name | address        | phone    | email               | matches
----------------------------------------------------------------------------
john123  | john | 12 new street  | 23456789 | john@johnsemail.com | 4
tim123   | tim  | 10 fake street | 23456789 | tim@timsemail.com   | 2

Just to note, I'm not looking for a wildcard search here. I want to return rows in which individual fields match exactly, even if not every DB field matches. I also want to prioritize the results by the number of matching fields.

I can think of a very inefficient way of doing it by running each as a separate query, loading that into a PHP array then counting which IDs are found in the most arrays. However, the database running this has millions of records per table, so this wouldn't be feasible at all.

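Answer:

-- Score each row by how many fields match exactly, then rank by that score.
-- :username etc. are prepared-statement placeholders bound from the PHP variables,
-- which avoids interpolating user input directly into the SQL string.
SELECT *,
  IF(username = :username, 1, 0) +
  IF(name     = :name,     1, 0) +
  IF(phone    = :phone,    1, 0) +
  IF(address  = :address,  1, 0) +
  IF(email    = :email,    1, 0) AS matches
FROM   tab
ORDER BY matches DESC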
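
To make the scoring query above concrete, here is a minimal PHP sketch of how it might be run through PDO. It assumes, as the query does, a single table `tab` with the columns `username`, `name`, `address`, `phone` and `email`; the function name `findRankedMatches` is hypothetical, and the demo uses an in-memory SQLite database only so that it runs without a MySQL server. It also drops rows with zero matches, which the original query does not do. `CASE WHEN ... THEN 1 ELSE 0 END` stands in for `IF(..., 1, 0)` because it behaves the same way in MySQL and is accepted by SQLite too.

```php
<?php
declare(strict_types=1);

/**
 * Return rows that match at least one criterion exactly,
 * ranked by how many criteria they match.
 * Table `tab` and its column names are assumptions for illustration.
 */
function findRankedMatches(PDO $pdo, array $criteria): array
{
    // Column names cannot be bound as parameters, so check them against a whitelist.
    $allowed = ['username', 'name', 'address', 'phone', 'email'];

    $scoreTerms = [];
    $params = [];
    foreach ($criteria as $column => $value) {
        if (!in_array($column, $allowed, true)) {
            throw new InvalidArgumentException("Unknown column: $column");
        }
        // Each term contributes 1 on an exact match and 0 otherwise (including NULL).
        $scoreTerms[] = "CASE WHEN $column = ? THEN 1 ELSE 0 END";
        $params[] = $value;
    }
    if ($scoreTerms === []) {
        return [];
    }

    // The derived table lets the outer query filter on the computed score.
    $sql = 'SELECT * FROM ('
         . 'SELECT username, name, address, phone, email, '
         . implode(' + ', $scoreTerms) . ' AS matches FROM tab'
         . ') AS scored WHERE matches > 0 ORDER BY matches DESC';

    $stmt = $pdo->prepare($sql);
    $stmt->execute($params);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Demo data: the two rows from the question plus one row that matches nothing.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE tab (username TEXT, name TEXT, address TEXT, phone TEXT, email TEXT)');
$insert = $pdo->prepare('INSERT INTO tab VALUES (?, ?, ?, ?, ?)');
$insert->execute(['john123', 'john', '12 new street', '23456789', 'john@johnsemail.com']);
$insert->execute(['tim123', 'tim', '10 fake street', '23456789', 'tim@timsemail.com']);
$insert->execute(['amy456', 'amy', '1 other road', '11111111', 'amy@amysemail.com']);

$rows = findRankedMatches($pdo, [
    'username' => 'john123',
    'name'     => 'john',
    'address'  => '10 fake street',
    'phone'    => '23456789',
    'email'    => 'john@johnsemail.com',
]);

foreach ($rows as $row) {
    echo $row['username'], ' ', $row['matches'], PHP_EOL;
}
// prints:
// john123 4
// tim123 2
```

Note that computing the score still touches every row of the table. On tables with millions of records it would probably be worth indexing the searched columns and narrowing the candidate rows first (for example with an `OR` of the same equality tests in a `WHERE` clause), but whether that helps depends on the real schema and should be checked with `EXPLAIN`.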