How do I find the number of matching words in two strings?

Question:

I have two strings:

$var_x = "Depending structure";
$var_y = "Depending on the structure of your array ";

Can you please tell me how I can find out how many of the words in var_x are also in var_y? To do that, I did the following:

$pieces1 = explode(" ", $var_x);
$pieces2 = explode(" ", $var_y);
$result=array_intersect($pieces1, $pieces2);
//Print result here?

But this didn't show how many of the words in var_x are in var_y.

Using explode() to split a string into words is the wrong approach. Real-world text is not perfect, and you can't be sure that every word will be separated by a single space.

Consider the following strings:

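  • "This is a test sentence" - 5 words from explode()
  • "This is a test sentence. Not a word." - 8 words from explode(), but you get "sentence." (including the period) as a word.
  • "This is a test\n\nsentence" (line breaks instead of a space before "sentence") - 4 words from explode(), because "test\n\nsentence" is treated as a single word.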
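The three cases above can be checked with a short sketch. The $samples array and the variable names are only illustrative; explode() and str_word_count() are the standard PHP functions.

```php
<?php
// Illustrative sample strings matching the three cases listed above.
$samples = [
    "This is a test sentence",
    "This is a test sentence. Not a word.",
    "This is a test\n\nsentence",
];

foreach ($samples as $text) {
    // explode() splits only on the single space character.
    $byExplode = explode(" ", $text);
    // str_word_count() with format 1 returns an array of the words it finds.
    $byWordCount = str_word_count($text, 1);

    printf(
        "explode: %d, str_word_count: %d\n",
        count($byExplode),
        count($byWordCount)
    );
}
// explode: 5, str_word_count: 5
// explode: 8, str_word_count: 8
// explode: 4, str_word_count: 5
```

In the second case the counts agree, but explode() keeps the punctuation ("sentence.", "word."), so those elements would not match "sentence" or "word" in another string.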

The examples above just show that explode() is not a reliable way to split text into words. Use str_word_count() instead:

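$var_x = "Depending structure";
$var_y = "Depending on the structure of your array ";
// Format 1 makes str_word_count() return an array of the words found.
$pieces1 = str_word_count($var_x, 1);
$pieces2 = str_word_count($var_y, 1);
// Drop duplicate words, then keep only the words present in both arrays.
$result = array_intersect(array_unique($pieces1), array_unique($pieces2));
print count($result);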

This prints 2, and you will see that your explode() approach gives the same value for this input. In more complex cases, though, the method above still gives the correct word count. (Also note the use of array_unique(), which keeps a repeated word from being counted more than once.)
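To see why array_unique() matters, here is a minimal sketch with a repeated word in the first string. The helper name countSharedWords() and the sample strings are hypothetical and not part of the original answer. Note that array_intersect() compares words case-sensitively.

```php
<?php
// Hypothetical helper wrapping the approach from the answer above.
function countSharedWords(string $a, string $b): int
{
    $wordsA = array_unique(str_word_count($a, 1));
    $wordsB = array_unique(str_word_count($b, 1));

    // array_intersect() keeps every value of $wordsA that also occurs in $wordsB.
    return count(array_intersect($wordsA, $wordsB));
}

// Illustrative input: "structure" appears twice in the first string.
$x = "structure structure Depending";
$y = "Depending on the structure of your array";

// Without array_unique(), the repeated "structure" is counted twice.
$naive = count(array_intersect(str_word_count($x, 1), str_word_count($y, 1)));

echo $naive, "\n";                   // 3
echo countSharedWords($x, $y), "\n"; // 2
```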