Finding common words that occur in multiple strings

Problem description:

I am trying to figure out a way to compare two strings and return their common words. The strings are always lowercase, and I want to write a function for this. For example:

str1 = "this is a test"
str2 = "saldkasl test asdasd"

result = stringcompare(str1, str2) 'returns "test"

The common word between the two strings should be "test". If the two strings have two or more common words, the function should concatenate them, separated by spaces:

str1 = "this is another test"
str2 = "another asdsada test asdsa"
result = stringcompare(str1, str2) ' returns "another test"

I found a useful link that gave me an idea, but something is still missing.

The pseudocode for what I'm doing right now is this:

'1st: split each string on spaces (" ") and store the words in an array or list
'2nd: compare the items of the two lists; if two items are equal, append the word to the variable 'result'

Is this okay? I think it is slow, and maybe someone out there has a better approach. Thanks.

As measured in VS 2013, the solution below is on average 20% faster than Guffa's:

Dim str1 As String = "this is another test"
Dim str2 As String = "another asdsada test asdsa"
Dim result As String = String.Join(" ", str1.Split(" "c).
                              Intersect(str2.Split(" "c)))
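
If you want this as a reusable function like the stringcompare from the question, a minimal sketch could look like the following. The module name and the use of StringSplitOptions.RemoveEmptyEntries (to ignore repeated spaces) are my own assumptions and not part of the measured snippet above. Intersect returns each common word once, in the order it appears in the first string.

```vbnet
Imports System
Imports System.Linq

Module CommonWordsSketch
    ' Hypothetical wrapper around the Split/Intersect approach.
    ' Returns the words found in both strings, joined by single spaces.
    Function StringCompare(str1 As String, str2 As String) As String
        Dim separators As Char() = {" "c}
        ' Assumption: skip empty entries so double spaces do not produce empty "words".
        Dim words1 = str1.Split(separators, StringSplitOptions.RemoveEmptyEntries)
        Dim words2 = str2.Split(separators, StringSplitOptions.RemoveEmptyEntries)
        ' Intersect keeps distinct words in the order they occur in words1.
        Return String.Join(" ", words1.Intersect(words2))
    End Function

    Sub Main()
        Console.WriteLine(StringCompare("this is a test", "saldkasl test asdasd"))             ' prints "test"
        Console.WriteLine(StringCompare("this is another test", "another asdsada test asdsa")) ' prints "another test"
    End Sub
End Module
```

If no word is shared, the function returns an empty string.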

Results were obtained by running each solution 100,000 times in a loop and measuring the elapsed time with Stopwatch.
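
For reference, a timing loop of this kind can be sketched roughly as below. This is not the exact harness I used, only an illustration under the same assumptions (100,000 iterations, System.Diagnostics.Stopwatch); the module name is made up, and absolute timings will vary by machine and build configuration.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Linq

Module TimingSketch
    Sub Main()
        Dim str1 As String = "this is another test"
        Dim str2 As String = "another asdsada test asdsa"
        Dim result As String = ""

        ' Time 100,000 runs of the Split/Intersect solution.
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To 100000
            result = String.Join(" ", str1.Split(" "c).Intersect(str2.Split(" "c)))
        Next
        sw.Stop()

        Console.WriteLine("Result: " & result)
        Console.WriteLine("Elapsed ms: " & sw.ElapsedMilliseconds.ToString())
    End Sub
End Module
```

To compare two solutions, run the same loop for each one, preferably in a Release build and more than once, since the first pass includes JIT compilation.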