MATLAB regular expression: words that start and end with a space, '\<\s.*\s\>'
In MATLAB, I want to find words that both start and end with a space, using '\<\s.*\s\>'
Commands:
str = 'A body or collection of such stories s@@5%%suchstro end';
regexp(str, '\<\s.*\s\>', 'match')
The result is empty: nothing is returned.
However, the same commands in Octave return: ' body or collection of such stories s@@5%%suchstro '
'\<\s.*?\s\>'
also works in Octave, but not in MATLAB.
Any ideas? Thanks.
\<\s.*?\s\>
reads as: beginning of word, whitespace, anything, whitespace, end of word. But a word cannot begin with whitespace: \< only matches directly before a word character, so \<\s can never succeed and the pattern does not match anything.
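A quick way to check this reading is to test the two orderings of the anchor and the whitespace on their own. This is only a sketch: the variable names are made up for illustration, and the "empty" result is what MATLAB is expected to give based on the behaviour reported in the question (Octave evidently treats \< differently).

```matlab
str = 'A body or collection of such stories s@@5%%suchstro end';

% '\<' has to sit directly before a word character, so requiring
% whitespace immediately after it should never succeed in MATLAB.
startThenSpace = regexp(str, '\<\s', 'match');

% Swapping the order asks for whitespace followed by a word start,
% which happens before every word except the first one.
spaceThenStart = regexp(str, '\s\<', 'match');

disp(numel(startThenSpace));   % number of matches for '\<\s'
disp(numel(spaceThenStart));   % number of matches for '\s\<'
```

If the first count is zero and the second is not, the anchor order is the whole problem, not the .* in the middle.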
The pattern \s\<.*\>\s
returns
` body or collection of such stories s@@5%%suchstro `
which is probably not what you wanted. This is not a collection of words but everything together, because the match is greedy. Make it lazy:
regexp(str, '\s\<.*?\>\s', 'match')
' body ' ' collection ' ' such ' ' s@@5%%suchstro '
Every other word is missing because each match consumes its trailing space, which leaves no leading space for the next word. Also, you don't want to capture those spaces, do you? Use lookbehind and lookahead for them:
regexp(str, '(?<=\s)\<.*?\>(?=\s)', 'match')
'body' 'or' 'collection' 'of' 'such' 'stories' 's@@5%%suchstro'
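To see the effect of the lookarounds side by side, the two lazy patterns can be run on the same string and compared. This is a minimal sketch; the variable names consuming and lookaround are chosen here for illustration only.

```matlab
str = 'A body or collection of such stories s@@5%%suchstro end';

% The spaces are part of each match, so the space after one word is
% used up and cannot serve as the leading space of the next word.
consuming = regexp(str, '\s\<.*?\>\s', 'match');

% Lookbehind and lookahead only test for the spaces without consuming
% them, so adjacent words can share the space between them.
lookaround = regexp(str, '(?<=\s)\<.*?\>(?=\s)', 'match');

disp(strtrim(consuming));   % trimmed only to make the comparison easier
disp(lookaround);
```

The first list should contain roughly half as many items as the second, since every second word loses its leading space to the previous match.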
Finally... s@@5%%suchstro is probably not a word, is it? Maybe you need \w, word characters, in place of the dot (.).
regexp(str, '(?<=\s)\<\w*?\>(?=\s)', 'match')
'body' 'or' 'collection' 'of' 'such' 'stories'
In this form, the lazy/greedy distinction is no longer an issue, so the expression can be simplified to (?<=\s)\<\w*\>(?=\s)
or even to (?<=\s)\w*(?=\s)
since the surrounding whitespace already provides the word boundaries.
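Putting the simplified pattern to work might look like the sketch below. Two things here are my own assumptions, not part of the answer above: \w+ is used instead of \w* so that an empty match is never a candidate, and the second pattern swaps in negative lookarounds on \S as one possible way to also accept the first and last words, in case that is what you are after.

```matlab
str = 'A body or collection of such stories s@@5%%suchstro end';

% Words made only of word characters with whitespace on both sides.
% The first and last words are skipped because they have no space
% on one side.
innerWords = regexp(str, '(?<=\s)\w+(?=\s)', 'match');

% Assumed variant: "not preceded by a non-space" and "not followed by
% a non-space" are also true at the very start and end of the string.
allWords = regexp(str, '(?<!\S)\w+(?!\S)', 'match');

disp(innerWords);
disp(allWords);
```

Either way, s@@5%%suchstro is left out, because no run of word characters inside it has whitespace (or a string edge) on both sides.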