How do I use gsub in R for exact string matching?

Problem description:

raw = c("MOUNTAIN VIEW","MOUNTAIN")
x = gsub("MOUNTAIN", "MOUNTAIN VIEW", raw, ignore.case = TRUE)

Current output: "MOUNTAIN VIEW VIEW" "MOUNTAIN VIEW"  
Desired output:  "MOUNTAIN VIEW" "MOUNTAIN VIEW"  

I only want to replace the second entry in the raw data, MOUNTAIN, with MOUNTAIN VIEW. The first entry is already correct. But when I call gsub, it replaces every occurrence of MOUNTAIN with MOUNTAIN VIEW, including the one inside the first entry. Can anyone help me find a way around that?

I tried \\b, but it didn't work, and I understand why. Is there anything else I can do?

Use anchors instead, so that the pattern matches only when it spans the entire string:

sub('^MOUNTAIN$', 'MOUNTAIN VIEW', raw, ignore.case = TRUE)
# [1] "MOUNTAIN VIEW" "MOUNTAIN VIEW"
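To see why the anchors make the difference, here is a small sketch that runs the unanchored pattern, the \\b attempt, and the anchored pattern on the same vector. The variable names (unanchored, bounded, anchored) are only for illustration and are not from the question.

```r
raw <- c("MOUNTAIN VIEW", "MOUNTAIN")

# Unanchored: also matches the MOUNTAIN inside "MOUNTAIN VIEW"
unanchored <- gsub("MOUNTAIN", "MOUNTAIN VIEW", raw, ignore.case = TRUE)

# Word boundaries: MOUNTAIN is still a whole word in "MOUNTAIN VIEW",
# so \\b does not stop the match in the first element
bounded <- gsub("\\bMOUNTAIN\\b", "MOUNTAIN VIEW", raw, ignore.case = TRUE)

# Anchored: ^ and $ force the match to cover the entire string
anchored <- sub("^MOUNTAIN$", "MOUNTAIN VIEW", raw, ignore.case = TRUE)

print(unanchored)
print(bounded)
print(anchored)
```

Only the anchored call leaves the first element untouched, because "MOUNTAIN VIEW" does not end right after MOUNTAIN.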

If you prefer, you can also use a capturing group and backreference it in the replacement:

sub('^(MOUNTAIN)$', '\\1 VIEW', raw, ignore.case = TRUE)
# [1] "MOUNTAIN VIEW" "MOUNTAIN VIEW"
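One practical difference between the two forms shows up when ignore.case = TRUE meets mixed-case input: the backreference puts back whatever text was actually matched, while the literal replacement always writes MOUNTAIN VIEW. A minimal sketch, assuming a hypothetical vector mixed that is not part of the original data:

```r
# Hypothetical input with a mixed-case entry (not from the question)
mixed <- c("MOUNTAIN VIEW", "Mountain")

# Literal replacement: the matched text is discarded
literal <- sub("^MOUNTAIN$", "MOUNTAIN VIEW", mixed, ignore.case = TRUE)

# Backreference: \\1 re-inserts the text as it appeared in the input
backref <- sub("^(MOUNTAIN)$", "\\1 VIEW", mixed, ignore.case = TRUE)

print(literal)
print(backref)
```

Which one is preferable depends on whether the original casing should be normalized or kept.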