Replace multiple matches of a single pattern in a string with a sequence of different replacements
Using the stringr package, it is easy to perform regex replacement in a vectorized manner.
Question: how can I replace every word in
hello,world??your,make|[]world,hello,pos
with a different replacement each, e.g. increasing numbers, to get
1,2??3,4|[]5,6,7
Note that simple separators cannot be assumed; the practical use case is more complicated.
stringr::str_replace_all does not seem to work here. A call such as
str_replace_all(x, "(\\w+)", 1:7)
produces a vector with one result per replacement, each applied to all words. And because the input has unknown and/or duplicate entries, a named vector such as
str_replace_all(x, c("hello" = "1", "world" = "2", ...))
will not serve the purpose either.
Here's another idea using gsubfn. The pre function is run before the substitutions, and the fun function is run for each substitution:
library(gsubfn)
x <- "hello,world??your,make|[]world,hello,pos"
p <- proto(pre = function(t) t$v <- 0,        # initialize the counter v before the substitutions
           fun = function(t, x) t$v <- v + 1) # increment v; the new value replaces the match
gsubfn("\\w+", p, x)
giving:
[1] "1,2??3,4|[]5,6,7"
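The counter logic above can also be sketched without gsubfn. The following is a base R illustration, not part of the original answer: it locates every \\w+ match with gregexpr() and assigns increasing numbers through the replacement form of regmatches(). The names m and n are chosen here for illustration only.

```r
x <- "hello,world??your,make|[]world,hello,pos"

# Locate every run of word characters in x
m <- gregexpr("\\w+", x)

# Count the matches found in the first (and only) element of x
n <- length(regmatches(x, m)[[1]])

# Replace the i-th match with the number i. The replacement value is a
# list holding one character vector per element of x.
regmatches(x, m) <- list(as.character(seq_len(n)))

x
```

Since the replacement vector is built separately from the matching step, any other sequence of n strings could be supplied in place of the numbers.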
This variation gives the same answer, since gsubfn maintains a count variable for use in proto functions:
pp <- proto(fun = function(...) count)
gsubfn("\\w+", pp, x)
See the gsubfn vignette for examples of using count.
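As a small illustration of count, the sketch below keeps each matched word and appends its position in the string. It assumes, as in the variation above, that gsubfn makes count visible inside fun. The names pc and res are made up for this example.

```r
library(gsubfn)

x <- "hello,world??your,make|[]world,hello,pos"

# fun receives the proto object and the matched text; count is the
# running match number that gsubfn maintains for proto functions
pc <- proto(fun = function(this, x) paste0(x, "{", count, "}"))

res <- gsubfn("\\w+", pc, x)
res
```

Because fun has access to both the match and its index, the same pattern could look up the replacement in a vector indexed by count.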