Subsetting a data frame in R using two criteria, one of which is a regular expression
I have a dataset something like this:
col_a col_b col_c
1 abc_boy 1
2 abc_boy 2
1 abc_girl 1
2 abc_girl 2
I need to pick only the first row, based on col_b and col_c, and then change the value in col_c, with something like this:
df[grep("_boy$", df[,"col_b"]) & df[,"col_c"]=="1", "col_c"] <- "yes"
But the code above does not work, since the first criterion and the second criterion do not originate from the same set.
I can do it in a dumb way by using an explicit loop, or do a "two-tier" subsetting, something like this:
df.a <- df[grep("_boy$",df[,"col_b"]),] #1
df.b <- df[grep("_boy$",df[,"col_b"],invert=TRUE),] #2
df.a[df.a[,"col_c"]=="1","col_c"] <- "yes" #3
df.a[df.a[,"col_c"]=="2","col_c"] <- "no" #4
df <- rbind(df.a,df.b) #5
But I prefer not to. Can anyone enlighten me on how to "merge" #1 and #3? Thanks.
Try grepl instead of grep. grepl returns a logical vector (match or not for each element of x), which can be combined with logical operators.
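A minimal sketch of how that might look for the data in the question. The data frame is rebuilt here for illustration, the helper name is_boy is made up, and col_c is assumed to hold character values; adjust if your column is numeric. grep returns the positions of the matches (here 1 and 2), whereas grepl returns one TRUE/FALSE per row, so only the latter lines up with the comparison on col_c.

```r
# Example data mirroring the table in the question
# (col_c stored as character: an assumption)
df <- data.frame(
  col_a = c(1, 2, 1, 2),
  col_b = c("abc_boy", "abc_boy", "abc_girl", "abc_girl"),
  col_c = c("1", "2", "1", "2"),
  stringsAsFactors = FALSE
)

# grep() gives match positions, not one value per row
grep("_boy$", df$col_b)

# grepl() gives a logical vector with one element per row
is_boy <- grepl("_boy$", df$col_b)

# Both conditions are now logical vectors of the same length,
# so they can be combined with & and used as a row index
df[is_boy & df$col_c == "1", "col_c"] <- "yes"
df[is_boy & df$col_c == "2", "col_c"] <- "no"

df
```

This does in a single step what #1 and #3 did separately, and there is no need to split the data frame and rbind it back together.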