Replace values with NA based on a condition from other columns:
I am new to the data.table package, so please excuse my simple question. I have a data set that looks like DT:
library(data.table)
DT <- data.table(a = sample(c("C","M","Y","K"), 100, replace = TRUE),
                 b = sample(c("A","S"), 100, replace = TRUE),
                 f = round(rnorm(n = 100, mean = .90, sd = .08), digits = 2))
DT
I would like to replace any value in column f with NA if it meets a certain condition, for example when f falls outside the range 0.85 to 0.90 (f < 0.85 or f > 0.90).
I would have the following condition:
DT$a == "C" & DT$b == "S" & DT$f < .85| DT$a == "C" & DT$b == "S" & DT$f >.90
I would also like to have a different condition for each of the categorical variables in columns a and b.
Using the condition you've stated, but without the DT$ prefix, will subset your data.table to the entries that satisfy the condition. You can then use the j field to assign NA to f by reference with the := operator. That is,
DT[a == "C" & b == "S" & f < .85 | a == "C" & b == "S" & f >.90, f := NA]
which(is.na(DT$f))
# [1] 3 16 31 89
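The row numbers above depend on the random draw, so they will differ from run to run. A minimal reproducible sketch of the same idea follows. The seed, the `orig` copy, and the `hit` vector are assumptions added for illustration, and the condition is written in its factored form, which is equivalent because `&` binds more tightly than `|`.

```r
library(data.table)

set.seed(1)  # assumption: a seed is added only so the result is reproducible
DT <- data.table(a = sample(c("C", "M", "Y", "K"), 100, replace = TRUE),
                 b = sample(c("A", "S"), 100, replace = TRUE),
                 f = round(rnorm(100, mean = 0.90, sd = 0.08), 2))
orig <- copy(DT)  # untouched copy, kept only for comparison

# i selects the rows, j assigns NA to f by reference
DT[a == "C" & b == "S" & (f < 0.85 | f > 0.90), f := NA_real_]

# rows that the condition selected, computed on the untouched copy
hit <- orig[, a == "C" & b == "S" & (f < 0.85 | f > 0.90)]
```

Using `NA_real_` rather than plain `NA` keeps the replacement the same type as the numeric column f.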
Edit: after OP's comment and @Joshua's nice suggestion:
`%between%` <- function(x, vals) { x >= vals[1] & x <= vals[2]}
`%nbetween%` <- Negate(`%between%`)
DT[a %in% c("C", "M", "Y", "K") & b == "S" & f %nbetween% c(0.85, 0.90), f := NA]
%nbetween%, which is the negation of %between%, will give the desired result (f < 0.85 or f > 0.90). Also note the use of %in% to check for multiple values of a.
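To see what the negated operator returns, here is a small sketch on a hand-picked vector (the values in `f` are made up for illustration). It also compares the result with data.table's own `between()`, which uses inclusive bounds by default, so `!between(...)` should behave the same way here.

```r
library(data.table)

f <- c(0.80, 0.85, 0.88, 0.90, 0.95)  # illustrative values only

# hand-rolled operators, as defined in the answer
`%between%`  <- function(x, vals) x >= vals[1] & x <= vals[2]
`%nbetween%` <- Negate(`%between%`)

# TRUE where f lies outside the closed range [0.85, 0.90]
outside <- f %nbetween% c(0.85, 0.90)

# same test using data.table's built-in between()
outside_dt <- !data.table::between(f, 0.85, 0.90)
```

Both bounds are inclusive, so values exactly equal to 0.85 or 0.90 are kept rather than replaced.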
Edit 2: Following OP's complete re-write, I'm afraid there's not much you can do, except group b == "A" and b == "S" together with %in%.
`%nbetween%` <- Negate(`%between%`)
DT[a == "M" & b %in% c("A", "S") & f %nbetween% c(.85, .90), f := NA]
DT[a == "Y" & b %in% c("A", "S") & f %nbetween% c(.90, .95), f := NA]  # bounds must be c(lower, upper)
DT[a == "K" & b %in% c("A", "S") & f %nbetween% c(.95, 1.10), f := NA]
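If the number of categories grows, one possible alternative to writing one line per level is to keep the bounds in a small lookup table and join it onto DT. This is only a sketch: the `limits` table, its column names `lo` and `hi`, and the seed are hypothetical, and the range for "Y" is assumed to be 0.90 to 0.95.

```r
library(data.table)

set.seed(1)  # assumption: seed added for reproducibility
DT <- data.table(a = sample(c("C", "M", "Y", "K"), 100, replace = TRUE),
                 b = sample(c("A", "S"), 100, replace = TRUE),
                 f = round(rnorm(100, mean = 0.90, sd = 0.08), 2))

# hypothetical lookup table: allowed range of f for each level of a
limits <- data.table(a  = c("M", "Y", "K"),
                     lo = c(0.85, 0.90, 0.95),
                     hi = c(0.90, 0.95, 1.10))

# update join: copy the bounds onto the matching rows of DT
DT[limits, on = "a", c("lo", "hi") := .(i.lo, i.hi)]

# blank out-of-range values; rows without bounds (a == "C") give NA in i
# and are therefore left alone
DT[f < lo | f > hi, f := NA_real_]

# drop the helper columns again
DT[, c("lo", "hi") := NULL]
```

A second key column (for example b) could be added to `limits` and to `on =` if the bounds should also differ by b.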