Conditionally calculating the number of values in a column with R
I have two vectors:
x <- c(1,1,1,1,1, 2,2,2,3,3, 3,3,3,4,4, 5,5,5,5,5 )
y <- c(2,2,1,3,2, 1,4,2,2,NA, 3,3,3,4,NA, 1,4,4,2,NA)
This question (Conditional calculating the numbers of values in column with R, part2) discussed how to find the number of values in w (not counting NA) for each x (from 1 to 5) and for each y (from 1 to 4).
Let's split x into groups: if x <= 2, group I; if 2 < x <= 3, group II; and if 3 < x <= 5, group III. I need to find the number of distinct values of x by group and by every value of y. I also need the mean of those x values by the same groups. The output should be in this format:
y x Result 1 (the number of distinct values in x); Result 2 (the mean)
1 I ...
1 II ...
1 III ...
...
4 I ...
4 II ...
4 III ...
# Load the data.table package
library(data.table)
data <- data.table(x, y)
# Mean of x for each combination of y and x group, sorted by y and group
data[, list(x = mean(x, na.rm = TRUE)),
     by = list(y, x.grp = cut(x, c(-Inf, 2, 3, 5, Inf)))][order(y, x.grp)]
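The snippet above returns only the mean, while the question also asks for the number of distinct x values in each group. A minimal sketch of one way to get both, assuming data.table's `uniqueN()` for the distinct count and assuming rows with a missing y should be dropped; the names `dt`, `res`, `n_distinct`, `mean_x` and the labels I/II/III are illustrative choices, not part of the original answer:

```r
library(data.table)

x <- c(1,1,1,1,1, 2,2,2,3,3, 3,3,3,4,4, 5,5,5,5,5)
y <- c(2,2,1,3,2, 1,4,2,2,NA, 3,3,3,4,NA, 1,4,4,2,NA)
dt <- data.table(x, y)

# Label the x ranges (-Inf,2], (2,3], (3,5] as groups I, II, III
dt[, x.grp := cut(x, breaks = c(-Inf, 2, 3, 5), labels = c("I", "II", "III"))]

# Assumption: rows where y is NA are excluded before summarising
res <- dt[!is.na(y),
          .(n_distinct = uniqueN(x),            # Result 1: distinct x values
            mean_x = mean(x, na.rm = TRUE)),    # Result 2: mean of x
          by = .(y, x.grp)][order(y, x.grp)]
print(res)
```

Combinations of y and group that never occur in the data (for example y = 1 with group II) do not appear in `res`; they would need to be added separately if the output must list every combination.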
If you'd like the result to be NA when NAs are present, just remove na.rm=TRUE from mean(.):
data[, list(x = mean(x)), by =
list(y, x.grp = cut(x, c(-Inf,2,3,5,Inf)))][order(y,x.grp)]
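As a small base R illustration of what `na.rm` changes, the sketch below uses a made-up vector `v` (hypothetical, not from the question) and then checks where the missing values in the example data actually sit:

```r
x <- c(1,1,1,1,1, 2,2,2,3,3, 3,3,3,4,4, 5,5,5,5,5)
y <- c(2,2,1,3,2, 1,4,2,2,NA, 3,3,3,4,NA, 1,4,4,2,NA)

# Hypothetical vector containing one missing value
v <- c(3, NA, 5)
m_keep <- mean(v)                # NA propagates into the result
m_drop <- mean(v, na.rm = TRUE)  # NA is dropped before averaging

# In the example data only y contains NA, x does not
x_has_na <- anyNA(x)
y_has_na <- anyNA(y)
print(c(m_keep, m_drop))
print(c(x_has_na, y_has_na))
```

Because x itself has no missing values here, both versions of the call give the same means for this data; the missing y values show up as their own group in the `by` result rather than affecting `mean(x)`.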