Conditionally counting the number of values in a column with R

Question:

I have two vectors:

x <- c(1,1,1,1,1, 2,2,2,3,3,  3,3,3,4,4,  5,5,5,5,5 )
y <- c(2,2,1,3,2, 1,4,2,2,NA, 3,3,3,4,NA, 1,4,4,2,NA)

This question (Conditional calculating the numbers of values in column with R, part2) discussed how to find the number of values in w (not counting NA) for each x (from 1 to 5) and for each y (from 1 to 4).

Let's split x into groups: if x <= 2, group I; if 2 < x <= 3, group II; and if 3 < x <= 5, group III. I need to find the number of distinct values of x by group and by every value of y. I also need to find the mean of those x values within the same groups. The output should be in this format:

y x    Result 1 (the number of distinct numbers in X); Result 2 (the mean)
1 I     ...
1 II    ...
1 III   ...     
...
4 I     ...
4 II    ...
4 III   ...

Answer

# Load the data.table package
library(data.table)
data <- data.table(x, y)

# Mean of x for each value of y within each x group;
# cut() bins x into (-Inf,2], (2,3], (3,5] and (5,Inf]
data[, list(x = mean(x, na.rm=TRUE)), by = 
       list(y, x.grp = cut(x, c(-Inf,2,3,5,Inf)))][order(y,x.grp)]
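
The question also asks for the number of distinct x values per group, which the snippet above does not compute. The following is a minimal sketch of one way to get both results. It assumes that data.table's `uniqueN()` is an acceptable way to count distinct values and that rows with a missing y should be dropped. The column names `n_distinct` and `mean_x` and the labels I, II and III are illustrative choices, not part of the original answer.

```r
library(data.table)

x <- c(1,1,1,1,1, 2,2,2,3,3,  3,3,3,4,4,  5,5,5,5,5)
y <- c(2,2,1,3,2, 1,4,2,2,NA, 3,3,3,4,NA, 1,4,4,2,NA)
dt <- data.table(x, y)

# Drop rows with missing y (an assumption), then group by y and by x interval.
# The breaks give (-Inf,2], (2,3], (3,5], labelled here as groups I, II, III.
res <- dt[!is.na(y),
          list(n_distinct = uniqueN(x),   # Result 1: number of distinct x values
               mean_x     = mean(x)),     # Result 2: mean of x in the group
          by = list(y, x.grp = cut(x, c(-Inf, 2, 3, 5),
                                   labels = c("I", "II", "III")))
          ][order(y, x.grp)]
print(res)
```

Only combinations of y and group that actually occur in the data appear in the result, so a row such as y = 1, group II is absent rather than reported as zero.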

If you'd like the result to be NA whenever NAs are present, just remove na.rm=TRUE from mean(.):

data[, list(x = mean(x)), by = 
       list(y, x.grp = cut(x, c(-Inf,2,3,5,Inf)))][order(y,x.grp)]
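
With the vectors from the question, x itself contains no missing values, so the two versions should give the same means; the NAs in y show up as groups of their own instead. The sketch below is one way to check this, assuming the x and y defined in the question. The helper `grp` and the column name `m` are hypothetical names introduced for illustration.

```r
library(data.table)

x <- c(1,1,1,1,1, 2,2,2,3,3,  3,3,3,4,4,  5,5,5,5,5)
y <- c(2,2,1,3,2, 1,4,2,2,NA, 3,3,3,4,NA, 1,4,4,2,NA)
dt <- data.table(x, y)

# Hypothetical helper: the same binning as in the answer
grp <- function(v) cut(v, c(-Inf, 2, 3, 5, Inf))

with_rm    <- dt[, list(m = mean(x, na.rm = TRUE)),
                 by = list(y, x.grp = grp(x))][order(y, x.grp)]
without_rm <- dt[, list(m = mean(x)),
                 by = list(y, x.grp = grp(x))][order(y, x.grp)]

# x has no NAs, so dropping na.rm does not change the means
same_means <- isTRUE(all.equal(with_rm$m, without_rm$m))
print(same_means)

# Rows where y is NA are grouped separately
na_groups <- with_rm[is.na(y)]
print(na_groups)
```

If those NA groups are unwanted, filtering with `!is.na(y)` before grouping is one option.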
