How to group by all but one column?
How do I tell group_by to group the data by all columns except a given one?
With aggregate, it would be aggregate(x ~ ., ...).
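For reference, here is a minimal sketch of the aggregate form mentioned above. The data frame df and its columns are made up for illustration and do not come from the question.

```r
# Hypothetical toy data: x is the value column, a and b are the other columns
df <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  a = c("p", "p", "q", "q", "p", "q"),
  b = c("u", "u", "u", "v", "v", "v")
)

# The dot on the right-hand side stands for every column not named on the
# left, so this takes the mean of x within each combination of a and b
res <- aggregate(x ~ ., data = df, FUN = mean)
print(res)
```

The goal of the question is the dplyr equivalent of that dot: "group by everything except x".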
I tried group_by(data, -x), but that groups by the negative of x (i.e. the same as grouping by x).
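A small sketch of why this happens, assuming a made-up data frame df (the names df, x, a and y are hypothetical): group_by evaluates -x as an expression and groups by the computed result, rather than treating the minus sign as "drop this column".

```r
library(dplyr)

# Hypothetical toy data; x is the column we would like to leave out
df <- data.frame(x = c(1, 1, 2), a = c("p", "q", "q"), y = c(10, 20, 30))

# group_by() computes -x as a new column and groups by that column
g <- group_by(df, -x)

group_vars(g)  # a single grouping variable: the computed column
n_groups(g)    # one group per distinct value of x
```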
You can do this using standard evaluation (group_by_ instead of group_by):
library(dplyr)

# Fake data
set.seed(492)
# Note: g3 is a length-10 sample that data.frame() recycles to 1000 rows
dat = data.frame(value=rnorm(1000), g1=sample(LETTERS,1000,replace=TRUE),
                 g2=sample(letters,1000,replace=TRUE), g3=sample(1:10, replace=TRUE),
                 other=sample(c("red","green","black"),1000,replace=TRUE))

# Group by every column whose name does not match "value"
dat %>% group_by_(.dots=names(dat)[-grep("value", names(dat))]) %>%
  summarise(meanValue=mean(value))
g1 g2 g3 other meanValue
<fctr> <fctr> <int> <fctr> <dbl>
1 A a 2 green 0.89281475
2 A b 2 red -0.03558775
3 A b 5 black -1.79184218
4 A c 10 black 0.17518610
5 A e 5 black 0.25830392
...
See this vignette for more on standard vs. non-standard evaluation in dplyr.
To address @ÖmerAn's comment: it looks like group_by_at is the way to go in dplyr 0.7.0 (someone please correct me if I'm wrong about this). For example:
dat %>%
  group_by_at(setdiff(names(dat), "value")) %>%
  summarise(meanValue=mean(value))
# Groups: g1, g2, g3 [?]
g1 g2 g3 other meanValue
<fctr> <fctr> <int> <fctr> <dbl>
1 A a 2 green 0.89281475
2 A b 2 red -0.03558775
3 A b 5 black -1.79184218
4 A c 10 black 0.17518610
5 A e 5 black 0.25830392
6 A e 5 red -0.81879788
7 A e 7 green 0.30836054
8 A f 2 green 0.05537047
9 A g 1 black 1.00156405
10 A g 10 black 1.26884303
# ... with 949 more rows
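As a possible variation, group_by_at also accepts a vars() selection, so the excluded column can be named directly instead of being removed with setdiff. The sketch below uses a small made-up data frame (toy is hypothetical, not the dat from this answer):

```r
library(dplyr)

# Hypothetical toy data with one value column and two grouping columns
toy <- data.frame(
  value = c(1, 2, 3, 4),
  g1 = c("A", "A", "B", "B"),
  g2 = c("a", "a", "a", "b")
)

# vars(-value) selects every column except value, as select() would
g <- toy %>% group_by_at(vars(-value))

group_vars(g)
```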
Let's confirm both methods give the same output (in dplyr 0.7.0):
new = dat %>%
  group_by_at(setdiff(names(dat), "value")) %>%
  summarise(meanValue=mean(value))

old = dat %>%
  group_by_(.dots=names(dat)[-grep("value", names(dat))]) %>%
  summarise(meanValue=mean(value))

identical(old, new)
# [1] TRUE
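For readers on newer versions: in dplyr 1.0.0 and later, group_by_ is deprecated and the scoped verbs such as group_by_at are superseded by across(). A minimal sketch of the same idea with across(), again on a made-up data frame (toy is hypothetical):

```r
library(dplyr)

# Hypothetical toy data with one value column and two grouping columns
toy <- data.frame(
  value = c(1, 2, 3, 4),
  g1 = c("A", "A", "B", "B"),
  g2 = c("a", "a", "a", "b")
)

# across(-value) selects every column except value as grouping columns
res <- toy %>%
  group_by(across(-value)) %>%
  summarise(meanValue = mean(value), .groups = "drop")

print(res)
```

Setting .groups = "drop" returns an ungrouped result, which avoids carrying leftover grouping into later steps.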