Calculating multiple variances of a data set in R
Question:
My question is related to this question.
My data looks like this:
V1 V2
.. 1
.. 2
.. 1
.. 3
I need to calculate the variance of the data in V1 for each value of V2, cumulatively. (This means that for a particular value of V2, say n, all the rows of V1 whose corresponding V2 is less than n need to be included.)
Will ddply help in such a case?
Answer:
I don't think ddply will help, since it is built on the concept of taking non-overlapping subsets of a data frame.
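# Simulated example data: 1000 values of V1, with V2 drawn from 1..10
d <- data.frame(V1 = runif(1000), V2 = sample(1:10, size = 1000, replace = TRUE))
u <- sort(unique(d$V2))
# For each distinct V2 value x, take the variance of all V1 rows with V2 <= x
# (use V2 < x instead if the cutoff should be strict)
ans <- sapply(u, function(x) {
  with(d, var(V1[V2 <= x]))
})
names(ans) <- u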
I don't know if there's a more efficient way to do this ...
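One possible way to avoid re-scanning the data for every value of V2 is to accumulate per-group counts, sums, and sums of squares, then apply the identity var = (SS - S^2/N) / (N - 1). The sketch below is only an illustration under a few assumptions. It uses the same simulated data as the answer, with set.seed() added purely for reproducibility. It uses the inclusive V2 <= n cutoff from the sapply code. The name cum_var is made up for this example. This one-pass formula can lose precision when the mean is large relative to the spread, so it may be worth checking it against the sapply version on real data.

```r
# Same simulated data as in the answer; set.seed() is an addition for reproducibility.
set.seed(1)
d <- data.frame(V1 = runif(1000), V2 = sample(1:10, size = 1000, replace = TRUE))

# Per-group count, sum, and sum of squares of V1, ordered by increasing V2.
n  <- tapply(d$V1, d$V2, length)
s  <- tapply(d$V1, d$V2, sum)
ss <- tapply(d$V1^2, d$V2, sum)

# Running totals over all groups with V2 <= the current value.
N  <- cumsum(n)
S  <- cumsum(s)
SS <- cumsum(ss)

# Sample variance from the running totals (hypothetical name cum_var).
# Undefined wherever N is 1, since the denominator is then zero.
cum_var <- (SS - S^2 / N) / (N - 1)
```

Each row is touched a fixed number of times here, whereas the sapply version subsets the whole data frame once per distinct value of V2.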