Split a total row into individual rows in R
I have a data set in which the annual total for a given value is stored in a single row (observation). I'd like to tidy the data in R so that each annual row is broken out into 12 monthly rows, each holding the annual total divided by 12 (total/12). I'm a beginner in R and not sure where to start. An example is below:
Date | Total
2015 | 12,000
Some R function to convert to:
Date | Total
01-01-2015 | 1,000
02-01-2015 | 1,000
03-01-2015 | 1,000
Here is an example dataset with multiple years, which I presume is your intended use case:
myData <-
data.frame(
Date = 2011:2015
, Total = (1:5)*1200
)
Then I split the data.frame by the `Date` column (this assumes the years are unique), loop over the pieces with `lapply` to build the monthly values you wanted, and stitch them back together with `bind_rows` from `dplyr`. (You also need to load `dplyr` or `magrittr` for the pipe, `%>%`, to work.)
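library(dplyr)  # provides bind_rows() and the %>% pipe

myData %>%
  split(.$Date) %>%
  lapply(function(x){
    # Build 12 rows for this year, each dated the first of a month
    data.frame(
      Date = as.Date(paste(x$Date
                           , formatC(1:12, width = 2, flag = "0")
                           , "01"
                           , sep = "-"))
      , Total = x$Total / 12
    )
  }) %>%
  bind_rows()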
Gives (note: `...` indicates skipped rows):
Date Total
1 2011-01-01 100
2 2011-02-01 100
3 2011-03-01 100
...
16 2012-04-01 200
17 2012-05-01 200
18 2012-06-01 200
...
29 2013-05-01 300
30 2013-06-01 300
31 2013-07-01 300
...
43 2014-07-01 400
44 2014-08-01 400
45 2014-09-01 400
...
58 2015-10-01 500
59 2015-11-01 500
60 2015-12-01 500
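For comparison, the same expansion can be sketched in base R without `dplyr`, by repeating each row index 12 times instead of splitting and re-binding. This is only an alternative sketch, not the method used above; the names `idx` and `monthly` are introduced here for illustration, and it assumes `Date` holds whole-number years.

```r
# Same example data as myData above
myData <- data.frame(Date = 2011:2015, Total = (1:5) * 1200)

# Row index repeated 12 times per year: 1,1,...,1,2,2,...
idx <- rep(seq_len(nrow(myData)), each = 12)

# Month numbers 1..12, cycled once per year
monthNum <- rep(1:12, times = nrow(myData))

monthly <- data.frame(
  # Build "YYYY-MM-01" strings and convert them to Date
  Date = as.Date(sprintf("%d-%02d-01", myData$Date[idx], monthNum)),
  # Spread each annual total evenly over its 12 months
  Total = myData$Total[idx] / 12
)

head(monthly, 3)
```

Because everything is vectorised, this avoids the per-year loop, at the cost of being less easy to adapt when each year needs different handling.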
If, as suggested in the comments, you need to divide by a different number of months in some years, I would create a vector of divisors named by year. Here I use `n_distinct` from `dplyr` and the `magrittr` pipe to make naming the common case easier. Then add or change the entry for any year that differs:
toDivide <-
rep(12, n_distinct(myData$Date)) %>%
setNames(sort(unique(myData$Date)))
toDivide["2016"] <- 9
Then, instead of `12` as above, you can use this vector. So, replace this:
Total = x$Total / 12
within the `lapply` loop, with this:
Total = x$Total / toDivide[as.character(x$Date)]
and it should divide by the entry for that year instead of always dividing by the same number.
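Putting the pieces together, a self-contained sketch of the per-year divisor might look like the following. It uses base R (`do.call(rbind, ...)`) in place of `bind_rows` so that it runs without extra packages. Two things here are assumptions rather than part of the answer above: the 2016 row with a total of 900 is made-up example data, and the sketch emits only as many monthly rows as the divisor (the first 9 months for 2016) so that the monthly values still sum to the annual total.

```r
# Example data; the 2016 partial year is hypothetical
myData <- data.frame(Date = 2011:2016, Total = c((1:5) * 1200, 900))

# Default divisor of 12 for every year, named by year
toDivide <- setNames(rep(12, length(unique(myData$Date))),
                     sort(unique(myData$Date)))
toDivide["2016"] <- 9  # assumption: 2016 covers only 9 months

monthly <- do.call(rbind, lapply(split(myData, myData$Date), function(x) {
  # Look up this year's divisor by name
  nMonths <- toDivide[[as.character(x$Date)]]
  data.frame(
    # Assumption: the covered months are the first nMonths of the year
    Date = as.Date(sprintf("%d-%02d-01", x$Date, seq_len(nMonths))),
    Total = x$Total / nMonths
  )
}))
rownames(monthly) <- NULL  # drop the "2011.1"-style row names from rbind

tail(monthly, 3)
```

If you would rather keep 12 rows for every year and only change the value, use `1:12` in place of `seq_len(nMonths)`; the monthly values for that year will then no longer add up to the annual total.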