R根据行中的值在列中重复

问题描述:

我有一个如下所示的数据框:

I have a dataframe like the following:

Name    School   Weight Days
Antoine Bach     0.03   5
Antoine Ken      0.02   7
Barbara Franklin 0.04   3

我想获得如下输出:

Name    School   1    2    3    4    5    6    7
Antoine Bach     0.03 0.03 0.03 0.03 0.03 NA   NA
Antoine Ken      0.02 0.02 0.02 0.02 0.02 0.02 0.02
Barbara Franklin 0.04 0.04 0.04 NA   NA   NA   NA

可重现的样本数据:

df <- tribble(
  ~Name,    ~School,   ~Weight, ~Days,
  "Antoine", "Bach",     0.03,   5,
  "Antoine", "Ken",      0.02,   7,
  "Barbara", "Franklin", 0.04,   3
)

使用 data.table 你可以通过rep吃掉Weight来创建一个长版本Days 每行的次数,然后 dcast 转换为宽格式,以新变量的 rowid 作为列.

Using data.table you can create a long version by repeating the Weight value Days number of times for each row, then dcasting to a wide format with the rowidof the new variable as the column.

library(data.table)
setDT(df)

dcast(df[, .(rep(Weight, Days)), .(Name, School)], 
      Name + School ~ rowid(V1))

# Name   School    1    2    3    4    5    6    7
# 1: Antoine     Bach 0.03 0.03 0.03 0.03 0.03   NA   NA
# 2: Antoine      Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
# 3: Barbara Franklin 0.04 0.04 0.04   NA   NA   NA   NA

您也可以rep Weight Days 的数量,然后rep NA 足够的次数来完成该行.

You could also rep Weight the number of Days, then rep NA enough times to complete the row.

max_days <- max(df$Days) 

df[, as.list(rep(c(Weight, NA), c(Days, max_days - Days))), 
   .(Name, School)]

# Name   School   V1   V2   V3   V4   V5   V6   V7
# 1: Antoine     Bach 0.03 0.03 0.03 0.03 0.03   NA   NA
# 2: Antoine      Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
# 3: Barbara Franklin 0.04 0.04 0.04   NA   NA   NA   NA