How to replace the first n1 rows and last n2 rows with NA in specific columns

Problem description:

Given the following example:

library(data.table)
mat <- data.table(x = c(1:10), y = c(11:20), z = c(21:30))

cut.head <- c(0, 2, 1) 
cut.tail <- c(3, 1, 2) 

cut.head is the number of rows, counted from the top, that should become NA in each column.

cut.tail is the number of rows, counted from the bottom, that should become NA in each column.

For example, with cut.head, the 1st and 2nd rows of column y will be NA, as well as the 1st row of column z.

I would like the result to look like this:

     x  y  z
 1:  1 NA NA
 2:  2 NA 22
 3:  3 13 23
 4:  4 14 24
 5:  5 15 25
 6:  6 16 26
 7:  7 17 27
 8: NA 18 28
 9: NA 19 NA
10: NA NA NA

Thanks

I'd just use a for loop with := (or set()) so it's fast and (fairly) easy to read.

> for (i in 1:3) mat[seq_len(cut.head[i]), (i):=NA]
> mat
     x  y  z
 1:  1 NA NA
 2:  2 NA 22
 3:  3 13 23
 4:  4 14 24
 5:  5 15 25
 6:  6 16 26
 7:  7 17 27
 8:  8 18 28
 9:  9 19 29
10: 10 20 30

Notice that the LHS of := accepts column numbers as well as names. As an aside, this is valid:

DT[, 2:=2]   # assign 2 to column 2

Wrapping the LHS of := in parentheses, (i):=NA, tells it to use the variable's value rather than its name.
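
To make the difference between (i):= and a bare name concrete, here is a minimal sketch. The table DT and the variable col are made up for illustration and are not part of the question's data.

```r
library(data.table)

DT <- data.table(a = 1:3, b = 4:6)   # hypothetical example table
col <- "b"                            # hypothetical variable holding a column name

# Parentheses: the LHS is evaluated, so this assigns to column b
DT[, (col) := NA_integer_]

# No parentheses: the LHS is taken as a name, so this adds a column called "col"
DT[, col := 0L]

names(DT)
```

After both statements DT should have the columns a, b and col, with b entirely NA. The same rule applies when the variable holds a column number, as in the loop above.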

For the tail I first tried the following, but .N wasn't available in i. I added that as a feature request, FR#724.
UPDATE: Now added to v1.9.3 on 11 Jul 2014.

for (i in 1:3) mat[.N+1-seq_len(cut.tail[i]), (i):=NA]
# .N now works in i
> mat
     x  y  z
 1:  1 NA NA
 2:  2 NA 22
 3:  3 13 23
 4:  4 14 24
 5:  5 15 25
 6:  6 16 26
 7:  7 17 27
 8: NA 18 28
 9: NA 19 NA
10: NA NA NA

With .N available in i, we no longer have to live with a repetition of the symbol mat, as in the earlier workaround:

> for (i in 1:3) mat[nrow(mat)+1-seq_len(cut.tail[i]), (i):=NA]
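
The answer mentions set() as an alternative to := but does not show it. The following is one possible sketch, not the answer's own code, that handles head and tail in a single loop. It rebuilds mat from the question so it can be run on its own; the names n, j and rows are introduced here for illustration.

```r
library(data.table)

mat <- data.table(x = 1:10, y = 11:20, z = 21:30)
cut.head <- c(0, 2, 1)
cut.tail <- c(3, 1, 2)

n <- nrow(mat)
for (j in seq_along(mat)) {
  # Row numbers to blank out: the first cut.head[j] and the last cut.tail[j]
  rows <- c(seq_len(cut.head[j]), n + 1L - seq_len(cut.tail[j]))
  # set() modifies mat by reference; skip columns with nothing to blank
  if (length(rows)) set(mat, i = as.integer(rows), j = j, value = NA)
}
mat
```

Because seq_len(0) is empty, a zero in cut.head or cut.tail simply contributes no rows, which is the same property the loops above rely on.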