How to replace the first n1 rows and last n2 rows of specific columns with NA
Given the following example:
library(data.table)
mat <- data.table(x = c(1:10), y = c(11:20), z = c(21:30))
cut.head <- c(0, 2, 1)
cut.tail <- c(3, 1, 2)
cut.head gives, for each column, the number of rows to set to NA counting from the top.
cut.tail gives, for each column, the number of rows to set to NA counting from the bottom.
For example, with cut.head as given, the 1st and 2nd rows of column y will be NA, as will the 1st row of column z.
I would like the result to look like this:
x y z
1: 1 NA NA
2: 2 NA 22
3: 3 13 23
4: 4 14 24
5: 5 15 25
6: 6 16 26
7: 7 17 27
8: NA 18 28
9: NA 19 NA
10: NA NA NA
Thanks.
I'd just use a for loop with := (or set()), so it's fast and (fairly) easy to read.
> for (i in 1:3) mat[seq_len(cut.head[i]), (i):=NA]
> mat
x y z
1: 1 NA NA
2: 2 NA 22
3: 3 13 23
4: 4 14 24
5: 5 15 25
6: 6 16 26
7: 7 17 27
8: 8 18 28
9: 9 19 29
10: 10 20 30
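For reference, here is a minimal sketch of the set() variant mentioned above, assuming the same mat and cut.head as in the question. The loop variable j is just an illustrative name. I use NA_integer_ because the columns are integer, and seq_len(0) yields an empty index, so column x is left untouched.

```r
library(data.table)

mat <- data.table(x = 1:10, y = 11:20, z = 21:30)
cut.head <- c(0, 2, 1)

# set(x, i, j, value) assigns by reference, like :=, but is called directly
# rather than through `[.data.table` on every iteration.
for (j in seq_along(cut.head)) {
  # rows 1..cut.head[j] of column j become NA; an empty i assigns nothing
  set(mat, i = seq_len(cut.head[j]), j = j, value = NA_integer_)
}
mat
```

This should leave mat in the same state as the := loop, with NA in rows 1 and 2 of y and row 1 of z.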
Notice that the LHS of := accepts column numbers as well as names. As an aside, this is valid:
DT[, 2:=2] # assign 2 to column 2
Wrapping the LHS of := in parentheses, as in (i):=NA, tells it to use the variable's value rather than its name.
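To make the difference concrete, here is a small sketch on a throwaway table. DT and col are hypothetical names chosen for illustration and are not from the question.

```r
library(data.table)

DT <- data.table(a = 1:3, b = 4:6)
col <- "b"  # hypothetical variable holding a column name

# Bare symbol on the LHS: treated as a name, so a new column
# literally called "col" is added.
DT[, col := 0L]

# Parenthesised LHS: the variable is evaluated, so its value "b"
# selects the existing column b, which is set to NA.
DT[, (col) := NA]

names(DT)
```

After both assignments the table has columns a, b and col, with b entirely NA.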
For the tail I first tried the following, but .N wasn't available in i. I added that as a feature request, FR#724.
UPDATE: .N in i was added in v1.9.3 on 11 Jul 2014.
for (i in 1:3) mat[.N+1-seq_len(cut.tail[i]), (i):=NA]
# .N now works in i
> mat
x y z
1: 1 NA NA
2: 2 NA 22
3: 3 13 23
4: 4 14 24
5: 5 15 25
6: 6 16 26
7: 7 17 27
8: NA 18 28
9: NA 19 NA
10: NA NA NA
>
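In case the index arithmetic isn't obvious, here is what the row expression evaluates to in plain R. The variable n is just a stand-in for .N (or nrow(mat)), and last3 and none are illustrative names.

```r
n <- 10L  # stand-in for .N, the number of rows

# Counting back from the bottom: the last three row numbers.
last3 <- n + 1L - seq_len(3)
last3  # 10 9 8

# A cut of zero gives an empty index, so nothing is assigned.
none <- n + 1L - seq_len(0)
length(none)  # 0
```

This is also why seq_len() is used rather than 1:k. With k = 0, 1:0 would give c(1, 0) and wrongly select a row.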
With .N available in i, we no longer have to live with repeating the symbol mat, as in the earlier workaround:
> for (i in 1:3) mat[nrow(mat)+1-seq_len(cut.tail[i]), (i):=NA]