Subsetting with $ when the column may not exist
I am writing a function made up of ifelse statements that depend on values in different columns of the data frame passed in as the function's input:
counter=function(df){
df$total2=ifelse(df$x>=100,df$total+10,df$total)
df$total3=ifelse(df$y>=200,df$total2+10,df$total2)
}
It seems like the way I'm doing it is quite inefficient, but I haven't thought of a way to avoid overwriting the calculations.
But more pressingly, some of the data frames I'd like to use this function on do not have both column x and column y. When I run it on these, the following error appears:
Error in `$<-.data.frame`(`*tmp*`, "total3", value = logical(0)) :
  replacement has 0 rows, data has 74
Is there a way to rewrite this so that it works with data frames that don't have all of the columns?
Thanks.
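For context, the error can be reproduced in a few lines. This is a minimal sketch on a made-up three-row data frame (the data and the variable names `missing_col`, `cmp`, `res`, and `msg` are assumptions for illustration): `$` returns NULL for a column that does not exist, the comparison and `ifelse` then yield a zero-length vector, and assigning that back into the data frame fails.

```r
# Toy data frame with no column named y (illustrative data only)
df <- data.frame(total = c(1, 2, 3))

missing_col <- df$y                           # NULL: the column does not exist
cmp <- missing_col >= 200                     # logical(0)
res <- ifelse(cmp, df$total + 10, df$total)   # still logical(0)

# Assigning a zero-length vector as a column triggers the error
msg <- tryCatch({
  df$total3 <- res
  "no error"
}, error = function(e) conditionMessage(e))

print(is.null(missing_col))
print(length(res))
print(msg)
```

So the failure comes from the assignment step rather than from `ifelse` itself, which is why checking for the column before using it avoids the problem.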
You can just use a standard if to check whether a column exists before using it:
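counter <- function(df) {
  # Default to the unadjusted total so later steps work even if x is missing
  df$total2 <- df$total
  if ("x" %in% names(df)) {
    df <- transform(df, total2 = ifelse(x >= 100, total + 10, total))
  }
  df$total3 <- df$total2
  if ("y" %in% names(df)) {
    df <- transform(df, total3 = ifelse(y >= 200, total2 + 10, total2))
  }
  df  # return the modified data frame
}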
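The same existence check can be generalized so that the intermediate total2 and total3 columns are not needed. The following is only a sketch under some assumptions: the helper name `add_bonus`, its `thresholds` and `bonus` arguments, and the output column `total_adjusted` are all hypothetical, and the cutoffs 100 and 200 are taken from the question.

```r
# Hypothetical helper: add `bonus` to `total` once for every threshold
# column that exists in `df` and meets its cutoff. Columns that are
# absent from `df` are simply skipped.
add_bonus <- function(df, thresholds = c(x = 100, y = 200), bonus = 10) {
  adjusted <- df$total
  for (col in intersect(names(thresholds), names(df))) {
    hit <- df[[col]] >= thresholds[[col]]
    adjusted <- adjusted + ifelse(hit, bonus, 0)
  }
  df$total_adjusted <- adjusted
  df
}

# Illustrative data: one frame with both columns, one with only y
both   <- data.frame(total = c(1, 2), x = c(50, 150), y = c(250, 250))
only_y <- data.frame(total = c(1, 2), y = c(100, 300))

print(add_bonus(both)$total_adjusted)    # 11 22
print(add_bonus(only_y)$total_adjusted)  # 1 12
```

Keeping the cutoffs in a named vector means a new column only needs a new entry there, not another if block.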
That said, it looks like your data might be in a "wide" format when it may be easier to work with in a "tall" format. You might want to look into reshaping your data.
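As a rough illustration of the tall layout, here is a base R sketch. The sample data, the `id` column, and the names `measure_cols`, `tall`, `cutoffs`, and `total_adjusted` are assumptions made for the example, not part of the question. Each measured column becomes a set of rows, so a column that is absent from the wide data just contributes no rows.

```r
# Illustrative wide data with an id column (assumed for this example)
wide <- data.frame(id = 1:3,
                   total = c(5, 5, 5),
                   x = c(50, 150, 120),
                   y = c(250, 100, 300))

# Only reshape the measured columns that are actually present
measure_cols <- intersect(c("x", "y"), names(wide))

# Build the tall version: one row per (id, variable) pair
tall <- data.frame(
  id = rep(wide$id, times = length(measure_cols)),
  variable = rep(measure_cols, each = nrow(wide)),
  value = unlist(wide[measure_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Compare every value with the cutoff for its variable in one step
cutoffs <- c(x = 100, y = 200)
tall$hit <- unname(tall$value >= cutoffs[tall$variable])

# Count the hits per id and add 10 for each one
bonus <- tapply(tall$hit, tall$id, sum) * 10
wide$total_adjusted <- wide$total + as.vector(bonus[as.character(wide$id)])

print(wide$total_adjusted)  # 15 15 25
```

Packages such as tidyr or data.table offer dedicated reshaping functions; the manual construction here is only meant to keep the sketch dependency-free.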