Finding non-numeric data in a data frame or vector
I have read in some lengthy data with read.csv(), and to my surprise the data is coming out as factors rather than numbers, so I'm guessing there must be at least one non-numeric item in the data. How can I find where these items are?
For example, if I have the following data frame:
df <- data.frame(c(1,2,3,4,"five",6,7,8,"nine",10))
I would like to know that rows 5 and 9 have non-numeric data. How would I do that?
The trick is knowing that converting to numeric via as.numeric(as.character(.)) will convert non-numbers to NA.
which(is.na(as.numeric(as.character(df[[1]]))))
## 5 9
(Just using as.numeric(df[[1]]) doesn't work: on a factor it drops the levels and returns the underlying integer codes.)
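To illustrate why the as.character() step matters, here is a minimal sketch using a small made-up factor `f` (the values are an assumption for illustration, not taken from the question's data). It compares the two conversions side by side.

```r
# Hypothetical example factor: two numeric-looking strings and one word.
f <- factor(c("10", "five", "20"))

# Direct conversion returns the internal level codes, not the values.
# Levels are sorted as "10", "20", "five", so the codes are 1, 3, 2.
codes <- as.numeric(f)

# Going through character first parses the labels themselves;
# anything that is not a number becomes NA (with a coercion warning,
# suppressed here).
values <- suppressWarnings(as.numeric(as.character(f)))

print(codes)
print(values)
```

Note that the direct conversion produces no NA at all, which is why it cannot be used to locate the bad entries.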
Optionally, you can suppress the warnings:
which.nonnum <- function(x) {
  which(is.na(suppressWarnings(as.numeric(as.character(x)))))
}
which.nonnum(df[[1]])
To be more careful, you should also check that the values weren't already NA before conversion:
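which.nonnum <- function(x) {
  # TRUE where conversion to numeric yields NA (including values that were already NA)
  badNum <- is.na(suppressWarnings(as.numeric(as.character(x))))
  # keep only the positions that were not NA to begin with
  which(badNum & !is.na(x))
}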
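For data with several columns, the same check can be applied column by column. The following is a rough sketch, assuming a made-up data frame `dat` with hypothetical columns `a` and `b`. It repeats the which.nonnum() definition so that it runs on its own. stringsAsFactors = TRUE is set explicitly so the columns are factors as in the question (since R 4.0.0 the default is FALSE, giving character columns, which the function handles equally well).

```r
# Same helper as above, repeated so this example is self-contained.
which.nonnum <- function(x) {
  badNum <- is.na(suppressWarnings(as.numeric(as.character(x))))
  which(badNum & !is.na(x))
}

# Hypothetical data: column a has one bad entry and one genuine NA,
# column b has one bad entry.
dat <- data.frame(
  a = c("1", "2", "five", NA),
  b = c("1.5", "x", "3", "4"),
  stringsAsFactors = TRUE
)

# Row positions of non-numeric entries, one list element per column.
bad <- lapply(dat, which.nonnum)

# The offending values themselves, which is often what you want to inspect.
bad_values <- Map(function(col, idx) as.character(col[idx]), dat, bad)

print(bad)
print(bad_values)
```

The genuine NA in row 4 of column `a` is not reported, because it was already missing before the conversion.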