Finding duplicated rows (based on 2 columns) in a data frame in R

Question:

I have a data frame in R which looks like:

| RIC    | Date                | Open   |
|--------|---------------------|--------|
| S1A.PA | 2011-06-30 20:00:00 | 23.7   |
| ABC.PA | 2011-07-03 20:00:00 | 24.31  |
| EFG.PA | 2011-07-04 20:00:00 | 24.495 |
| S1A.PA | 2011-07-05 20:00:00 | 24.23  |

I want to know whether there are any duplicates with respect to the combination of RIC and Date. Is there a function for that in R?

You can simply pass those first two columns to the function duplicated:

duplicated(dat[,1:2])

This assumes your data frame is called dat. For more information, consult the help file for the duplicated function by typing ?duplicated at the console. It includes the following sentence:


Determines which elements of a vector or data frame are duplicates of elements with smaller subscripts, and returns a logical vector indicating which elements (rows) are duplicates.
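To make the "smaller subscripts" wording concrete, here is a minimal sketch. The data frame is hypothetical: it uses the rows from the question plus one extra row (row 5) that I added so that one RIC/Date pair actually repeats. It also uses base R's anyDuplicated, which is one way to get a plain yes/no answer.

```r
# Hypothetical sample data: the question's rows plus one repeated RIC/Date pair
dat <- data.frame(
  RIC  = c("S1A.PA", "ABC.PA", "EFG.PA", "S1A.PA", "S1A.PA"),
  Date = c("2011-06-30 20:00:00", "2011-07-03 20:00:00",
           "2011-07-04 20:00:00", "2011-07-05 20:00:00",
           "2011-06-30 20:00:00"),
  Open = c(23.7, 24.31, 24.495, 24.23, 23.9),
  stringsAsFactors = FALSE
)

# Only the later occurrence is flagged; row 1 is the first occurrence,
# so it stays FALSE even though row 5 repeats its RIC/Date pair
dup <- duplicated(dat[, c("RIC", "Date")])
print(dup)

# anyDuplicated() returns the index of the first duplicate, or 0 if none
has_dup <- anyDuplicated(dat[, c("RIC", "Date")]) > 0
print(has_dup)
```

Selecting the columns by name rather than by position (1:2) is a choice made here so the check keeps working if the column order changes.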

So duplicated returns a logical vector, which we can then use to extract the subset of dat containing the duplicated rows:

ind <- duplicated(dat[,1:2])
dat[ind,]

Or you can skip the separate assignment step and simply use:

dat[duplicated(dat[,1:2]),]
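Note that this returns only the later copies, not the first occurrence of each repeated pair. If you want every row that belongs to a duplicated RIC/Date group, one possible approach is to combine a forward and a backward pass using the fromLast argument of duplicated. The sketch below uses hypothetical sample data (the question's rows plus one repeated pair that I added for illustration).

```r
# Hypothetical sample data: row 5 repeats the RIC/Date pair of row 1
dat <- data.frame(
  RIC  = c("S1A.PA", "ABC.PA", "EFG.PA", "S1A.PA", "S1A.PA"),
  Date = c("2011-06-30 20:00:00", "2011-07-03 20:00:00",
           "2011-07-04 20:00:00", "2011-07-05 20:00:00",
           "2011-06-30 20:00:00"),
  Open = c(23.7, 24.31, 24.495, 24.23, 23.9),
  stringsAsFactors = FALSE
)

key <- dat[, c("RIC", "Date")]

# Forward pass flags later copies; backward pass (fromLast = TRUE) flags
# earlier copies; OR-ing them marks every member of a duplicated group
in_dup_group <- duplicated(key) | duplicated(key, fromLast = TRUE)

all_copies <- dat[in_dup_group, ]
print(all_copies)
```

With this data, all_copies holds rows 1 and 5, which makes it easy to compare the Open values of the conflicting records side by side.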