R: How to extract a list from a data frame?
Consider this simple example:
> weird_df <- tibble(col1 = c('hello', 'world', 'again'),
+ col_weird = list(list(12,23), list(23,24), NA))
>
> weird_df
# A tibble: 3 x 2
col1 col_weird
<chr> <list>
1 hello <list [2]>
2 world <list [2]>
3 again <lgl [1]>
I need to extract the values in col_weird. How can I do that? I see how to do that in Python but not in R. The expected output is:
> good_df
# A tibble: 3 x 3
col1 tic toc
<chr> <dbl> <dbl>
1 hello 12 23
2 world 23 24
3 again NA NA
If you collapse the list column into a string, you can use separate from tidyr. I used map from purrr to loop through the list column and create a string with toString.
library(dplyr)  # provides mutate() and select()
library(tidyr)
library(purrr)
weird_df %>%
mutate(col_weird = map(col_weird, toString ) ) %>%
separate(col_weird, into = c("tic", "toc"), convert = TRUE)
# A tibble: 3 x 3
col1 tic toc
* <chr> <int> <int>
1 hello 12 23
2 world 23 24
3 again NA NA
You can actually use separate directly without the toString part, but you end up with "list" as one of the values, so that extra column has to be dropped afterwards.
weird_df %>%
separate(col_weird, into = c("list", "tic", "toc"), convert = TRUE) %>%
select(-list)
This led me to tidyr::extract, which works fine with the right regular expression. If your list column were more complicated, though, writing out the regular expression might be a pain.
weird_df %>%
extract(col_weird, into = c("tic", "toc"), regex = "([[:digit:]]+), ([[:digit:]]+)", convert = TRUE)
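The approaches above all go through a string representation and yield integer columns. If the regular expression becomes a burden, one possible alternative is to pull the elements out by position and keep them numeric. The following is only a minimal sketch, not part of the original answer: `get_num` is a hypothetical helper introduced here for illustration, and it assumes every non-missing entry is a list of numbers.

```r
library(dplyr)
library(purrr)
library(tibble)

weird_df <- tibble(col1 = c("hello", "world", "again"),
                   col_weird = list(list(12, 23), list(23, 24), NA))

# Hypothetical helper: return the i-th element of a list entry as a double,
# or NA when the entry is not a list or is too short.
get_num <- function(x, i) {
  if (is.list(x) && length(x) >= i) as.numeric(x[[i]]) else NA_real_
}

good_df <- weird_df %>%
  mutate(tic = map_dbl(col_weird, function(x) get_num(x, 1)),
         toc = map_dbl(col_weird, function(x) get_num(x, 2))) %>%
  select(-col_weird)

good_df
```

Because the values never pass through a string, tic and toc come out as <dbl>, as in the expected output of the question.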