R: Extracting columns from a list of data.frames

I am wondering how to manipulate a list containing data.frames stored in a tibble.

Specifically, I would like to extract two columns from the data.frames stored in a tibble list column.

I would like to go from this tibble c:

library(tibble)

random_data<-list(a=letters[1:10],b=LETTERS[1:10])
x<-as.data.frame(random_data, stringsAsFactors=FALSE)
y<-list()
y[[1]]<-x[1,,drop=FALSE]
y[[3]]<-x[2,,drop=FALSE]  # y[[2]] is left as NULL
c<-tibble(z=c(1,2,3),my_data=y)

to this tibble d:

d<-tibble(z=c(1,2,3),a=c('a',NA,'b'),b=c('A',NA,'B'))

thanks

Iain

4 solutions

#1

You could create a function f to change out the NULL values, then apply it to the my_data column and finish with unnest.

library(dplyr); library(tidyr)

unnest(mutate(c, my_data = lapply(my_data, f)))
# # A tibble: 3 x 3
#       z     a     b
#   <dbl> <chr> <chr>
# 1     1     a     A
# 2     2  <NA>  <NA>
# 3     3     b     B

Where f is a helper function to change out the NULL values, and is defined as

f <- function(x) {
    if(is.null(x)) data.frame(a = NA, b = NA) else x
}
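
For readers who want to run this answer end to end, here is a minimal self-contained sketch. It makes a few assumptions that are not in the original answer: the tibble is named cc rather than c so it does not mask base::c(), the helper returns character NAs so the column types line up, and unnest() is given an explicit cols argument, which tidyr 1.0.0 and later ask for.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Rebuild the question's data. The name cc is an assumption made here
# to avoid masking base::c().
x <- data.frame(a = letters[1:10], b = LETTERS[1:10], stringsAsFactors = FALSE)
y <- list()
y[[1]] <- x[1, , drop = FALSE]
y[[3]] <- x[2, , drop = FALSE]   # y[[2]] is left as NULL
cc <- tibble(z = c(1, 2, 3), my_data = y)

# Replace NULL entries with a one-row data.frame of missing values.
f <- function(x) {
  if (is.null(x)) data.frame(a = NA_character_, b = NA_character_) else x
}

res <- cc %>%
  mutate(my_data = lapply(my_data, f)) %>%
  unnest(cols = my_data)   # cols named explicitly for tidyr >= 1.0.0
```

Here res should hold the columns z, a and b, with missing values in the second row.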

#2

c2 is the final output.

library(tidyverse)

c2 <- c %>%
  filter(!map_lgl(my_data, is.null)) %>%
  unnest() %>%
  right_join(c, by = "z") %>%
  select(-my_data)
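
The pipeline above drops the NULL rows before unnesting and then brings them back through the join. As a possible shortcut, and assuming tidyr 1.0.0 or later is installed, the keep_empty argument of unnest() may give the same result in one step. This is a sketch rather than part of the original answer, and the tibble is again named cc (an assumption) instead of c.

```r
library(tibble)
library(tidyr)

# Rebuild the question's data under the assumed name cc.
x <- data.frame(a = letters[1:10], b = LETTERS[1:10], stringsAsFactors = FALSE)
y <- list()
y[[1]] <- x[1, , drop = FALSE]
y[[3]] <- x[2, , drop = FALSE]   # y[[2]] is left as NULL
cc <- tibble(z = c(1, 2, 3), my_data = y)

# keep_empty = TRUE keeps rows whose list element is NULL (or size 0)
# and fills the unnested columns with missing values.
res2 <- unnest(cc, cols = my_data, keep_empty = TRUE)
```

No filter or join is needed in this variant, since the row for z = 2 is never dropped.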

#3

I think this does the trick, with d being the requested tibble:

library(dplyr)

new.y <- lapply(y, function(x) if(is.null(x)) data.frame(a = NA, b = NA) else x)
d <- cbind(z = c(1, 2, 3), bind_rows(new.y)) %>% as_tibble()  # as_tibble() replaces the deprecated tbl_df()


# # A tibble: 3 x 3
#     z      a      b
#  <dbl> <fctr> <fctr>
# 1   1      a      A
# 2   2     NA     NA
# 3   3      b      B

#4

Do you know your column names ahead of time?

extract_column <- function( d, column_name ) {
  if( is.null(d) ) {
    NA_character_
  } else {
    as.character(d[[column_name]])
  }  
}


cc %>% 
  dplyr::mutate(
    a = purrr::map_chr(.$my_data, extract_column, column_name="a"),
    b = purrr::map_chr(.$my_data, extract_column, column_name="b")
  ) %>% 
  dplyr::select(-my_data)

(I renamed your c tibble to cc so it can't collide with c().)
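
If the column names are known ahead of time, the two map_chr() calls can be driven from a vector of names. The sketch below is one way that might look. The vector wanted and the result out are hypothetical names introduced here, and the approach assumes every non-NULL data.frame has exactly one row, since map_chr() expects a single string per element.

```r
library(tibble)
library(purrr)

# Rebuild the question's data under the name cc used in this answer.
x <- data.frame(a = letters[1:10], b = LETTERS[1:10], stringsAsFactors = FALSE)
y <- list()
y[[1]] <- x[1, , drop = FALSE]
y[[3]] <- x[2, , drop = FALSE]   # y[[2]] is left as NULL
cc <- tibble(z = c(1, 2, 3), my_data = y)

# Same helper as in the answer: NA for NULL entries, else the column as character.
extract_column <- function(d, column_name) {
  if (is.null(d)) NA_character_ else as.character(d[[column_name]])
}

wanted <- c("a", "b")   # hypothetical: the column names known in advance
out <- cc
for (nm in wanted) {
  out[[nm]] <- map_chr(cc$my_data, function(d) extract_column(d, nm))
}
out$my_data <- NULL     # drop the list column once the values are extracted
```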
