
Time: 2022-11-08 16:57:34

Is it possible to store multiple data frames in one data structure and then process each data frame later? For example:


df1 <- data.frame(c(1,2,3), c(4,5,6))
df2 <- data.frame(c(11,22,33), c(44,55,66))

Then I would like to add them to a data structure, so that I can loop through it, retrieve one data frame at a time, and process it. Something like:

 for ( iterate through the data structure) # this gives df1, then df2
    write data frame to a file

I cannot find any such data structure in R. Can anyone point me to any code that illustrates the same functionality?


3 answers



Just put the data.frames in a list. A plus is that a list works really well with apply-style loops. For example, if you want to save the data.frames, you can use mapply:


l = list(df1, df2)
mapply(write.table, x = l, file = c("df1.txt", "df2.txt"))
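The loop sketched in the question can also be written as a plain for loop over the list. This is a minimal sketch, assuming a named list so that the names can double as file names. The names `dfs` and `out_dir`, the added column names, and the use of a temporary directory are illustrative assumptions, not part of the answer above.

```r
# Same values as in the question, with column names added for readability
df1 <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
df2 <- data.frame(a = c(11, 22, 33), b = c(44, 55, 66))

# A named list: the names are reused as file names below
dfs <- list(df1 = df1, df2 = df2)

# Write into a temporary directory so the sketch leaves no files behind
out_dir <- tempdir()

for (name in names(dfs)) {
  # dfs[[name]] is df1 on the first pass, df2 on the second
  path <- file.path(out_dir, paste0(name, ".txt"))
  write.table(dfs[[name]], file = path)
}
```

Looping over `names(dfs)` rather than over the list itself keeps the name available for building the file path.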

If you like apply-style loops (and you will, trust me :)), take a look at the epic plyr package. It might not be the fastest package (look at data.table for speed), but it drips with syntactic sugar.



Lists can be used to hold almost anything, including data.frames:


## Versatility of lists
l <- list(file(), new.env(), data.frame(a=1:4))

For writing out multiple data objects stored in a list, lapply() is your friend:


ll <- list(df1=df1, df2=df2)
## Write out as *.csv files
lapply(names(ll), function(X) write.csv(ll[[X]], file=paste0(X, ".csv")))
## Save in *.Rdata files
lapply(names(ll), function(X) {
    assign(X, ll[[X]])
    save(list=X, file=paste0(X, ".Rdata"))
})
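To check the round trip, the files can be read back into a list in the same style. This is a sketch under a few assumptions: the CSV files go to a temporary directory, the columns are given names, and `out_dir`, `paths`, and `restored` are hypothetical names introduced here.

```r
df1 <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
df2 <- data.frame(a = c(11, 22, 33), b = c(44, 55, 66))
ll <- list(df1 = df1, df2 = df2)

# One file path per list element, inside a temporary directory
out_dir <- tempdir()
paths <- file.path(out_dir, paste0(names(ll), ".csv"))

# Map() walks the list and the paths in parallel; invisible() hides the NULLs
invisible(Map(function(df, path) write.csv(df, file = path, row.names = FALSE),
              ll, paths))

# Read each file back; setNames() restores the original list names
restored <- setNames(lapply(paths, read.csv), names(ll))
```

`restored` is again a named list of data frames, so it can be processed with the same lapply() calls as `ll`.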



What you are looking for is a list. You can use a function like lapply to treat each of your data frames separately in the same manner. However, there might be cases where you need to pass your list of data frames to a function that handles the data frames in relation to each other. In this case lapply doesn't help you.
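When the data frames have to be handled together rather than one by one, the whole list can be handed to a single call. The following is a minimal sketch, assuming all data frames share the same column names (which is not the case for the unnamed columns in the question). `do.call()` and `Reduce()` are base R; the example data and the names `stacked` and `merged` are made up for illustration.

```r
df1 <- data.frame(id = 1:3, x = c(4, 5, 6))
df2 <- data.frame(id = 1:3, x = c(44, 55, 66))
mylist <- list(df1, df2)

# Stack all data frames row-wise: equivalent to rbind(df1, df2)
stacked <- do.call(rbind, mylist)

# Fold a two-argument function over the list: merge on the shared "id" column
merged <- Reduce(function(a, b) merge(a, b, by = "id", suffixes = c(".1", ".2")),
                 mylist)
```

Both calls scale to lists of any length, which is the main reason to keep the data frames in a list instead of in separate variables.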


That's why it is important to note how you can access and iterate the data frames in your list. It's done like this:


mylist[[data frame]][row,column]

Note the double brackets around your data frame index. So for your example it would be


df1 <- data.frame(c(1,2,3), c(4,5,6))
df2 <- data.frame(c(11,22,33), c(44,55,66))
mylist <- list(df1, df2)

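mylist[[1]][1,2] returns 4, whereas mylist[1][1,2] does not work: mylist[1] is still a list (containing df1), not the data frame itself, so indexing it by row and column fails with an "incorrect number of dimensions" error. It took me a while to figure this out, so I thought it might be helpful to post here.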
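The difference between the two bracket forms is easiest to see by checking what each one returns. A small sketch using the question's data; `first_cell` and `row_counts` are names introduced here for illustration.

```r
df1 <- data.frame(c(1, 2, 3), c(4, 5, 6))
df2 <- data.frame(c(11, 22, 33), c(44, 55, 66))
mylist <- list(df1, df2)

class(mylist[1])     # "list": a one-element list that still wraps df1
class(mylist[[1]])   # "data.frame": df1 itself

# Row 1, column 2 of df1
first_cell <- mylist[[1]][1, 2]

# Iterating by index with double brackets yields each data frame in turn
row_counts <- integer(length(mylist))
for (i in seq_along(mylist)) {
  row_counts[i] <- nrow(mylist[[i]])
}
```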




