How can I convert a list to a matrix more efficiently in R?

Date: 2022-09-12 11:44:21

I have a list of length 130,000 where each element is a character vector of length 110. I would like to convert this list to a matrix with dimensions 1,430,000 x 10. How can I do this more efficiently? My code is:

output=NULL
for(i in 1:length(z)) output=rbind(output,matrix(z[[i]],ncol=10,byrow=T))

4 answers

#1


100  

This should be equivalent to your current code, only a lot faster:

output <- matrix(unlist(z), ncol = 10, byrow = TRUE)
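
A quick way to convince yourself of the equivalence is to run both versions on a small stand-in list. This is only a sketch: the toy z below has 3 character vectors instead of 130,000, and the names slow and fast are just for illustration.

```r
# Toy stand-in for the real data: 3 character vectors of length 110
z <- replicate(3, as.character(1:110), simplify = FALSE)

# Original approach: grow the result one 11 x 10 block at a time
slow <- NULL
for (i in seq_along(z)) {
  slow <- rbind(slow, matrix(z[[i]], ncol = 10, byrow = TRUE))
}

# unlist approach: flatten once, then fill the matrix row by row
fast <- matrix(unlist(z), ncol = 10, byrow = TRUE)

identical(slow, fast)  # TRUE
dim(fast)              # 33 10
```

It works because unlist concatenates the elements in order and byrow = TRUE consumes 10 values per row, so each length-110 vector fills 11 consecutive rows. That relies on every element having a length that is a multiple of 10.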

#2


11  

I think you want

output <- do.call(rbind,lapply(z,matrix,ncol=10,byrow=TRUE))

i.e. combining @BlueMagister's use of do.call(rbind,...) with an lapply statement to convert the individual list elements into 11*10 matrices ...

Benchmarks (showing @flodel's unlist solution is 5x faster than mine, and 230x faster than the original approach ...)

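n <- 1000
z <- replicate(n,matrix(1:110,ncol=10,byrow=TRUE),simplify=FALSE)
library(rbenchmark)
origfn <- function(z) {
    output <- NULL
    for(i in seq_along(z))
        output <- rbind(output,matrix(z[[i]],ncol=10,byrow=TRUE))
    output  # return the result; a bare for loop evaluates to NULL
}
rbindfn <- function(z) do.call(rbind,lapply(z,matrix,ncol=10,byrow=TRUE))
unlistfn <- function(z) matrix(unlist(z), ncol = 10, byrow = TRUE)
# assumed call, reconstructed from the test names and 100 replications in the output below
benchmark(origfn(z), rbindfn(z), unlistfn(z))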

##          test replications elapsed relative user.self sys.self 
## 1   origfn(z)          100  36.467  230.804    34.834    1.540  
## 2  rbindfn(z)          100   0.713    4.513     0.708    0.012 
## 3 unlistfn(z)          100   0.158    1.000     0.144    0.008 

If this scales linearly (i.e. you don't run into memory problems), the full problem is 130 times larger, so 100 replications would take about 130*0.2 seconds = 26 seconds, or roughly 0.26 seconds per run, on a comparable machine (I did this on a 2-year-old MacBook Pro).
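
One way to check how the timing scales on your own machine is to time the unlist approach at a few list sizes. This is only a sketch: the sizes, the synthetic character data and the name elapsed are assumptions, and the numbers will differ from machine to machine.

```r
# List sizes to try (assumed values; raise them if memory allows)
sizes <- c(1000, 5000, 25000)

elapsed <- sapply(sizes, function(n) {
  # n character vectors of length 110, like the data in the question
  z <- replicate(n, as.character(1:110), simplify = FALSE)
  # time a single conversion and keep the wall-clock seconds
  system.time(matrix(unlist(z), ncol = 10, byrow = TRUE))[["elapsed"]]
})

data.frame(n = sizes, elapsed = elapsed)
```

If elapsed grows roughly in proportion to n, extrapolating to 130,000 elements is reasonable. Very small timings are noisy, so repeat the run a few times before trusting them.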

#3


5  

It would help to see a sample of your expected output. Repeatedly calling rbind on an ever-growing object is not recommended, because each call copies everything accumulated so far. My first guess at something that would help you:

z <- list(1:3,4:6,7:9)
do.call(rbind,z)
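
For reference, here is what that call produces, and why data shaped like the question's would still need a reshaping step. A minimal sketch: z110 is a made-up stand-in for the real list.

```r
z <- list(1:3, 4:6, 7:9)
m <- do.call(rbind, z)  # same as rbind(1:3, 4:6, 7:9)
m
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9

# With length-110 elements, each element becomes one row of 110 columns,
# not eleven rows of 10 columns
z110 <- replicate(3, as.character(1:110), simplify = FALSE)
dim(do.call(rbind, z110))  # 3 110
```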

See a related question for more efficiency, if needed.

#4


-2  

You can use as.matrix as below:

output <- as.matrix(z)
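
It is worth checking what this call returns when z is a list. The sketch below uses a toy list of 3 character vectors as an assumed stand-in for the real data.

```r
z <- replicate(3, as.character(1:110), simplify = FALSE)
m <- as.matrix(z)

dim(m)           # 3 1
typeof(m)        # "list"
length(m[[1]])   # 110
```

The result is a one-column matrix of mode list, with each cell still holding a whole length-110 vector, so it is not the 10-column character matrix the question asks for.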
