Adding a column to an empty data frame in R

Date: 2022-07-08 19:35:30

I have searched extensively but not found an answer to this question on Stack Overflow.

Let's say I have a data frame a.

I define:

a <- NULL
a <- as.data.frame(a)

If I then try to add a column to this data frame like so:

a$col1 <- c(1,2,3)

I get the following error:

Error in `$<-.data.frame`(`*tmp*`, "col1", value = c(1, 2, 3)) :
    replacement has 3 rows, data has 0

Why is the row dimension fixed but the column dimension is not?

How do I change the number of rows in a data frame?

If I instead put the data into a list first and then convert it to a data frame, it works fine:

a <- NULL
a$col1 <- c(1,2,3)
a <- as.data.frame(a)

2 answers

#1


7  

The row dimension is not fixed, but a data.frame is stored as a list of vectors that are constrained to have the same length. You cannot add col1 to a because col1 has three values (rows) and a has zero, which breaks the constraint. By default, R does not auto-vivify values (that is, pad the existing columns) when you try to extend a data.frame by adding a column that is longer than the data.frame. The second example works because col1 is the only vector in the list when it is converted, so the data.frame is initialized with three rows.
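
To make the constraint concrete, here is a minimal sketch. The variable names b, d, and res are illustrative and not from the question. It shows that an empty data.frame is a zero-length list of columns, that assigning a longer column fails, and two ways that should avoid the error: creating the data.frame together with its first column, or giving it the rows up front.

```r
# A data.frame is a list whose elements (columns) must all have the same length.
a <- data.frame()
is.list(a)   # TRUE
dim(a)       # 0 rows, 0 columns

# Assigning a length-3 column to a 0-row data.frame violates the constraint.
# tryCatch captures the error message instead of stopping the script.
res <- tryCatch({ a$col1 <- c(1, 2, 3); "ok" },
                error = function(e) conditionMessage(e))

# Option 1: create the data.frame with its first column, then add more columns.
b <- data.frame(col1 = c(1, 2, 3))
b$col2 <- c("x", "y", "z")   # same length as col1, so this succeeds

# Option 2: give the data.frame its rows first; it may have rows but no columns.
d <- data.frame(row.names = 1:3)
d$col1 <- c(1, 2, 3)
```

Option 2 is only useful when the number of rows is known before the first column arrives; otherwise Option 1 or the list-first approach from the question is simpler.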

If you want the data.frame to expand automatically, you can use the following function:

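# Column-bind objects with differing row counts, padding the shorter ones with NA.
cbind.all <- function(...) {
    nm <- lapply(list(...), as.matrix)        # coerce every argument to a matrix
    n <- max(vapply(nm, nrow, integer(1)))    # row count of the tallest argument
    # append (n - nrow(x)) rows of NA to each matrix, then bind the columns
    do.call(cbind, lapply(nm, function(x)
        rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}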

This fills the missing values with NA. You would call it like this: cbind.all(df, a). Note that it returns a matrix rather than a data.frame, because every argument is passed through as.matrix.
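
As a rough usage sketch, the helper can be exercised on an empty data.frame plus two columns of different lengths. The data (df, short) and the result names (m, out) are hypothetical, and the function is repeated here only so that the example runs on its own.

```r
# Same helper as in the answer, repeated so this example is self-contained.
cbind.all <- function(...) {
    nm <- lapply(list(...), as.matrix)
    n <- max(vapply(nm, nrow, integer(1)))
    do.call(cbind, lapply(nm, function(x)
        rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}

a <- data.frame()                       # empty: 0 rows, 0 columns
df <- data.frame(col1 = c(1, 2, 3))     # hypothetical three-row column
short <- data.frame(col2 = c(10, 20))   # hypothetical two-row column

m <- cbind.all(a, df, short)   # matrix with 3 rows; the short column is padded with NA
out <- as.data.frame(m)        # convert back if a data.frame is needed
```

Because the arguments are coerced with as.matrix, mixing numeric and character columns would turn everything into character, so this approach fits best when all columns share one type.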

#2


1  

You could also do something like the following. Here I read data from multiple files, grab the column I want from each, and store it in the data frame. I check whether the data frame has any columns yet; if it doesn't, I create a new one from the first file instead of triggering the error about a mismatched number of rows:

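readCounts <- data.frame()

# `files` is assumed to be a named character vector of file paths
for (f in names(files)) {
    d <- read.table(files[f], header = TRUE, as.is = TRUE)
    d2 <- round(data.frame(d$NumReads))   # keep only the column of interest
    colnames(d2) <- f                     # name the column after the file
    if (ncol(readCounts) == 0) {
        # first file: start the data frame here rather than cbind onto zero rows
        readCounts <- d2
        rownames(readCounts) <- d$Name
    } else {
        readCounts <- cbind(readCounts, d2)
    }
}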
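
A variation on the same idea, sketched under assumptions: instead of growing the data frame inside the loop, collect one column per file in a list and convert once at the end, which mirrors the list-first approach that worked in the question. The file names (sampleA, sampleB) and their contents are made up so the example can run end to end; only the Name and NumReads columns come from the answer above.

```r
# Write two small hypothetical input files so the example is runnable.
paths <- c(sampleA = tempfile(fileext = ".txt"),
           sampleB = tempfile(fileext = ".txt"))
write.table(data.frame(Name = c("g1", "g2", "g3"), NumReads = c(1.2, 5.7, 0.4)),
            paths[["sampleA"]], row.names = FALSE, quote = FALSE)
write.table(data.frame(Name = c("g1", "g2", "g3"), NumReads = c(2.6, 3.1, 9.9)),
            paths[["sampleB"]], row.names = FALSE, quote = FALSE)

# Read one rounded column per file into a named list ...
cols <- lapply(paths, function(p) {
    d <- read.table(p, header = TRUE, as.is = TRUE)
    setNames(round(d$NumReads), d$Name)   # names are picked up as row names later
})

# ... then build the data frame in a single step; all columns have equal length.
readCounts <- as.data.frame(cols)
```

This avoids the empty-data-frame special case entirely, at the cost of assuming every file lists the same rows in the same order.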
