Mean across columns in R, excluding NAs

Time: 2022-12-13 22:58:19

I can't imagine I'm the first person with this question, but I haven't found a solution yet (here or elsewhere).

I have a few columns, which I want to average in R. The only minimally tricky aspect is that some columns contain NAs.

For example:

Trait Col1 Col2 Col3
DF    23   NA   23
DG    2    2    2
DH    NA   9    9

I want to create a Col4 that averages the entries in the first 3 columns, ignoring the NAs. So:

 Trait Col1 Col2 Col3 Col4
 DF    23   NA   23   23
 DG    2    2    2    2
 DH    NA   9    9    9 

Ideally something like this would work:

data$Col4 <- mean(data$Col1, data$Col2, data$Col3, na.rm=TRUE)

but it doesn't.

1 Solution

#1


You want rowMeans(), and the important detail is its na.rm argument: set it to TRUE so that NAs are dropped before each row is averaged. E.g.:

> mat <- matrix(c(23,2,NA,NA,2,9,23,2,9), ncol = 3)
> mat
     [,1] [,2] [,3]
[1,]   23   NA   23
[2,]    2    2    2
[3,]   NA    9    9
> rowMeans(mat)
[1] NA  2 NA
> rowMeans(mat, na.rm = TRUE)
[1] 23  2  9
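
One edge case the example above does not hit: when every entry in a row is NA, rowMeans() with na.rm = TRUE has nothing left to average and returns NaN for that row. A minimal sketch, assuming you would rather see NA there (the matrix `m` and the name `res` are made up for illustration):

```r
# Hypothetical 2x2 matrix: row 1 is (1, 3), row 2 is (NA, NA)
m <- matrix(c(1, NA, 3, NA), ncol = 2)

# Row 2 has no non-missing values, so its mean comes back as NaN
res <- rowMeans(m, na.rm = TRUE)

# Optional: turn those NaN results back into NA
res[is.nan(res)] <- NA
```

Whether NaN or NA is the better marker for an all-missing row depends on what you do with the column afterwards.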

To match your example:

> dat <- data.frame(Trait = c("DF","DG","DH"), mat)
> names(dat) <- c("Trait", paste0("Col", 1:3))
> dat
  Trait Col1 Col2 Col3
1    DF   23   NA   23
2    DG    2    2    2
3    DH   NA    9    9
> dat <- transform(dat, Col4 = rowMeans(dat[,-1], na.rm = TRUE))
> dat
  Trait Col1 Col2 Col3 Col4
1    DF   23   NA   23   23
2    DG    2    2    2    2
3    DH   NA    9    9    9
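
As a variation, dat[,-1] works here only because Trait happens to be the first and only non-numeric column. A sketch that picks the columns by name instead, assuming the Col1 to Col3 names from the question (the vector `cols` is introduced just for this example):

```r
# Rebuild the example data so this snippet runs on its own
dat <- data.frame(Trait = c("DF", "DG", "DH"),
                  Col1 = c(23, 2, NA),
                  Col2 = c(NA, 2, 9),
                  Col3 = c(23, 2, 9))

# Name the columns to average explicitly instead of dropping column 1
cols <- c("Col1", "Col2", "Col3")

# Row-wise mean over just those columns, ignoring NAs
dat$Col4 <- rowMeans(dat[, cols], na.rm = TRUE)
```

This keeps working if more non-numeric columns are added later. It also sidesteps the problem in the question's attempt: mean() averages only its first argument, so the extra vectors passed to it are not pooled into the calculation.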
