How do I replace some values of a data frame with values from another data frame in R?

Time: 2022-09-13 16:03:02

I have two large data frames with the same columns and rows, and I need to replace the NA values in the first with the corresponding values from the second. For example, assume data frame "DF1" is

DF1 <- data.frame(a=c(1,NA,3), b=c(4,NA,6))

and "DF2" is


DF2 <- data.frame(a=c(NA,2,NA), b=c(3,5,6))

Wherever "DF1" has an NA, I want to take the value from "DF2" instead and store the result in a new data frame "DF3", i.e.

a   b
1   4
2   5
3   6

Could you help me with this please?


2 Answers

#1


3  

This should do the trick:


DF3 <- DF1
replace.bool.matrix <- is.na(DF1)
DF3[replace.bool.matrix] <- DF2[replace.bool.matrix]

Explanation:


We create DF3 as a copy of DF1. Then we build a logical matrix, replace.bool.matrix, that is TRUE wherever DF1 is NA. Indexing DF3 with it selects the cells to replace, and indexing DF2 with it selects the values to replace them with.

This relies on indexing a data frame with a logical matrix, a standard subsetting operation that is covered in many R tutorials.
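
As a quick check of the approach above, here is a minimal sketch that runs it on the question's data (with the second data frame named `DF2`) and wraps it in a small helper. The helper name `fill_na_from` is hypothetical, and it assumes both data frames have identical dimensions and column names.

```r
# Example data from the question
DF1 <- data.frame(a = c(1, NA, 3), b = c(4, NA, 6))
DF2 <- data.frame(a = c(NA, 2, NA), b = c(3, 5, 6))

# Hypothetical helper: fill the NA cells of `target` with the cells of
# `source` at the same positions.
# Assumption: both data frames have the same dimensions and column names.
fill_na_from <- function(target, source) {
  stopifnot(identical(dim(target), dim(source)),
            identical(names(target), names(source)))
  na_cells <- is.na(target)             # logical matrix, TRUE where target is NA
  target[na_cells] <- source[na_cells]  # same positions taken from source
  target
}

DF3 <- fill_na_from(DF1, DF2)
print(DF3)
```

If a cell is NA in both data frames, it stays NA in the result, so it may be worth checking `anyNA(DF3)` afterwards.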

#2


0  

This is much easier with the match() function:


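df1 <- data.frame(steps=c(NA,NA,NA,NA,NA,NA,NA,NA),date=c('2012-10-01','2012-10-01','2012-10-01','2012-10-01','2012-10-01','2012-10-01','2012-10-02','2012-10-02'), interval=c(0,5,10,15,20,25,0,5))

df2 <- data.frame(Interval=c(0,5,10,15,20,25),x=c(1.716,0.339,0.132,0.151,0.075,2.094))
# if() needs a single TRUE/FALSE, so index the NA rows instead of testing the whole vector
na.rows <- is.na(df1$steps)
# look up each missing row's interval in df2 and take the matching x value
df1$steps[na.rows] <- df2$x[match(df1$interval[na.rows], df2$Interval)]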
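
To see what the lookup does, here is a minimal sketch on a smaller, made-up data set. The values and the names `obs`, `lookup`, `idx` and `is_missing` are illustrative and not taken from the answer. It fills only the rows where `steps` is NA and leaves existing values alone.

```r
# Illustrative data: `obs` has some missing steps,
# `lookup` maps each interval to a replacement value
obs <- data.frame(steps = c(NA, 7, NA, NA),
                  interval = c(0, 5, 10, 0))
lookup <- data.frame(Interval = c(0, 5, 10),
                     x = c(1.716, 0.339, 0.132))

# match() gives, for each obs$interval, its row position in lookup$Interval
idx <- match(obs$interval, lookup$Interval)

# Replace only the rows where steps is NA; the existing value 7 is kept
is_missing <- is.na(obs$steps)
obs$steps[is_missing] <- lookup$x[idx[is_missing]]
print(obs)
```

Unlike the logical-matrix approach, this does not require the two data frames to have the same shape, only a shared key column. An interval with no match in `lookup` would yield NA from `match()`, so that row would remain NA.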
