Dynamically replace a column's values based on the available values in another column

Time: 2022-09-16 00:41:57

Suppose I have this data frame

set.seed(2)
df <- data.frame(c1 = sample(c(0:3, NA), 50, replace = TRUE), c2 = sample(c(0:3, NA), 50, replace = TRUE),
                 c3 = sample(c(0:3, NA), 50, replace = TRUE), c4 = sample(c(0:3, NA), 50, replace = TRUE))

head(df)
  c1 c2 c3 c4
1  0  0  1  0
2  3  0  2  1
3  2  3 NA NA
4  0 NA NA  1
5 NA  1  1  3
6 NA NA  2  1

When c4 is 0, I'd like to replace it with the next available non-NA value in c3. If c3 is NA, then c2 and so on.

I'm trying to learn how to do it, so don't just throw in the answer! If it's alright, suggest possible solutions. Thanks in advance.

Edit:

Expected output:

head(df)
  c1 c2 c3 c4
1  0  0  1  1 # This would be the only difference with the head output from above
2  3  0  2  1
3  2  3 NA NA
4  0 NA NA  1
5 NA  1  1  3
6 NA NA  2  1

1 Answer

This is how you can do it without looping through each row:

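c4 <- ncol(df)  # index of the last column
# for each row, the rightmost column before c4 that is neither NA nor 0
inds <- max.col(!is.na(df[, -c4]) & df[, -c4] != 0, "last")
# rows where c4 is 0; which() already drops NA, so "== T" is not needed
zeroinds <- which(df[, c4] == 0)
# matrix indexing: pick one (row, column) cell for each zero row
df[zeroinds, c4] <- df[cbind(zeroinds, inds[zeroinds])]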

head(df, 10)

#    c1 c2 c3 c4
# 1   0  0  1  1
# 2   3  0  2  1
# 3   2  3 NA NA
# 4   0 NA NA  1
# 5  NA  1  1  3
# 6  NA NA  2  1
# 7   0  3 NA NA
# 8  NA NA  2  2
# 9   2  3  0  3
# 10  2  3  0  1
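
Since you want to learn how it works, it may help to run the same steps on a tiny hand-made data frame where you can check every value by eye. The data frame toy below is a made-up example, not part of the question; it avoids the random data, whose values for a given seed can differ between R versions. This is only a sketch of the idea, under the assumption that zeros should be skipped as well as NAs.

```r
# Hypothetical toy data, written by hand so the result does not depend on the RNG
toy <- data.frame(c1 = c(0, 3, 2, NA),
                  c2 = c(0, 0, 3, NA),
                  c3 = c(1, 2, NA, 0),
                  c4 = c(0, 0, 0, 0))

last <- ncol(toy)

# TRUE where a cell could serve as a replacement (not NA and not zero)
usable <- !is.na(toy[, -last]) & toy[, -last] != 0

# Rightmost TRUE per row. Row 4 has no TRUE at all, so every column ties
# and "last" simply returns column 3.
inds <- max.col(usable, "last")
inds
# 3 3 2 3

# Replace the zeros in the last column with the chosen cell of each row
zero_rows <- which(toy[, last] == 0)
toy[zero_rows, last] <- toy[cbind(zero_rows, inds[zero_rows])]
toy$c4
# 1 2 3 0
```

Row 4 is the case to watch: when nothing to the left is usable, the code still copies whatever sits in c3, so c4 stays 0 here and would become NA if c3 were NA.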

Here is how it works:

  1. c4 is set to the index of the last column.
  2. For each row, max.col(..., "last") finds the last (rightmost) column before c4 whose value is neither NA nor zero, i.e. the one closest to c4.
  3. zeroinds holds the rows where c4 is zero.
  4. The zeros at zeroinds are replaced with that rightmost non-NA, non-zero value from the same row.
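
For comparison, here is roughly what the same rule looks like when written as an explicit loop, which may be easier to follow while learning. The function name fill_from_left and the data frame d are made up for this sketch; it is slower than the max.col version on large data and is only meant to show the logic.

```r
# Hypothetical helper: for every zero in the target column, walk leftwards
# until a value that is neither NA nor zero turns up
fill_from_left <- function(d, target = ncol(d)) {
  for (i in which(d[[target]] == 0)) {        # which() skips NA rows
    for (j in rev(seq_len(target - 1))) {     # c3 first, then c2, then c1
      v <- d[i, j]
      if (!is.na(v) && v != 0) {
        d[i, target] <- v
        break                                 # stop at the nearest usable value
      }
    }
  }
  d
}

# Made-up example data
d <- data.frame(c1 = c(0, 3, 2, NA),
                c2 = c(0, 0, 3, NA),
                c3 = c(1, 2, NA, NA),
                c4 = c(0, 1, 0, 0))

res <- fill_from_left(d)
res$c4
# 1 1 3 0
```

One difference from the vectorized code: a row with nothing usable to the left (row 4 here) is left untouched by the loop, whereas the max.col version would copy the NA from c3 into c4. Which behaviour you want depends on your data.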
