How to create a new variable with values from different variables if another variable equals a set value in R?

Time: 2022-08-22 13:50:47

I have a complicated question that I will try to simplify by using a reduced version of my dataset. Say I have 5 variables:

df$Id <- 1:12
df$date <- c(NA, NA, 'a', 'a', 'b', NA, NA, 'b', 'c', 'c', 'b', 'a')
df$va <- c(1.1, 1.4, 2.5, ...)     # 12 random values
df$vb <- c(5.9, 2.3, 4.7, ...)     # 12 other random values
df$vc <- c(3.0, 3.3, 3.7, ...)     # 12 more random values

Then I want to create a new variable that takes its value from va, vb, or vc when the date is equal to a, b, or c, respectively. I tried a nested if-else, which did not work. I also tried:

df$new[df$date=='a' & !is.na(df$date)] <- df$va
df$new[df$date=='b' & !is.na(df$date)] <- df$vb
df$new[df$date=='c' & !is.na(df$date)] <- df$vc

This correctly left NAs in the new variable where the date is NA; however, the values assigned were not from va, vb, or vc, but some other values altogether. How can I get df$new to equal va if the date is 'a', vb if the date is 'b', and vc if the date is 'c'?

1 solution

#1

You want the ifelse function, which is a vectorized conditional:

> x <- c(1, 1, 0, 0, 1)
> y <- c(1, 2, 3, 4, 5)
> z <- c(6, 7, 8, 9, 10)
> ifelse(x == 1, y, z)
[1] 1 2 8 9 5

You will have to nest calls to this function, like this:

> x_1 <- c(1, 1, 0, 0, 1)
> x_2 <- c(1, 1, 1, 0, 1)
> y_1 <- c(1, 2, 3, 4, 5)
> y_2 <- c(6, 7, 8, 9, 10)
> z <- c(0, 0, 0, 0, 0)
> ifelse(x_1 == 1, y_1,
+   ifelse(x_2 == 1, y_2, z)
+ )
[1] 1 2 8 0 5
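
Applied to the data frame in the question, the nested call might look like the sketch below. The six rows and their values are made up for illustration; only the column names follow the question. Rows where date is NA come out as NA, because the comparison itself evaluates to NA.

```r
# Hypothetical data: the values are invented, only the column names
# (date, va, vb, vc) follow the question.
df <- data.frame(
  Id   = 1:6,
  date = c(NA, "a", "b", "c", NA, "a"),
  va   = c(1.1, 1.4, 2.5, 3.6, 4.7, 5.8),
  vb   = c(5.9, 2.3, 4.7, 6.1, 7.2, 8.3),
  vc   = c(3.0, 3.3, 3.7, 4.0, 4.4, 4.8),
  stringsAsFactors = FALSE
)

# Each ifelse() picks, row by row, from the column that matches the date;
# anything that is not 'a', 'b', or 'c' falls through to NA.
df$new <- ifelse(df$date == "a", df$va,
          ifelse(df$date == "b", df$vb,
          ifelse(df$date == "c", df$vc, NA)))

print(df$new)
```

Every branch is evaluated for the full vector, so this reads well for three cases but becomes unwieldy as the number of codes grows.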

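Your second attempt would succeed if you indexed the right-hand side with the same condition, so that the values come from the matching rows. As written, the right-hand side is longer than the selection, and R fills the selected positions with the first few values of df$va instead: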

df$new[df$date=='a' & !is.na(df$date)] <- df$va[df$date=='a' & !is.na(df$date)]
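
To see why the original assignment produced unexpected values, here is a small sketch with made-up vectors (the names wrong and right are illustrative only). When the right-hand side is longer than the number of selected positions, R takes values from the start of the vector rather than from the matching positions.

```r
# Hypothetical four-element example; the values are invented.
va   <- c(10, 20, 30, 40)
date <- c(NA, "a", "b", "a")
sel  <- date == "a" & !is.na(date)   # FALSE TRUE FALSE TRUE

# Unindexed right-hand side: two slots are selected but four values are
# supplied, so R uses the first two (10, 20). R normally warns about the
# length mismatch; the warning is suppressed here only to keep output clean.
wrong <- rep(NA_real_, 4)
suppressWarnings(wrong[sel] <- va)

# Indexed right-hand side: values come from the same rows that are selected.
right <- rep(NA_real_, 4)
right[sel] <- va[sel]

print(wrong)  # NA 10 NA 20
print(right)  # NA 20 NA 40
```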

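To avoid the new variable becoming a list rather than a numeric variable, use %in% in place of ==. Note that %in% returns FALSE rather than NA for missing values, so the !is.na() check becomes redundant, though harmless: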

df$new[df$date %in% 'a' & !is.na(df$date)] <- df$va[df$date %in% 'a' & !is.na(df$date)]
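
The same assignment has to be repeated for 'b' and 'c'. One way to avoid writing it out three times is a loop over the codes, sketched below. It assumes that each value column is named "v" followed by the date code, as in the question; the data values are invented.

```r
# Hypothetical data: the values are invented, only the column names
# (date, va, vb, vc) follow the question.
df <- data.frame(
  Id   = 1:6,
  date = c(NA, "a", "b", "c", NA, "a"),
  va   = c(1.1, 1.4, 2.5, 3.6, 4.7, 5.8),
  vb   = c(5.9, 2.3, 4.7, 6.1, 7.2, 8.3),
  vc   = c(3.0, 3.3, 3.7, 4.0, 4.4, 4.8),
  stringsAsFactors = FALSE
)

# Start with a numeric column of NAs so unmatched rows stay NA.
df$new <- NA_real_

for (code in c("a", "b", "c")) {
  sel <- df$date %in% code              # FALSE (not NA) where date is NA
  src <- df[[paste0("v", code)]]        # va, vb, or vc
  df$new[sel] <- src[sel]               # same index on both sides
}

print(df$new)
```

Initialising the column with NA_real_ keeps it numeric from the start, and because %in% never returns NA, no separate is.na() check is needed inside the loop.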
