How to remove data frame columns that contain only a single value

Time: 2022-12-08 07:59:26

Let's say I have the following data frame in R:

df1 <- data.frame(Item_Name = c("test1","test2","test3"), D_1=c(1,0,1),
                  D_2=c(1,1,1), D_3=c(11,3,1))

I would like to create a function that deletes columns with no variance (e.g., in this case it would remove column D_2, because it contains only one distinct value).

I know I could check this by hand, but my real data is very large and I would like to automate it. Any ideas?

2 Solutions

#1


10  

Filter is a useful function here: applied to a data frame, it keeps only the columns for which the predicate returns TRUE. Here I keep only the columns that have more than one unique value:

Filter(function(x)(length(unique(x))>1), df1)

##   Item_Name D_1 D_3
## 1     test1   1  11
## 2     test2   0   3
## 3     test3   1   1
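
Since the question asks for a reusable function, the same idea could be wrapped up roughly as follows. This is only a sketch: the name drop_constant_cols and the na.rm argument are my own assumptions, not part of the answer above, and treating NA as "not a value" is a design choice you may want to change.

```r
# Hypothetical helper (name and na.rm argument are assumptions):
# keep only the columns that have more than one distinct value.
drop_constant_cols <- function(df, na.rm = TRUE) {
  keep <- vapply(df, function(x) {
    # Optionally ignore missing values before counting distinct values
    if (na.rm) x <- x[!is.na(x)]
    length(unique(x)) > 1
  }, logical(1))
  # drop = FALSE keeps the result a data frame even if one column remains
  df[, keep, drop = FALSE]
}

df1 <- data.frame(Item_Name = c("test1", "test2", "test3"), D_1 = c(1, 0, 1),
                  D_2 = c(1, 1, 1), D_3 = c(11, 3, 1))
result <- drop_constant_cols(df1)
names(result)
```

Unlike a variance check, counting distinct values works for character and factor columns as well, so Item_Name does not need special handling.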

#2


8  

You can do:

df1[c(TRUE, lapply(df1[-1], var, na.rm = TRUE) != 0)]
#   Item_Name D_1 D_3
# 1     test1   1  11
# 2     test2   0   3
# 3     test3   1   1

where the lapply piece tells you which variables have nonzero variance:

lapply(df1[-1], var, na.rm = TRUE) != 0
#   D_1   D_2   D_3 
#   TRUE FALSE  TRUE 
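
The line above relies on the first column being the only non-numeric one. If the non-numeric columns can sit anywhere, a variant along these lines might be easier to reuse. It is a sketch under my own assumptions: non-numeric columns are always kept, and a numeric column whose variance is NA (for example, fewer than two non-missing values) is dropped. The names is_num, has_var, keep and result_var are only illustrative.

```r
df1 <- data.frame(Item_Name = c("test1", "test2", "test3"), D_1 = c(1, 0, 1),
                  D_2 = c(1, 1, 1), D_3 = c(11, 3, 1))

# Which columns are numeric, so that var() makes sense for them
is_num <- vapply(df1, is.numeric, logical(1))

# TRUE only when the variance is a positive number; an NA variance gives FALSE
has_var <- vapply(df1[is_num],
                  function(x) isTRUE(var(x, na.rm = TRUE) > 0),
                  logical(1))

# Assumption: keep every non-numeric column, and numeric ones with variance
keep <- !is_num
keep[is_num] <- has_var

result_var <- df1[keep]
names(result_var)
```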
