Remove rows from a data frame where a factor is ""

Time: 2022-10-01 08:54:12

I have a data frame like X, where the column genes is a factor. I want to remove all rows where genes is empty, so in table X I want to remove row 4. Is there a way to do this for a large data frame?

X
  names     values   genes
1     A  0.2876113  EEF1A1
2     B  0.6681894   GAPDH
3     C  0.1375420 SLC35E2
4     D -1.9063386
5     E -0.4949905   RPS28

Desired result:

X
  names     values   genes
1     A  0.2876113  EEF1A1
2     B  0.6681894   GAPDH
3     C  0.1375420 SLC35E2
5     E -0.4949905   RPS28

Thank you all!

2 solutions

#1


22  

It's not completely obvious from your question what the empty values are, but you should be able to adapt the solution below (here I assume the 'empty' values are empty strings):

toBeRemoved<-which(X$genes=="")
X<-X[-toBeRemoved,]
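
To try this out, here is a minimal sketch that rebuilds the question's table and applies the two lines above. The data.frame() call is my own reconstruction of X from the printed output (the question does not show how X was created), so treat the column types as an assumption.

```r
# Hypothetical reconstruction of the question's data frame X.
X <- data.frame(
  names  = c("A", "B", "C", "D", "E"),
  values = c(0.2876113, 0.6681894, 0.1375420, -1.9063386, -0.4949905),
  genes  = factor(c("EEF1A1", "GAPDH", "SLC35E2", "", "RPS28"))
)

toBeRemoved <- which(X$genes == "")  # row indices where genes is the empty string
X <- X[-toBeRemoved, ]               # negative indices drop those rows

print(X)
```

Comparing a factor with == compares its labels as strings, so no conversion with as.character() is needed here.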

#2


10  

@Nick Sabbe provided a great answer, but it has one caveat:

Using -which(...) is a neat trick to (sometimes) speed up the subsetting operation when there are only a few elements to remove.

...But if there are no elements to remove, it fails!

So, if X$genes does not contain any empty strings, which() returns an empty integer vector. Negating that is still an empty vector, and X[integer(0), ] returns an empty data.frame!

toBeRemoved <- which(X$genes == "")
if (length(toBeRemoved) > 0) { # MUST check for 0-length
    X <- X[-toBeRemoved, ]
}
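
The pitfall is easy to reproduce. The sketch below uses a small made-up data frame Y (not from the question) that has no empty strings, and compares the unguarded and guarded versions.

```r
# Hypothetical data frame with no empty gene names.
Y <- data.frame(
  names = c("A", "B"),
  genes = factor(c("EEF1A1", "GAPDH"))
)

idx <- which(Y$genes == "")   # integer(0): nothing matches
dropped_all <- Y[-idx, ]      # -integer(0) is still integer(0), so zero rows are selected

# Guarded version: only subset when there is something to remove.
guarded <- if (length(idx) > 0) Y[-idx, ] else Y

print(nrow(dropped_all))
print(nrow(guarded))
```

The unguarded call silently discards every row, while the guarded one leaves Y untouched.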

Or, if the speed gain isn't important, simply:

X<-X[X$genes!="",]

Or, as @nullglob pointed out,

subset(X, genes != "")
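
Two details may matter on real data, and neither is raised in the question: missing values and leftover factor levels. The following sketch uses a made-up data frame Z that contains an NA to show how the two forms differ, and uses droplevels() to discard the now-unused "" level. Whether your data contains NA at all is an assumption here.

```r
# Hypothetical data frame with one empty string and one NA in genes.
Z <- data.frame(
  names = c("A", "B", "C", "D"),
  genes = factor(c("EEF1A1", "", NA, "RPS28"))
)

by_logical <- Z[Z$genes != "", ]      # the NA comparison yields NA, which produces an all-NA row
by_subset  <- subset(Z, genes != "")  # subset() treats an NA condition as FALSE

# The unused "" level stays in the factor until it is dropped explicitly.
print(levels(by_subset$genes))
cleaned <- droplevels(by_subset)
print(levels(cleaned$genes))
```

If the all-NA row from logical indexing is unwanted, subset() or wrapping the condition in which() avoids it, subject to the zero-length caveat described above.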
