Detecting rows with the same elements in a different order [duplicate]

Time: 2022-10-26 07:28:38

This question already has an answer here:

I need to make a matrix/data frame containing all combinations of the elements in two vectors. All combinations must be unique, and include different elements. I know I can use the following to make a list of all combinations:

a <- c("cat", "dog", "cow")
b <- c("dog", "cow", "sheep")
combination <- as.matrix(expand.grid(a, b))

I also know that I can remove entries where both elements are the same using this:

combination1 <- combination[combination[, 1] != combination[, 2], ]

Which gives the following output:

> combination1
     Var1  Var2   
[1,] "cat" "dog"  
[2,] "cow" "dog"  
[3,] "cat" "cow"  
[4,] "dog" "cow"  
[5,] "cat" "sheep"
[6,] "dog" "sheep"
[7,] "cow" "sheep"

What I need is to detect/remove rows that contain the same strings in a different order (rows 2 and 4 are "cow,dog" and "dog,cow"). Is there a simple way to do this in R? The script I'm writing to test interactions between genes in barley is very lengthy, and I want to avoid testing the same combination twice. Any help would be appreciated.

1 solution

#1


You could try sorting the rows, then taking the unique ones:

> combination1 <- unique(t(apply(combination, 1, sort)))
> combination1
     [,1]  [,2]   
[1,] "cat" "dog" 
[2,] "dog" "dog"  
[3,] "cow" "dog"  
[4,] "cat" "cow"  
[5,] "cow" "cow"  
[6,] "cat" "sheep"
[7,] "dog" "sheep"
[8,] "cow" "sheep"
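
Note that the call above starts from combination rather than from the filtered matrix, so rows such as "dog" "dog" are still in the result. A minimal sketch of the same sort-then-deduplicate idea applied after the filtering step from the question might look like the following. The names distinct_pairs, sorted_pairs, is_repeat and unordered_pairs are illustrative and not part of the original post, and stringsAsFactors = FALSE is added here only to keep the columns as plain character vectors.

```r
a <- c("cat", "dog", "cow")
b <- c("dog", "cow", "sheep")

# All ordered pairs of one element from a and one from b
combination <- as.matrix(expand.grid(a, b, stringsAsFactors = FALSE))

# Drop pairs whose two elements are identical (the filter from the question);
# drop = FALSE keeps the result a matrix even if only one row survives
distinct_pairs <- combination[combination[, 1] != combination[, 2], , drop = FALSE]

# Sort within each row so that "dog","cow" and "cow","dog" look the same
sorted_pairs <- t(apply(distinct_pairs, 1, sort))

# TRUE for every row that repeats an earlier row once order is ignored
is_repeat <- duplicated(sorted_pairs)

# Keep only the first occurrence of each unordered pair,
# leaving the elements in their original column order
unordered_pairs <- distinct_pairs[!is_repeat, , drop = FALSE]
```

Here which(is_repeat) answers the "detect" part of the question by giving the indices of the repeated rows, while unordered_pairs answers the "remove" part. Using unique(sorted_pairs) instead would keep the same set of pairs but with the elements of each row in sorted order.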
