
Time: 2022-11-11 12:42:32

I have two data frames df1 and df2. They have the same (two) columns. I want to remove the rows from df1 that are in df2.


2 solutions



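Several packages can do this, but here is how to do it in base R.


df1 <- matrix(1:6, ncol = 2, byrow = TRUE)
df2 <- matrix(1:10, ncol = 2, byrow = TRUE)
combined <- rbind(df1, df2)  # stack the rows of both objects
# Use !duplicated with fromLast = FALSE and fromLast = TRUE to keep only
# the rows that occur exactly once in the stacked data. This returns rows
# unique to either input and also drops rows repeated within one input.
combined[!duplicated(combined, fromLast = FALSE) & !duplicated(combined, fromLast = TRUE), ]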

     [,1] [,2]
[1,]    7    8
[2,]    9   10
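
The duplicated() approach above keeps rows that occur exactly once in the combined data, regardless of which input they came from. When only the rows of df1 that are absent from df2 should be kept, a one-directional filter may be a better fit. The following is a minimal base R sketch, not part of the original answer: the example data and the row_key helper are hypothetical, and it assumes no value contains the separator character.

```r
# Hypothetical example data: two data frames with the same two columns.
df1 <- data.frame(a = c(1, 3, 5, 7, 9), b = c(2, 4, 6, 8, 10))
df2 <- data.frame(a = c(1, 3, 5), b = c(2, 4, 6))

# Hypothetical helper: build one string key per row from all columns.
# Assumes no value contains the separator "\r".
row_key <- function(d) do.call(paste, c(unname(as.list(d)), sep = "\r"))

# Keep the rows of df1 whose key does not occur among the keys of df2.
result <- df1[!(row_key(df1) %in% row_key(df2)), , drop = FALSE]
print(result)
```

Unlike the rbind approach, this keeps repeated rows of df1 as long as they are absent from df2, and rows that exist only in df2 never appear in the result.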



Try this:

df2 <- matrix(1:6, ncol = 2, byrow = TRUE)
df1 <- matrix(1:10, ncol = 2, byrow = TRUE)

data.frame(v1 = setdiff(df1[, 1], df2[, 1]), v2 = setdiff(df1[, 2], df2[, 2]))
  v1 v2
1  7  8
2  9 10

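Note that df1 and df2 are swapped relative to Lapointe's answer: you want to remove the rows of df1 that are in df2, and setdiff(x, y) returns the elements of x that are not contained in y. See ?setdiff

With this data you get the same result as Lapointe's. Be aware that setdiff is applied to each column separately, so whole rows are never compared; the result is only reliable when, as in this example, the values left over in each column happen to line up into the original rows.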
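
Because setdiff in the code above works on one column at a time, it can disagree with a row-by-row comparison as soon as values repeat across rows. The sketch below illustrates this with hypothetical data and shows one possible row-wise alternative using a left join with merge. The data, the in_df2 flag column, and the variable names are all assumptions made for illustration.

```r
# Hypothetical data in which values repeat across rows.
df1 <- data.frame(v1 = c(1, 1, 2), v2 = c("a", "b", "a"))
df2 <- data.frame(v1 = 1, v2 = "a")

# Column-wise setdiff: each column is filtered independently.
col1 <- setdiff(df1$v1, df2$v1)  # values of v1 not found in df2$v1
col2 <- setdiff(df1$v2, df2$v2)  # values of v2 not found in df2$v2
# Pairing col1 with col2 gives a combination that is not a row of df1.
columnwise <- data.frame(v1 = col1, v2 = col2)

# Row-wise alternative: flag the rows of df2, left-join onto df1,
# and keep the rows of df1 that found no match (flag is NA).
flagged <- cbind(df2, in_df2 = TRUE)          # in_df2 is a hypothetical flag column
joined <- merge(df1, flagged, all.x = TRUE)   # joins on the shared columns v1 and v2
kept <- joined[is.na(joined$in_df2), names(df1), drop = FALSE]
print(kept)
```

Here the only row of df2 is (1, "a"), so a row-wise comparison should leave the two other rows of df1, while the column-wise version produces a single row that pairs leftovers from different rows.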



