【452】pandas: selecting the rows of one table that satisfy all conditions in another table

Date: 2023-03-09 15:20:53
Reference: pandas: selecting the rows of one table that satisfy all conditions in another table

Reference: pandas: matching two DataFrames

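This can be done with pd.merge.

The on parameter names the columns to match on. If both tables have an id column, it makes a good key: rows with the same id are merged. When on is omitted, as in the session below, pd.merge matches on every column name the two tables share.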
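As a minimal sketch of matching on a shared id column: the tables `wanted` and `records`, their column names, and their values are made up for illustration and do not come from the original post.

```python
import pandas as pd

# Hypothetical tables, invented for this example.
wanted = pd.DataFrame({"id": [2, 4]})
records = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "score": [90, 85, 70, 60],
})

# Keep only the rows of `records` whose id also appears in `wanted`.
matched = pd.merge(records, wanted, on="id", how="inner")
print(matched)
```

With how='inner', a row survives only if its id is present in both tables, which is what makes the merge behave like a filter.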

>>> import numpy as np
>>> import pandas as pd

>>> data1 = {
...     'one': pd.Series([1,2,3]),
...     'two': pd.Series([11,22,33])
... }
>>> df1 = pd.DataFrame(data=data1)
>>> df1
   one  two
0    1   11
1    2   22
2    3   33
>>> data2 = {
...     'one': pd.Series([1,2,3,4,5,6]),
...     'two': pd.Series([11,22,33]),
...     'three': pd.Series([111,222,333]),
...     'four': pd.Series([1111,2222,3333,4444,5555,6666])
... }
>>> df2 = pd.DataFrame(data=data2)
>>> df2
   one   two  three  four
0    1  11.0  111.0  1111
1    2  22.0  222.0  2222
2    3  33.0  333.0  3333
3    4   NaN    NaN  4444
4    5   NaN    NaN  5555
5    6   NaN    NaN  6666
>>> df2[df2['one']<3]
   one   two  three  four
0    1  11.0  111.0  1111
1    2  22.0  222.0  2222
>>> df = pd.merge(df1, df2, how='inner')
>>> df
   one  two  three  four
0    1   11  111.0  1111
1    2   22  222.0  2222
2    3   33  333.0  3333
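The merge above passes no on argument, so pd.merge matches on the column names the two tables have in common, here one and two. The sketch below rebuilds df1 and df2 with the same values and suggests that naming the keys explicitly gives the same result. The variable names `implicit` and `explicit` are introduced only for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "one": pd.Series([1, 2, 3]),
    "two": pd.Series([11, 22, 33]),
})
df2 = pd.DataFrame({
    "one": pd.Series([1, 2, 3, 4, 5, 6]),
    "two": pd.Series([11, 22, 33]),          # shorter Series is padded with NaN
    "three": pd.Series([111, 222, 333]),
    "four": pd.Series([1111, 2222, 3333, 4444, 5555, 6666]),
})

# No `on`: pandas joins on the shared column names ("one" and "two").
implicit = pd.merge(df1, df2, how="inner")

# The same join with the key columns spelled out.
explicit = pd.merge(df1, df2, on=["one", "two"], how="inner")

print(implicit.equals(explicit))
```

Naming the keys is usually the safer habit, since a column that happens to share a name in both tables would otherwise silently become part of the match condition.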
>>> df1
   one  two
0    1   11
1    2   22
2    3   33
>>> df2
   one   two  three  four
0    1  11.0  111.0  1111
1    2  22.0  222.0  2222
2    3  33.0  333.0  3333
3    4   NaN    NaN  4444
4    5   NaN    NaN  5555
5    6   NaN    NaN  6666
>>> pd.merge(df1, df2, how='inner')
   one  two  three  four
0    1   11  111.0  1111
1    2   22  222.0  2222
2    3   33  333.0  3333
>>> pd.merge(df2, df1, how='inner')
   one   two  three  four
0    1  11.0  111.0  1111
1    2  22.0  222.0  2222
2    3  33.0  333.0  3333
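Swapping the two arguments changes which table is treated as the left one, but for an inner merge the same rows are selected either way. A small sketch of that check, using a reduced copy of the tables (the three column is left out for brevity, and the names `left_first` and `right_first` are introduced here):

```python
import pandas as pd

df1 = pd.DataFrame({"one": [1, 2, 3], "two": [11, 22, 33]})
df2 = pd.DataFrame({
    "one": [1, 2, 3, 4, 5, 6],
    "two": [11, 22, 33, None, None, None],   # None becomes NaN
    "four": [1111, 2222, 3333, 4444, 5555, 6666],
})

left_first = pd.merge(df1, df2, how="inner")
right_first = pd.merge(df2, df1, how="inner")

# Both orders keep the rows whose (one, two) pair appears in both tables.
print(left_first["four"].tolist())
print(right_first["four"].tolist())
```

The argument order matters more for how='left' or how='right', where unmatched rows of one side are kept.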
>>> five = pd.Series([1,2,3,4,5,6])
>>> df2['five'] = five
>>> df2
   one   two  three  four  five
0    1  11.0  111.0  1111     1
1    2  22.0  222.0  2222     2
2    3  33.0  333.0  3333     3
3    4   NaN    NaN  4444     4
4    5   NaN    NaN  5555     5
5    6   NaN    NaN  6666     6
>>> df1
   one  two
0    1   11
1    2   22
2    3   33
>>> pd.merge(df2, df1, how='inner')
   one   two  three  four  five
0    1  11.0  111.0  1111     1
1    2  22.0  222.0  2222     2
2    3  33.0  333.0  3333     3
>>> pd.merge(df1, df2, how='inner')
   one  two  three  four  five
0    1   11  111.0  1111     1
1    2   22  222.0  2222     2
2    3   33  333.0  3333     3
>>> df1
   one  two
0    1   11
1    2   22
2    3   33
>>> df2
   one   two  three  four  five
0    1  11.0  111.0  1111     1
1    2  22.0  222.0  2222     2
2    3  33.0  333.0  3333     3
3    4   NaN    NaN  4444     4
4    5   NaN    NaN  5555     5
5    6   NaN    NaN  6666     6
>>> six = pd.Series([-1, -2, -3])
>>> df1['six'] = six
>>> df1
   one  two  six
0    1   11   -1
1    2   22   -2
2    3   33   -3
>>> df2
   one   two  three  four  five
0    1  11.0  111.0  1111     1
1    2  22.0  222.0  2222     2
2    3  33.0  333.0  3333     3
3    4   NaN    NaN  4444     4
4    5   NaN    NaN  5555     5
5    6   NaN    NaN  6666     6
>>> pd.merge(df1, df2, how='inner')
   one  two  six  three  four  five
0    1   11   -1  111.0  1111     1
1    2   22   -2  222.0  2222     2
2    3   33   -3  333.0  3333     3
>>> pd.merge(df2, df1, how='inner')
   one   two  three  four  five  six
0    1  11.0  111.0  1111     1   -1
1    2  22.0  222.0  2222     2   -2
2    3  33.0  333.0  3333     3   -3