Pandas: compare similar DataFrames and get the min

Time: 2021-08-10 04:01:43

Given the following data frames:

import numpy as np
import pandas as pd

d1 = pd.DataFrame({'A': [1, 2, np.nan], 'B': [np.nan, 5, 6]}, index=['A', 'B', 'E'])

    A       B
A   1.0     NaN
B   2.0     5.0
E   NaN     6.0

d2 = pd.DataFrame({'A': [4, 2, np.nan, 4], 'B': [4, 2, np.nan, 4]}, index=['A', 'B', 'C', 'D'])

    A       B
A   4.0     4.0
B   2.0     2.0
C   NaN     NaN
D   4.0     4.0

I'd like to compare them element-wise to find the lowest value at each corresponding row and column, while preserving all row indices from both. Here is the result I'm looking for:

    A       B
A   1.0     4.0
B   2.0     2.0
C   NaN     NaN
D   4.0     4.0
E   NaN     6.0

Thanks in advance!

3 Answers

#1


5  

Another option is to first align the two data frames (on both index and columns) and then use numpy.fmin, which returns the non-NaN value when only one of the two operands is NaN:

# pd.np was removed in pandas 2.0; call numpy directly
np.fmin(*d1.align(d2))

Less convoluted:

d1, d2 = d1.align(d2)  # both frames now share the union of index and column labels
np.fmin(d1, d2)
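
To illustrate why numpy.fmin is used here rather than numpy.minimum, the following is a minimal, self-contained sketch that rebuilds the frames from the question. The variable names `left`, `right`, `ignoring_nan` and `propagating_nan` are illustrative only and not part of the original answer.

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame({'A': [1, 2, np.nan], 'B': [np.nan, 5, 6]}, index=['A', 'B', 'E'])
d2 = pd.DataFrame({'A': [4, 2, np.nan, 4], 'B': [4, 2, np.nan, 4]}, index=['A', 'B', 'C', 'D'])

# align() defaults to an outer join, so both results carry the union of labels
left, right = d1.align(d2)

# fmin returns the non-NaN operand when only one side is NaN
ignoring_nan = np.fmin(left, right)

# minimum propagates NaN, so a position missing from either frame stays NaN
propagating_nan = np.minimum(left, right)

print(ignoring_nan)
print(propagating_nan)
```

With `np.minimum`, rows D and E would come out entirely NaN because each exists in only one of the two frames, which is not the result asked for in the question.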

#2


6  

You can concatenate the DataFrames and then group by the index to keep the minimum of each group:

df = pd.concat([d1, d2])
df = df.groupby(df.index).min()

You get

    A   B
A   1.0 4.0
B   2.0 2.0
C   NaN NaN
D   4.0 4.0
E   NaN 6.0

EDIT: More concise solutions from @root and @ScottBoston

pd.concat([d1, d2]).groupby(level=0).min()
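
As a check, the one-liner can be compared against the result requested in the question. This is a minimal sketch; `stacked`, `result` and `expected` are names chosen for illustration.

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame({'A': [1, 2, np.nan], 'B': [np.nan, 5, 6]}, index=['A', 'B', 'E'])
d2 = pd.DataFrame({'A': [4, 2, np.nan, 4], 'B': [4, 2, np.nan, 4]}, index=['A', 'B', 'C', 'D'])

# Stack the rows of both frames; labels A and B now appear twice in the index
stacked = pd.concat([d1, d2])

# Group rows that share an index label and take the column-wise minimum per group;
# min() skips NaN, and a group that is entirely NaN (label C) stays NaN
result = stacked.groupby(level=0).min()

# The frame the question asks for
expected = pd.DataFrame(
    {'A': [1, 2, np.nan, 4, np.nan], 'B': [4, 2, np.nan, 4, 6]},
    index=['A', 'B', 'C', 'D', 'E'],
)
print(result.equals(expected))
```

Because groupby sorts the group keys by default, the rows come back in label order even though E precedes C and D in the concatenated frame.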

#3


4  

Use pd.Panel with min. Note that pd.Panel was removed in pandas 0.25, so this requires an older version.
This also generalizes to any number of dataframes.

pd.Panel(dict(enumerate([d1, d2]))).min(0)

     A    B
A  1.0  4.0
B  2.0  2.0
C  NaN  NaN
D  4.0  4.0
E  NaN  6.0
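
On pandas versions without pd.Panel, a similar "any number of dataframes" pattern can be sketched with pd.concat and a grouped minimum. This is a substitute technique rather than the Panel method itself; the third frame `d3` and the names `frames` and `result` are hypothetical and added only to show the generalization.

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame({'A': [1, 2, np.nan], 'B': [np.nan, 5, 6]}, index=['A', 'B', 'E'])
d2 = pd.DataFrame({'A': [4, 2, np.nan, 4], 'B': [4, 2, np.nan, 4]}, index=['A', 'B', 'C', 'D'])
# Hypothetical third frame, not part of the original question
d3 = pd.DataFrame({'A': [0.5], 'B': [9.0]}, index=['C'])

frames = [d1, d2, d3]

# Passing a dict to concat adds an outer index level (0, 1, 2) identifying
# the source frame; the original row labels become index level 1
stacked = pd.concat(dict(enumerate(frames)))

# Group on the original row labels and take the NaN-skipping minimum
result = stacked.groupby(level=1).min()
print(result)
```

The outer key level plays the role of the Panel's item axis here, and grouping on level 1 corresponds to reducing over that axis.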
