Concatenate pandas DataFrames on a column and fill the gaps with 'NaN'

Time: 2021-04-18 22:57:25

I have four pandas dataframes that can be generated with the code below:

import pandas

#df 1
time1=pandas.Series([0,20,40,60,120])
pPAK2=pandas.Series([0,3,15,21,23])
cols=['time','pPAK2']

df=pandas.DataFrame([time1,pPAK2])
df=df.transpose()
df.columns=cols
df.to_csv('pPAK2.csv',sep='\t')
pak2_df=df

#df2
time2=pandas.Series([0,15,30,60,120])
cAbl=pandas.Series([0,15,34,10,0])
df=pandas.DataFrame([time2,cAbl])
df=df.transpose()
cols=['time','pcAbl']
df.columns=cols
df.to_csv('pcAbl.csv',sep='\t')
pcAbl_df=df

#df 3
time7=pandas.Series([0,60,120,240,480,960,1440])
pSmad3_n=pandas.Series([0,16,14,12,8,7.5,6])
scale_factor=40
pSmad3_n=pSmad3_n*scale_factor

#plt.plot(time7,pSmad3)
df=pandas.DataFrame([time7,pSmad3_n])
df=df.transpose()
cols=['time','pSmad3_n']
df.columns=cols
df.to_csv('pSmad3_n.csv',sep='\t')

smad3_df=df

#df4
time8=pandas.Series([0,240,480,1440])
PAI1_mRNA=pandas.Series([0,23,25,5])
scale_factor=5
PAI1_mRNA=PAI1_mRNA*scale_factor
df=pandas.DataFrame([time8,PAI1_mRNA])
df=df.transpose()
cols=['time','PAI1_mRNA']
df.columns=cols
df.to_csv('PAI1_mRNA.csv',sep='\t')
PAI1_df=df


#print dataframes
print(PAI1_df)
print(pak2_df)
print(pcAbl_df)
print(smad3_df)

I want to concatenate these dataframes on the time column with the pandas concat function, but I can't get the output right. If I were to concatenate just PAI1_df and pak2_df, the output should look something like this:

   time  PAI1_mRNA  pPAK2
0     0          0      0
1    20      'NaN'      3
2    40      'NaN'     15
3    60      'NaN'     21
4   120      'NaN'     23
5   240        115  'NaN'
6   480        125  'NaN'
7  1440         25  'NaN'

I think this should be easy, but the documentation lists a lot of features. Does anybody know how to do this?

1 Answer

#1



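You can concatenate them like this: set 'time' as the index of each dataframe so that the rows align on the time values, concatenate along the columns, then reset the index: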

import pandas

df = pandas.concat([pak2_df.set_index('time'), pcAbl_df.set_index('time')], axis=1).reset_index()
print(df)

Prints:

   time  pPAK2  pcAbl
0     0      0      0
1    15    NaN     15
2    20      3    NaN
3    30    NaN     34
4    40     15    NaN
5    60     21     10
6   120     23      0
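
The same pattern extends to all four dataframes from the question. The following is a minimal sketch, not the answerer's code: it rebuilds the frames in a compact form (with the same values and scale factors as the question), and the names frames and combined are introduced here only for illustration. The explicit sort_index() and rename_axis('time') calls are a precaution, on the assumption that row order and index name after concat may differ between pandas versions.

```python
import pandas

# Rebuild the four frames from the question in a compact form
# (same values, with the scale factors 40 and 5 already applied).
pak2_df = pandas.DataFrame({'time': [0, 20, 40, 60, 120],
                            'pPAK2': [0, 3, 15, 21, 23]})
pcAbl_df = pandas.DataFrame({'time': [0, 15, 30, 60, 120],
                             'pcAbl': [0, 15, 34, 10, 0]})
smad3_df = pandas.DataFrame({'time': [0, 60, 120, 240, 480, 960, 1440],
                             'pSmad3_n': [v * 40 for v in [0, 16, 14, 12, 8, 7.5, 6]]})
PAI1_df = pandas.DataFrame({'time': [0, 240, 480, 1440],
                            'PAI1_mRNA': [v * 5 for v in [0, 23, 25, 5]]})

# 'frames' and 'combined' are illustrative names, not from the original post.
frames = [PAI1_df, pak2_df, pcAbl_df, smad3_df]

# Index every frame by 'time' so concat aligns rows on the time values;
# a time point missing from a frame becomes NaN in that frame's column.
combined = pandas.concat([f.set_index('time') for f in frames], axis=1)

# Sort by time and make sure the index is named 'time' before turning it
# back into an ordinary column.
combined = combined.sort_index().rename_axis('time').reset_index()
print(combined)
```

Columns that receive NaN are stored as floats, so their integer values may print with a decimal point.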
