Suppose I have the following DataFrame in pandas:
       AA  BB  CC
date
05/03   1   2   3
06/03   4   5   6
07/03   7   8   9
08/03   5   7   1
I want to transform it into the following long format:
AA 05/03 1
AA 06/03 4
AA 07/03 7
AA 08/03 5
BB 05/03 2
BB 06/03 5
BB 07/03 8
BB 08/03 7
CC 05/03 3
CC 06/03 6
CC 07/03 9
CC 08/03 1
How can I do it?
The reason for converting from wide to long is that, in the next stage, I want to merge this DataFrame with another one on the dates and the original column names (AA, BB, CC).
2 Answers
#1
15
unstack returns a Series with a MultiIndex:
In [38]: df.unstack()
Out[38]:
    date
AA  05/03    1
    06/03    4
    07/03    7
    08/03    5
BB  05/03    2
    06/03    5
    07/03    8
    08/03    7
CC  05/03    3
    06/03    6
    07/03    9
    08/03    1
dtype: int64
You can call reset_index on the returned Series to turn both index levels into ordinary columns:
In [39]: df.unstack().reset_index()
Out[39]:
   level_0   date  0
0       AA  05/03  1
1       AA  06/03  4
2       AA  07/03  7
3       AA  08/03  5
4       BB  05/03  2
5       BB  06/03  5
6       BB  07/03  8
7       BB  08/03  7
8       CC  05/03  3
9       CC  06/03  6
10      CC  07/03  9
11      CC  08/03  1
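The default column names above (level_0 and 0) are not very convenient for a later merge. As a minimal sketch, assuming the same df as in the question, the index levels and the value column can be named while flattening. The labels 'name' and 'value' are illustrative choices, not part of the original answer:

```python
import pandas as pd

# Rebuild the example frame from the question, with 'date' as the index.
df = pd.DataFrame(
    {"AA": [1, 4, 7, 5], "BB": [2, 5, 8, 7], "CC": [3, 6, 9, 1]},
    index=pd.Index(["05/03", "06/03", "07/03", "08/03"], name="date"),
)

# Name both index levels, then name the value column while flattening.
# 'name' and 'value' are illustrative labels (assumptions).
long_df = df.unstack().rename_axis(["name", "date"]).reset_index(name="value")
print(long_df.head())
```

The resulting frame has the columns name, date and value, in the row order asked for in the question.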
Or construct a DataFrame that keeps the MultiIndex:
In [40]: pd.DataFrame(df.unstack())
Out[40]:
           0
   date
AA 05/03   1
   06/03   4
   07/03   7
   08/03   5
BB 05/03   2
   06/03   5
   07/03   8
   07/03   7
CC 05/03   3
   06/03   6
   07/03   9
   08/03   1
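If the MultiIndex form is what you want, a small variation is Series.to_frame, which lets you name the value column directly. This is only a sketch under the same assumptions as the question's df. The names 'name' and 'value' are made up for illustration:

```python
import pandas as pd

# Same example frame as in the question, indexed by 'date'.
df = pd.DataFrame(
    {"AA": [1, 4, 7, 5], "BB": [2, 5, 8, 7], "CC": [3, 6, 9, 1]},
    index=pd.Index(["05/03", "06/03", "07/03", "08/03"], name="date"),
)

# Keep the (column name, date) MultiIndex and give the values a column name.
long_frame = df.unstack().to_frame(name="value")
long_frame.index.names = ["name", "date"]  # illustrative level names

# Look up a single cell by its (name, date) key.
print(long_frame.loc[("AA", "05/03"), "value"])
```

Keeping the MultiIndex can be handy if the other DataFrame is indexed the same way, since the two can then be joined on the index.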
#2
1
Use pandas.melt to transform from wide to long:
import pandas as pd

# Build the example data with 'date' as an ordinary column first.
df = pd.DataFrame({
    'date': ['05/03', '06/03', '07/03', '08/03'],
    'AA': [1, 4, 7, 5],
    'BB': [2, 5, 8, 7],
    'CC': [3, 6, 9, 1]
})
df
Then we have:
    date  AA  BB  CC
0  05/03   1   2   3
1  06/03   4   5   6
2  07/03   7   8   9
3  08/03   5   7   1
Now set the index so that we have exactly the same df as yours:
df = df.set_index('date')
df
This gives us:
       AA  BB  CC
date
05/03   1   2   3
06/03   4   5   6
07/03   7   8   9
08/03   5   7   1
To convert, we just need to reset the index, so that date becomes a regular column again, and then melt:
df = df.reset_index()
pd.melt(df, id_vars='date', value_vars=['AA', 'BB', 'CC'])
This is the final result:
     date variable  value
0   05/03       AA      1
1   06/03       AA      4
2   07/03       AA      7
3   08/03       AA      5
4   05/03       BB      2
5   06/03       BB      5
6   07/03       BB      8
7   08/03       BB      7
8   05/03       CC      3
9   06/03       CC      6
10  07/03       CC      9
11  08/03       CC      1
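Since the question mentions merging on the date and the original column name as the next step, here is a rough sketch of how that could look after melting. The second frame `other`, its `weight` column, and the labels 'name' and 'value' are all hypothetical, invented only for illustration:

```python
import pandas as pd

# Same example frame as in the question, indexed by 'date'.
df = pd.DataFrame(
    {"AA": [1, 4, 7, 5], "BB": [2, 5, 8, 7], "CC": [3, 6, 9, 1]},
    index=pd.Index(["05/03", "06/03", "07/03", "08/03"], name="date"),
)

# Melt to long format, naming the new columns instead of using the
# defaults 'variable' and 'value'.
long_df = df.reset_index().melt(id_vars="date", var_name="name", value_name="value")

# Hypothetical second frame keyed by the same two columns.
other = pd.DataFrame(
    {"date": ["05/03", "06/03"], "name": ["AA", "BB"], "weight": [0.5, 0.25]}
)

# A left merge keeps every row of long_df; rows without a match get NaN.
merged = long_df.merge(other, on=["date", "name"], how="left")
print(merged)
```

A left merge is used here only so that no rows of the long frame are dropped. Whether inner, left or outer is appropriate depends on the actual data.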