Pandas DataFrame concat gives unwanted NA/NaN columns

Time: 2023-02-02 22:55:33

Unlike the example in "After Pandas Dataframe pd.concat I get NaNs", where the concatenation is horizontal, I'm trying to concatenate vertically:

import pandas

a = [['Date', 'letters', 'numbers', 'mixed'],
     ['1/2/2014', 'a', '6', 'z1'],
     ['1/2/2014', 'a', '3', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']]
df = pandas.DataFrame.from_records(a[1:], columns=a[0])
f = []
for i in range(0, len(df)):
    f.append(df['Date'][i] + ' ' + df['letters'][i])
df['new'] = f
c = [x for x in range(0, 5)]
b = []
b += [['NA'] * (5 - len(b))]
df_a = pandas.DataFrame.from_records(b, columns=c)
df_b = pandas.concat([df, df_a], ignore_index=True)

df_b comes out the same as with df_b = pandas.concat([df, df_a], axis=0).

Result:

     0    1    2    3    4      Date letters mixed         new numbers
0  NaN  NaN  NaN  NaN  NaN  1/2/2014       a    z1  1/2/2014 a       6
1  NaN  NaN  NaN  NaN  NaN  1/2/2014       a    z1  1/2/2014 a       3
2  NaN  NaN  NaN  NaN  NaN  1/3/2014       c    x3  1/3/2014 c       1
0   NA   NA   NA   NA   NA       NaN     NaN   NaN         NaN     NaN

Desired:

       Date letters numbers mixed         new
0  1/2/2014       a       6    z1  1/2/2014 a
1  1/2/2014       a       3    z1  1/2/2014 a
2  1/3/2014       c       1    x3  1/3/2014 c
0        NA      NA      NA    NA          NA

2 Answers

#1


I would create the DataFrame df_a with the correct columns directly.

With a little refactoring, your code becomes:

import pandas

a = [['Date', 'letters', 'numbers', 'mixed'],
     ['1/2/2014', 'a', '6', 'z1'],
     ['1/2/2014', 'a', '3', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']]
df = pandas.DataFrame.from_records(a[1:], columns=a[0])
df['new'] = df['Date'] + ' ' + df['letters']
n = len(df.columns)
b = [['NA'] * n]
df_a = pandas.DataFrame.from_records(b, columns=df.columns)
df_b = pandas.concat([df, df_a])

This gives:

       Date letters numbers mixed         new
0  1/2/2014       a       6    z1  1/2/2014 a
1  1/2/2014       a       3    z1  1/2/2014 a
2  1/3/2014       c       1    x3  1/3/2014 c
0        NA      NA      NA    NA          NA

Optionally, reset the index so that the appended row continues the numbering:

df_b = pandas.concat([df,df_a]).reset_index(drop=True)

This gives:

       Date letters numbers mixed         new
0  1/2/2014       a       6    z1  1/2/2014 a
1  1/2/2014       a       3    z1  1/2/2014 a
2  1/3/2014       c       1    x3  1/3/2014 c
3        NA      NA      NA    NA          NA
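
To make the cause concrete, here is a minimal sketch (not part of the original answer) of how pandas.concat lines up frames by column label when stacking rows. The two-column frame and the names mismatched, matched, wide and tall are made up for illustration.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, shortened from the question).
df = pd.DataFrame({'Date': ['1/2/2014', '1/3/2014'], 'letters': ['a', 'c']})

# Mismatched labels: the integer column names 0 and 1 do not match
# 'Date'/'letters', so concat keeps the union of all four columns
# and fills the gaps with NaN.
mismatched = pd.DataFrame([['NA', 'NA']], columns=[0, 1])
wide = pd.concat([df, mismatched], ignore_index=True)
print(wide.shape)

# Matching labels: the filler row lands under the existing columns.
matched = pd.DataFrame([['NA'] * len(df.columns)], columns=df.columns)
tall = pd.concat([df, matched], ignore_index=True)
print(tall)
```

Passing ignore_index=True renumbers the rows during the concat, which has the same effect on the index as calling .reset_index(drop=True) afterwards.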

#2


A single assignment gives you what you want. The original answer used .ix, which was deprecated in pandas 0.20 and removed in 1.0; .loc is the current equivalent:

df.loc[len(df)] = 'NA'

EDIT: Or, if you want to use concat, define df_a with the columns of df:

df_a = pandas.DataFrame.from_records(b,columns=df.columns)
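
Since .ix is no longer available in pandas 1.0 and later, the row-append idea from this answer can be sketched with .loc instead. This is a minimal illustration on a made-up two-column frame, and it assumes the frame still has its default 0..n-1 integer index, so that len(df) is not yet a row label.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, shortened from the question).
df = pd.DataFrame({'Date': ['1/2/2014', '1/3/2014'], 'letters': ['a', 'c']})

# len(df) is not an existing row label on the default index, so assigning
# to it enlarges the frame by one row; the scalar 'NA' is broadcast
# across every column.
df.loc[len(df)] = 'NA'
print(df)
```

If the index is not the default one, len(df) may already exist as a label, in which case the assignment would overwrite that row rather than add a new one.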
