Creating a pandas DataFrame from a numpy array

Time: 2022-12-06 21:15:49

To create a pandas DataFrame from a numpy array, I can use:

import numpy as np
import pandas as pd

columns = ['1','2']
# three rows, two columns
data = np.array([[1,2] , [1,5] , [2,3]])
df_1 = pd.DataFrame(data,columns=columns)
df_1

If I instead use:

columns = ['1','2']
data = np.array([[1,2,2] , [1,5,3]])
df_1 = pd.DataFrame(data,columns=columns)
df_1

Here each inner array is meant to be one column of data, but this raises an error:

ValueError: Wrong number of items passed 3, placement implies 2

Does pandas support this data layout, or must I use the format from example 1?

3 solutions

#1


2  

You need to transpose your numpy array:

df_1 = pd.DataFrame(data.T, columns=columns)

To see why this is necessary, consider the shape of your array:

print(data.shape)

(2, 3)

The second number in the shape tuple, which is the number of columns in the array, must equal the number of columns in your DataFrame. Here the array has 3 columns, but only 2 column names were given.

When we transpose the array, its rows and columns are swapped, so its shape becomes (3, 2) and it can be passed into a DataFrame with two columns:

print(data.T.shape)

(3, 2)

print(data.T)

[[1 1]
 [2 5]
 [2 3]]
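
Putting the pieces above together, a minimal self-contained sketch might look like the following. The name data_t and the explicit shape check are illustrative additions, not part of the original answer.

```python
import numpy as np
import pandas as pd

columns = ['1', '2']
# Layout from the question: each inner list holds the values of one column.
data = np.array([[1, 2, 2], [1, 5, 3]])

# Swap rows and columns: shape (2, 3) becomes (3, 2).
data_t = data.T  # hypothetical name, used here for illustration only

# The array's column count must match the number of column labels.
assert data_t.shape[1] == len(columns)

df_1 = pd.DataFrame(data_t, columns=columns)
print(df_1)
```

The assert is only a guard for this sketch; pandas performs the equivalent check itself and raises a ValueError when the counts differ.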

#2


1  

When a DataFrame is built from a 2-D array, pandas always reads it row by row: each inner array becomes one row.

Either way, you need to transpose something.

One option is to pass index=columns and then transpose the resulting DataFrame. This gives the same output.

 columns = ['1','2']
 data = np.array([[1,2,2] , [1,5,3]])
 df_1 = pd.DataFrame(data, index=columns).T
 df_1

Passing in data.T as mentioned above is also perfectly acceptable (assuming the data is an ndarray type).
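
As a rough check that the two routes agree, the sketch below builds the frame both ways and compares them. The names df_a, df_b, df_c and nested are made up for this illustration. The last lines assume the data arrives as a plain nested list, which has no .T attribute and so needs np.asarray first.

```python
import numpy as np
import pandas as pd

columns = ['1', '2']
data = np.array([[1, 2, 2], [1, 5, 3]])

# Route A: label the two rows, then transpose the DataFrame.
df_a = pd.DataFrame(data, index=columns).T

# Route B: transpose the array first, then label the columns.
df_b = pd.DataFrame(data.T, columns=columns)

# Compare the cell values and the column labels of both results.
same_values = df_a.to_numpy().tolist() == df_b.to_numpy().tolist()
same_columns = list(df_a.columns) == list(df_b.columns)
print(same_values, same_columns)

# A nested Python list must be converted before it can be transposed.
nested = [[1, 2, 2], [1, 5, 3]]
df_c = pd.DataFrame(np.asarray(nested).T, columns=columns)
```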

#3


0  

In the second case, you can use:

df_1 = pd.DataFrame(dict(zip(columns, data)))
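
To see why this works without a transpose, here is a short sketch that spells out the intermediate dictionary. The name mapping is introduced only for this example.

```python
import numpy as np
import pandas as pd

columns = ['1', '2']
data = np.array([[1, 2, 2], [1, 5, 3]])

# zip pairs each column name with one row of the array, giving
# a dict with keys '1' and '2' whose values are 1-D arrays.
mapping = dict(zip(columns, data))

# pandas reads a dict of equal-length 1-D arrays as
# {column name: column values}, so no transpose is needed.
df_1 = pd.DataFrame(mapping)
print(df_1)
```

One caveat with this approach: zip stops at the shorter input, so if columns and data have different lengths the extra names or rows are dropped silently instead of raising an error.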
