【338】Pandas.DataFrame

Ref: Pandas Tutorial: DataFrames in Python

Ref: pandas.DataFrame

Ref: Pandas: Basic operations on DataFrame objects

Ref: Creating, reading, and writing reference

pandas.DataFrame()
pandas.Series()
pandas.read_csv()
pandas.DataFrame.shape
pandas.DataFrame.head
pandas.read_excel()
pandas.DataFrame.to_csv()
pandas.DataFrame.to_excel()
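The functions listed above can be tried together in a short round trip. This is a minimal sketch: the column names and values are made up for illustration, and an in-memory StringIO buffer stands in for a file on disk. read_excel() and to_excel() follow the same pattern but need an Excel engine such as openpyxl installed, so they are not exercised here.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [90, 85, 70]})
s = pd.Series([1, 2, 3], name="n")

print(df.shape)    # (number of rows, number of columns)
print(df.head(2))  # the first two rows

# Write to CSV and read it back, using a buffer instead of a real file.
buf = StringIO()
df.to_csv(buf, index=False)  # index=False omits the row labels
buf.seek(0)
df2 = pd.read_csv(buf)
print(df2)
```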

Ref: Indexing, selecting, assigning reference

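pandas.DataFrame.iloc: selects by integer position, similar to the Cell function in Excel; think of the frame as a matrix. It is an indexer, so it is used with square brackets, e.g. df.iloc[0, 1].
pandas.DataFrame.loc: selects by row and column label, also with square brackets, e.g. df.loc['Row1', 'Col1'].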
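The difference between the two indexers is easiest to see side by side. A minimal sketch, reusing the Row1/Row2 and Col1/Col2 labels from the examples further down:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]],
                  index=["Row1", "Row2"],
                  columns=["Col1", "Col2"])

by_position = df.iloc[0, 1]         # row 0, column 1, counted from zero
by_label = df.loc["Row2", "Col1"]   # row labelled Row2, column labelled Col1
first_row = df.iloc[0]              # a whole row, returned as a Series
col_slice = df.loc[:, "Col2"]       # a whole column, returned as a Series

print(by_position, by_label)
print(first_row.tolist(), col_slice.tolist())
```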

1. Basic concepts

class pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)


Parameters:

data : the main body of the data; numpy ndarray (structured or homogeneous), dict, or DataFrame

Dict can contain Series, arrays, constants, or list-like objects

Changed in version 0.23.0: If data is a dict, argument order is maintained for Python 3.6 and later.

index : row labels, default 0, 1, 2, ..., n; Index or array-like

Index to use for resulting frame. Will default to RangeIndex if no indexing information part of input data and no index provided

columns : column labels, default 0, 1, 2, ..., n; Index or array-like

Column labels to use for resulting frame. Will default to RangeIndex (0, 1, 2, …, n) if no column labels are provided

dtype : data type; dtype, default None

Data type to force. Only a single dtype is allowed. If None, infer

copy : boolean, default False

Copy data from inputs. Only affects DataFrame / 2d ndarray input
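The parameters above can be illustrated with a small sketch. The labels "x", "y", "a" and "b" are arbitrary choices for the example, not anything prescribed by pandas.

```python
import numpy as np
import pandas as pd

# dict input: keys become column labels, in insertion order;
# index supplies the row labels and dtype forces a single data type.
df = pd.DataFrame({"b": [1, 2], "a": [3, 4]}, index=["x", "y"], dtype="float64")
print(df)

# No index or columns given: both default to 0, 1, 2, ...
arr = np.array([[1, 2], [3, 4]])
plain = pd.DataFrame(arr)
print(plain)

# copy=True gives the frame its own data, so later changes
# to the source array do not show up in the frame.
copied = pd.DataFrame(arr, copy=True)
arr[0, 0] = 99
print(copied.iloc[0, 0])
```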

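In the first example below, data[1:,0] is the first column without its header cell (used as the row labels), and data[0,1:] is the first row without its corner cell (used as the column labels). Because the array mixes strings and numbers, NumPy stores every element as a string.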

>>> import numpy as np
>>> import pandas as pd
>>> data = np.array([
...     ['', 'Col1', 'Col2'],
...     ['Row1', 1, 2],
...     ['Row2', 3, 4]
... ])
>>> print(pd.DataFrame(data=data[1:, 1:],
...                    index=data[1:, 0],
...                    columns=data[0, 1:]))
     Col1 Col2
Row1    1    2
Row2    3    4

>>> data = np.array([
...     [1, 2],
...     [3, 4]])
>>> print(pd.DataFrame(data=data,
...                    index=['Row1', 'Row2'],
...                    columns=['Col1', 'Col2']))
      Col1  Col2
Row1     1     2
Row2     3     4

Ref: pandas dataframe.apply(): processing a row/column to obtain a new row/column

Ref: Iterating over DataFrame rows in pandas

Ref: pandas.DataFrame.apply

2. Related methods:

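DataFrame.apply(func, axis=0, broadcast=None, raw=False, reduce=None, result_type=None, args=(), **kwds)

Apply a function along an axis of the DataFrame (similar to applying some function to a whole column or row of data in Excel). This is the signature as of pandas 0.23; the broadcast and reduce parameters were deprecated in that release and removed in pandas 1.0.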

Objects passed to the function are Series objects whose index is either the DataFrame's index (axis=0) or the DataFrame's columns (axis=1).
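A minimal sketch of both axes, using a small frame with the same labels as the earlier examples. The "Total" column name is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({"Col1": [1, 3], "Col2": [2, 4]}, index=["Row1", "Row2"])

# axis=0: the function receives one column at a time,
# as a Series indexed by the row labels.
col_sums = df.apply(lambda s: s.sum(), axis=0)
first_label_axis0 = df.apply(lambda s: s.index[0], axis=0)

# axis=1: the function receives one row at a time,
# as a Series indexed by the column labels.
row_sums = df.apply(lambda s: s.sum(), axis=1)
first_label_axis1 = df.apply(lambda s: s.index[0], axis=1)

# A row-wise result lines up with the index, so it can become a new column.
df["Total"] = row_sums
print(df)
```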

Ref: pandas.Series.value_counts

Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)

Returns object containing counts of unique values.

The resulting object will be in descending order so that the first element is the most frequently occurring element. Excludes NA values by default.
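A short sketch of the most commonly used parameters, on made-up data with one missing value:

```python
import pandas as pd

# Hypothetical data: three distinct values plus one missing entry.
s = pd.Series(["a", "b", "a", "c", "a", "b", None])

counts = s.value_counts()                 # most frequent first, NA excluded
shares = s.value_counts(normalize=True)   # relative frequencies instead of counts
with_na = s.value_counts(dropna=False)    # also counts the missing entry

print(counts)
print(shares)
print(with_na)
```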

pandas.read_csv(): a str can be wrapped in StringIO() to turn it into an in-memory file buffer, which can then be passed directly to this function.

>>> from io import StringIO
>>> a = '''
... A, B, C
... 1,2,3
... 4,5,6
... 7,8,9
... '''
>>> a
'\nA, B, C\n1,2,3\n4,5,6\n7,8,9\n'
>>> data = pd.read_csv(StringIO(a))
>>> data
   A   B   C
0  1   2   3
1  4   5   6
2  7   8   9
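One detail in the output above is easy to miss: the header line "A, B, C" has a space after each comma, so the second and third column names are ' B' and ' C' with a leading space. A minimal sketch of one way to avoid that, using the skipinitialspace parameter of read_csv:

```python
from io import StringIO

import pandas as pd

text = "\nA, B, C\n1,2,3\n4,5,6\n7,8,9\n"

# Default parsing keeps the space after each delimiter in the header.
# The leading blank line is skipped by default.
raw = pd.read_csv(StringIO(text))
print(list(raw.columns))

# skipinitialspace=True drops whitespace that follows a delimiter.
clean = pd.read_csv(StringIO(text), skipinitialspace=True)
print(list(clean.columns))
print(clean["B"].tolist())
```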
