Statsmodels statistics package: statistics in Python

Date: 2023-03-08 16:05:49

http://blog.****.net/pipisorry/article/details/52227580

Statsmodels

Statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation of statistical models.

statsmodels was originally named scikits.statsmodels and has since been renamed to statsmodels.

It also includes a module for time series analysis [Time Series analysis tsa]

Main features of statsmodels

    regression: Generalized least squares (including weighted least squares and least squares with autoregressive errors), ordinary least squares.
    glm: Generalized linear models with support for all of the one-parameter exponential family distributions.
    discrete choice models: Poisson, probit, logit, multinomial logit
    rlm: Robust linear models with support for several M-estimators.
    tsa: Time series analysis models, including ARMA, AR, VAR
    nonparametric: (Univariate) kernel density estimators
    datasets: Datasets to be distributed and used for examples and in testing.
    PyDTA: Tools for reading Stata .dta files into numpy arrays.
    stats: a wide range of statistical tests
    sandbox: There is also a sandbox which contains code for generalized additive models (untested), mixed effects models, cox proportional hazards model (both are untested and still dependent on the nipy formula framework), generating descriptive statistics, and printing table output to ascii, latex, and html. There is also experimental code for systems of equations regression, time series models, panel data estimators and information theoretic measures. None of this code is considered "production ready".

Installation

pip install statsmodels

Required dependencies:

Python >= 2.6, including Python 3.x

NumPy >= 1.5.1

SciPy >= 0.9.0

Pandas >= 0.7.1

Patsy >= 0.3.0

Cython >= 0.20.1: needed only if you build the code from GitHub rather than from a source distribution. Cython >= 0.20.1 is required on Python 3.4; earlier versions may work for Python < 3.4.
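
To compare an environment against the list above, one option is to ask the standard library which versions are installed. This is only a sketch: it assumes Python 3.8 or newer (where importlib.metadata is available), and the helper name installed_versions is made up for this example.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names as published on PyPI
report = installed_versions(["numpy", "scipy", "pandas", "patsy", "statsmodels"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```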

If the pip install fails, install from source:

git clone https://github.com/statsmodels/statsmodels.git

cd statsmodels

python setup.py install

Installation error inside a virtualenv

error: Command "x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/home/piting/ENV/ubuntu_env/lib/python3.4/site-packages/numpy/core/include -I/usr/include/python3.4m -I/home/piting/ENV/ubuntu_env/include/python3.4m -c statsmodels/nonparametric/_smoothers_lowess.c -o build/temp.linux-x86_64-3.4/statsmodels/nonparametric/_smoothers_lowess.o" failed with exit status 1

Fix: install the Python development headers with sudo apt-get install python3-dev (python-dev for Python 2)

python-dev: Header files and a static library for Python
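
A compile failure like the one above often means the compiler cannot find Python.h. The sketch below checks whether that header exists in the include directory the running interpreter reports. It is a rough diagnostic rather than a definitive test: the helper name python_headers_available is hypothetical, and some virtualenv layouts may report a different include path from the one gcc was given.

```python
import os
import sysconfig


def python_headers_available():
    """Return True if Python.h exists in this interpreter's include directory."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.isfile(os.path.join(include_dir, "Python.h"))


# Show the directory that was checked and whether the header was found
print(sysconfig.get_paths()["include"], python_headers_available())
```

If this prints False, installing python3-dev (or python-dev) as described above is the likely fix.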

[Installation]


Using statsmodels

Entropy computation module

For example, computing the Rényi entropy:

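import math
import numpy as np
from statsmodels.sandbox.infotheo import renyientropy
renyientropy(ij[np.nonzero(ij)] / sum(ij), alpha=q, logbase=math.e)  # assumes ij is an array of non-negative weights and q the entropy order, both defined by the caller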
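
For reference, the textbook Rényi entropy of order α (α > 0, α ≠ 1) is H_α(p) = log(Σ p_i^α) / (1 − α). The pure-Python sketch below implements that formula with the natural logarithm. It is not the statsmodels implementation, and the function name renyi_entropy and the example counts are made up for illustration.

```python
import math


def renyi_entropy(probs, alpha):
    """Rényi entropy (natural log) of a discrete distribution, alpha > 0 and alpha != 1."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("this sketch only covers alpha > 0 with alpha != 1")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")
    # Zero-probability outcomes add nothing to the sum, so they are skipped
    power_sum = sum(p ** alpha for p in probs if p > 0)
    return math.log(power_sum) / (1.0 - alpha)


# Hypothetical counts, normalised to probabilities after dropping zeros
counts = [4, 0, 2, 2]
total = sum(counts)
probs = [c / total for c in counts if c > 0]
print(renyi_entropy(probs, alpha=2))
```

For a uniform distribution over n outcomes the formula reduces to log(n) for every valid α, which makes a convenient sanity check.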

