Getting Started with xgboost: Installing and Using It on Windows (Part 2)

Date: 2022-09-09 07:28:51

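If you read the previous post and still could not get xgboost installed, feel free to ask me. This post walks through a test example for xgboost. The example is modeled on someone else's, but I have added quite a lot to it, and I hope it helps more people!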

import sys,os
sys.path.append('E:\\xgboost-master\\xgboost-master\\wrapper')

import numpy as np
import scipy.sparse
import xgboost as xgb

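# Load the training and test sets (LibSVM text format)
dtrain = xgb.DMatrix('E:\\my-train.txt')
dtest = xgb.DMatrix('E:\\my-test.txt')

# Booster parameters: tree depth 6, learning rate 0.3, binary classification
param = {'max_depth':6, 'eta':0.3, 'silent':1, 'objective':'binary:logistic'}

# Data sets whose error is reported after every boosting round
watchlist = [(dtest,'eval'), (dtrain,'train')]
num_round = 20
bst = xgb.train(param, dtrain, num_round, watchlist)

# Predict on the test set and compute the error rate at a 0.5 threshold
preds = bst.predict(dtest)
labels = dtest.get_label()
print('error=%f' % (sum(1 for i in range(len(preds)) if int(preds[i]>0.5)!=labels[i]) / float(len(preds))))
# Save the trained model so the prediction script can load it later
bst.save_model('C:\\xgb.model')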

I put xgboost on the E: drive and ran a simple test with two files:

my-train.txt

1 1:1 2:1 3:1 4:1
1 1:2 2:2 3:2 4:2
1 1:3 2:3 3:3 4:3
1 1:4 2:4 3:4 4:4
1 1:5 2:5 3:5 4:5
1 1:6 2:6 3:6 4:6
0 1:62 2:32 3:24 4:26
0 1:39 2:73 3:93 4:35
0 1:41 2:43 3:42 4:43
0 1:5 2:35 3:52 4:53
0 1:64 2:16 3:46 4:36

my-test.txt

1 1:5 2:5 3:5 4:5
1 1:6 2:6 3:6 4:6
0 1:62 2:32 3:24 4:26
0 1:39 2:73 3:93 4:35
0 1:41 2:43 3:42 4:43
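
Both files above are in the LibSVM text format: each line starts with the label, followed by `index:value` pairs for the features. As a rough illustration of how such a line breaks down, here is a minimal pure-Python sketch. The helper name `parse_libsvm_line` is made up for this example and is not part of xgboost; the sample text is simply the contents of my-test.txt. Counting one extra column for the unused index 0 is an assumption, though it appears consistent with the "5x5 matrix with 20 entries" message in the output below.

```python
def parse_libsvm_line(line):
    """Split one LibSVM-format line into (label, {feature_index: value})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        index, value = item.split(":")
        features[int(index)] = float(value)
    return label, features

# Contents of my-test.txt as listed above.
sample = """\
1 1:5 2:5 3:5 4:5
1 1:6 2:6 3:6 4:6
0 1:62 2:32 3:24 4:26
0 1:39 2:73 3:93 4:35
0 1:41 2:43 3:42 4:43
"""

rows = [parse_libsvm_line(l) for l in sample.splitlines() if l.strip()]
num_rows = len(rows)
num_entries = sum(len(f) for _, f in rows)
# Assumption: indices start at 1 here, so column 0 is counted but never filled.
num_cols = max(max(f) for _, f in rows) + 1
print(num_rows, num_cols, num_entries)  # prints: 5 5 20
```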

Output:

Error when loading sklearn/plotting. Please install scikit-learn
error=0.000000
11x5 matrix with 44 entries is loaded from E:\my-train.txt
5x5 matrix with 20 entries is loaded from E:\my-test.txt
[0] eval-error:0.000000 train-error:0.000000
[1] eval-error:0.000000 train-error:0.000000
[2] eval-error:0.000000 train-error:0.000000
[3] eval-error:0.000000 train-error:0.000000
[4] eval-error:0.000000 train-error:0.000000
[5] eval-error:0.000000 train-error:0.000000
[6] eval-error:0.000000 train-error:0.000000
[7] eval-error:0.000000 train-error:0.000000
[8] eval-error:0.000000 train-error:0.000000
[9] eval-error:0.000000 train-error:0.000000
[10] eval-error:0.000000 train-error:0.000000
[11] eval-error:0.000000 train-error:0.000000
[12] eval-error:0.000000 train-error:0.000000
[13] eval-error:0.000000 train-error:0.000000
[14] eval-error:0.000000 train-error:0.000000
[15] eval-error:0.000000 train-error:0.000000
[16] eval-error:0.000000 train-error:0.000000
[17] eval-error:0.000000 train-error:0.000000
[18] eval-error:0.000000 train-error:0.000000
[19] eval-error:0.000000 train-error:0.000000


Prediction code: load the saved model and the data first, then run the prediction.

#! /usr/bin/env python
#coding=utf-8
import sys,os
sys.path.append('E:\\xgboost-master\\xgboost-master\\wrapper')

import numpy as np
import scipy.sparse
import xgboost as xgb

dtest2 = xgb.DMatrix('E:\\my-test2.txt')

bst2 = xgb.Booster(model_file='C:\\xgb.model')
preds2 = bst2.predict(dtest2)
print(preds2)
# Write the predicted class (0.5 threshold) to a file
outing = open('C:\\Result.txt', 'w')
outing.write(str(int(preds2[0]>0.5)))  # only the first prediction is written
outing.close()
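
The script above writes only the class of the first row. If every row is wanted, one possible variation is to loop over all the predictions and write one class per line. This is just a sketch: `write_predictions`, its `threshold` parameter, and the temporary output path are names chosen for this example, and the `preds2` list is a stand-in that copies the values printed in the output of this post, so the snippet runs without xgboost.

```python
import os
import tempfile

# Stand-in for preds2; values copied from the output printed in this post.
preds2 = [0.26937994, 0.77472818, 0.26937994,
          0.26937994, 0.26937994, 0.26937994]

def write_predictions(preds, path, threshold=0.5):
    """Write one 0/1 class per line, thresholding each probability."""
    with open(path, "w") as out:
        for p in preds:
            out.write("%d\n" % int(p > threshold))

# Hypothetical output location; the original script writes to C:\Result.txt.
result_path = os.path.join(tempfile.gettempdir(), "Result.txt")
write_predictions(preds2, result_path)

# Read the file back to check what was written.
with open(result_path) as f:
    written = f.read().split()
print(written)  # prints: ['0', '1', '0', '0', '0', '0']
```

Using `with open(...)` closes the file even if a write fails, which is why the sketch prefers it over a separate `close()` call.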

my-test2.txt

1 1:15 2:15 3:15 4:15
1 1:6 2:6 3:6 4:6
1 1:16 2:16 3:16 4:16
0 1:62 2:32 3:24 4:26
0 1:39 2:73 3:93 4:35
0 1:411 2:43 3:42 4:43

Output:

[ 0.26937994  0.77472818  0.26937994  0.26937994  0.26937994  0.26937994]
6x5 matrix with 24 entries is loaded from E:\my-test2.txt
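
The prediction script does not compare these probabilities with the labels in my-test2.txt. As a quick check, the same error formula used in the training script can be applied by hand. In the sketch below the predictions are copied from the output above and the labels are the first column of my-test2.txt; `error_rate` is a helper name made up for this example.

```python
# Predictions copied from the output above.
preds2 = [0.26937994, 0.77472818, 0.26937994,
          0.26937994, 0.26937994, 0.26937994]
# Labels: the first column of my-test2.txt.
labels2 = [1, 1, 1, 0, 0, 0]

def error_rate(preds, labels, threshold=0.5):
    """Fraction of rows whose thresholded prediction differs from the label."""
    wrong = sum(1 for p, y in zip(preds, labels) if int(p > threshold) != y)
    return wrong / float(len(preds))

err = error_rate(preds2, labels2)
print('error=%f' % err)  # prints: error=0.333333
```

Two of the six rows are misclassified: the rows with feature values 15 and 16 are labeled 1 but predicted as 0. A likely reason is that the positive examples in my-train.txt only cover values from 1 to 6, so the model has never seen positives in that range.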