Supervised Learning Notes - Python


Supervised Learning Notes (4)

2017-12-24 06:07:04

ean_absolute_error(Y_test, y_test_pred_ridge), 2))
print('Mean squared error:', round(sm.mean_squared_error(Y_test, y_test_pred_ridge), 2))
print('Median absolute error:', round(sm.median_absolute_error(Y_test, y_test_pred_ridge), 2))
print('Explained variance score:', round(sm.explained_variance_score(Y_test, y_test_pred_ridge), 2))
print('R2 score:', round(sm.r2_score(Y_test, y_test_pred_ridge), 2))
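For reference, the scores printed above can be worked out by hand. The following is a minimal sketch with NumPy on made-up numbers: the arrays y_true and y_pred are illustrative assumptions, not data from this article, and the formulas follow the standard definitions of these metrics.

```python
import numpy as np

# Made-up targets and predictions, for illustration only
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 12.0])
residual = y_true - y_pred

mae = np.mean(np.abs(residual))              # mean absolute error
mse = np.mean(residual ** 2)                 # mean squared error
medae = np.median(np.abs(residual))          # median absolute error
evs = 1 - np.var(residual) / np.var(y_true)  # explained variance score
# R2 score: 1 - (residual sum of squares) / (total sum of squares)
r2 = 1 - np.sum(residual ** 2) / np.sum((y_true - y_true.mean()) ** 2)

print(mae, mse, medae, evs, r2)
```

In this toy case the R2 score (0.45) is lower than the explained variance score (0.5625) because the predictions are biased: the mean residual is not zero. The two scores coincide when the mean residual is zero.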

7. Building a polynomial regressor

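If the data follow a curve, a plain linear model cannot capture it, because linear regression on the raw features can only fit a straight line. Fitting a polynomial instead overcomes this and can improve accuracy. The cost is that fitting gets slower as the degree of the polynomial grows, and a very high degree can overfit the training data, so the degree has to be chosen as a trade-off.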

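# Build the polynomial regressor
# A higher degree fits the training data more closely, but it is slower and can overfit
quadratic_featurizer = preprocessing.PolynomialFeatures(degree=5)
X_train_quadratic = quadratic_featurizer.fit_transform(X_training)  # expand the inputs into polynomial terms
xx = np.linspace(-6, 4, 100)  # x values used to draw the fitted curve
regressor_quadratic = linear_model.LinearRegression()
regressor_quadratic.fit(X_train_quadratic, Y_training)
# Reuse the already fitted featurizer here: transform, not fit_transform
xx_quadratic = quadratic_featurizer.transform(xx.reshape(-1, 1))
yy_pre = regressor_quadratic.predict(xx_quadratic)  # predicted values along the curve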

plot.figure()
plot.scatter(X_training,Y_training,color='green')
plot.plot(xx,yy_pre , 'r-')
plot.title('Training data')
plot.show()
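To see why the degree has to be chosen with care, here is a small self-contained sketch on synthetic data. The cubic ground truth, the noise level, the random seed, the 40/20 split, and the candidate degrees are all assumptions made for illustration; only PolynomialFeatures and LinearRegression are taken from the listing above. The held-out points are expanded with transform, reusing the featurizer fitted on the training points.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import PolynomialFeatures

# Synthetic curved data (assumption): a cubic plus Gaussian noise
rng = np.random.RandomState(0)
x = rng.uniform(-3, 3, size=60)
y = 0.5 * x ** 3 - x + rng.normal(scale=1.0, size=x.shape)

X = x.reshape(-1, 1)
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]

train_mse = {}
test_mse = {}
for degree in (1, 3, 10):
    featurizer = PolynomialFeatures(degree=degree)
    model = LinearRegression()
    model.fit(featurizer.fit_transform(X_train), y_train)
    # transform (not fit_transform) reuses the fitted featurizer
    train_mse[degree] = mean_squared_error(y_train, model.predict(featurizer.transform(X_train)))
    test_mse[degree] = mean_squared_error(y_test, model.predict(featurizer.transform(X_test)))
    print(degree, round(train_mse[degree], 3), round(test_mse[degree], 3))
```

The training error can only go down as the degree grows, since each higher-degree model contains the lower-degree ones. The held-out error gives no such guarantee, which is why it is the better guide when picking the degree.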

8. Examples

1. Estimating house prices

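This example uses a decision tree regressor combined with the AdaBoost algorithm. A decision tree is a tree-shaped model in which every internal node makes a decision that affects the final result: branches represent intermediate decisions based on the input features, and leaf nodes hold the output values. AdaBoost stands for adaptive boosting, a technique that uses another model to improve accuracy. It combines the results of different versions of the algorithm, called weak learners, into a final result through weighted aggregation. The full listing for this example is given below.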
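To make the tree structure concrete first, here is a minimal sketch on toy data. The six data points and the feature name 'size' are assumptions for illustration only. With max_depth=1 the tree has a single decision node and two leaves, and export_text prints the learned rule.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

# Toy data (assumption): small inputs map to 1.0, large inputs to 5.0
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])

tree = DecisionTreeRegressor(max_depth=1, random_state=0)
tree.fit(X, y)

# Print the learned rules: one decision node (a threshold on 'size') and two leaves
print(export_text(tree, feature_names=['size']))

# Each prediction is the value stored in the leaf that the input falls into
pred_small, pred_large = tree.predict(np.array([[2.0], [11.0]]))
print(pred_small, pred_large)
```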

# Estimate house prices
import numpy as np
from sklearn import datasets
from sklearn.utils import shuffle
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import mean_squared_error,explained_variance_score
import matplotlib.pyplot as plt

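# 1. Load the standard Boston housing dataset through the scikit-learn loader
# (load_boston was removed in scikit-learn 1.2, so this call needs an older release)
housing_data=datasets.load_boston()
# 2. Split the data into X and y and shuffle it; random_state makes the shuffle reproducible
X,y=shuffle(housing_data.data,housing_data.target,random_state=7)
# 3. Use 80% of the data for training and 20% for testing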
num_training=int(0.8*len(X))
X_train,y_train=X[0:num_training],y[0:num_training]
X_test,y_test=X[num_training:],y[num_training:]
# 4. Fit a decision tree model, limiting the maximum depth to 4
dt_regressor=DecisionTreeRegressor(max_depth=4)
dt_regressor.fit(X_train,y_train)
# 5. Fit the same kind of decision tree boosted with AdaBoost
ab_regressor=AdaBoostRegressor(DecisionTreeRegressor(max_depth=4),n_estimators=400,random_state=4)
ab_regressor.fit(X_train,y_train)
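# 6. Evaluate the decision tree on the test set; lower mean squared error and higher explained variance are better
y_dt_pred=dt_regressor.predict(X_test)
print('Mean squared error:',mean_squared_error(y_test,y_dt_pred))
print('Explained variance:',explained_variance_score(y_test,y_dt_pred))
# 7. Evaluate the AdaBoost model on the test set in the same way
y_ab_pred=ab_regressor.predict(X_test)
print('Mean squared error:',mean_squared_error(y_test,y_ab_pred))
print('Explained variance:',explained_variance_score(y_test,y_ab_pred))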

def plot_feature_importances(feature_importances,title,feature_names):
    feature_importances=100*(feature_importances/max(feature_importances))
    index_sorted=np.flipud(np.argsort(feature_importances))  # argsort gives indices in ascending order; flipud reverses them
    pos=np.arange(index_sorted.shape[0])+0.5

    plt.figure()
    plt.bar(pos,feature_importances[index_sorted],align='center')
    plt.xticks(pos,feature_names[index_sorted])
    plt.title(title)
    plt.show()

plot_feature_importances(dt_regressor.feature_importances_,'dt',housing_data.feature_names)
plot_feature_impo
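Since load_boston is no longer available in recent scikit-learn releases, the same comparison can be tried on synthetic data. The sketch below is a stand-in under stated assumptions, not the experiment from this article: make_regression and its parameters, the sample count, and n_estimators=50 are chosen only for illustration, so the scores it prints say nothing about the housing data.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import explained_variance_score, mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the housing data (assumption, not the original dataset)
X, y = make_regression(n_samples=500, n_features=8, n_informative=4,
                       noise=10.0, random_state=7)
num_training = int(0.8 * len(X))
X_train, y_train = X[:num_training], y[:num_training]
X_test, y_test = X[num_training:], y[num_training:]

# A single depth-4 tree, and the same kind of tree boosted with AdaBoost
dt_regressor = DecisionTreeRegressor(max_depth=4, random_state=0)
dt_regressor.fit(X_train, y_train)
ab_regressor = AdaBoostRegressor(DecisionTreeRegressor(max_depth=4),
                                 n_estimators=50, random_state=4)
ab_regressor.fit(X_train, y_train)

for name, model in (('decision tree', dt_regressor), ('AdaBoost', ab_regressor)):
    y_pred = model.predict(X_test)
    print(name, mean_squared_error(y_test, y_pred),
          explained_variance_score(y_test, y_pred))

# Every fitted weak learner has its own weight in the ensemble
print(len(ab_regressor.estimators_), ab_regressor.estimator_weights_[:3])
# Normalized feature importances, the values the bar chart function would plot
print(ab_regressor.feature_importances_)
```

scikit-learn's AdaBoostRegressor implements the AdaBoost.R2 variant, which combines the weak learners' predictions through a weighted median, so the weights in estimator_weights_ decide how much each tree counts.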
