Grid search cross-validation in sklearn
Question:
Can grid search cross-validation be used to extract the best parameters for a Decision Tree classifier? http://scikit-learn.org/stable/tutorial/statistical_inference/model_selection.html
Answer:
Why not?
I suggest taking a look at GridSearchCV.
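import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score

# Search over tree depths 3 through 9 (np.arange excludes the upper bound)
param_grid = {'max_depth': np.arange(3, 10)}
tree = GridSearchCV(DecisionTreeClassifier(), param_grid)
# xtrain, ytrain, xtest, ytest are assumed to come from an existing train/test split
tree.fit(xtrain, ytrain)
# Predicted probability of the positive class (binary labels assumed)
tree_preds = tree.predict_proba(xtest)[:, 1]
tree_performance = roc_auc_score(ytest, tree_preds)
print('DecisionTree: Area under the ROC curve = {}'.format(tree_performance))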
And to extract the best parameters:
tree.best_params_
Out[1]: {'max_depth': 5}
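
The snippet above relies on xtrain, ytrain, xtest and ytest already being defined. As a rough, self-contained sketch of the same idea, the version below generates synthetic data with make_classification (an assumption, standing in for the real dataset) and searches over a second parameter, min_samples_leaf, which is not part of the original answer and is included only for illustration. The scoring='roc_auc' and cv=5 settings are also illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (assumption: stands in for the real dataset)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
xtrain, xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Grid of candidate values; min_samples_leaf is an illustrative addition
param_grid = {
    'max_depth': np.arange(3, 10),
    'min_samples_leaf': [1, 5, 10],
}

# 5-fold cross-validation on the training set, scored by ROC AUC
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring='roc_auc',
    cv=5,
)
search.fit(xtrain, ytrain)

best_params = search.best_params_      # best parameter combination found
best_cv_score = search.best_score_     # mean cross-validated ROC AUC for it
test_auc = search.score(xtest, ytest)  # ROC AUC of the refit model on held-out data

print('Best parameters:', best_params)
print('Best CV ROC AUC:', best_cv_score)
print('Test ROC AUC:', test_auc)
```

By default GridSearchCV refits the best combination on the full training set, so search can be used directly for prediction, and search.cv_results_ holds the scores for every combination tried.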