How do I use cross-validation with a custom estimator in sklearn?
I have written a custom estimator class with `fit` and `transform` methods. I can create a model, train it, and predict with it.
However, when I run cross-validation, I get this error: `TypeError: cannot deepcopy this pattern object`.
The custom estimator looks like this:
    import scipy.sparse as sp
    from sklearn.base import BaseEstimator, TransformerMixin


    class DefaultEstimator(BaseEstimator, TransformerMixin):
        def __init__(self, preprocessor, pipelines):
            self.preprocessor = preprocessor
            self.pipelines = pipelines

        def fit(self, X, y=None):
            # Fit every pipeline on the preprocessed data.
            for each_pipeline in self.pipelines:
                each_pipeline.fit(self.preprocessor.apply(X), y)
            return self

        def transform(self, X):
            # Stack the output of each pipeline side by side.
            transformed_data = []
            for each_pipeline in self.pipelines:
                transformed_data.append(each_pipeline.transform(self.preprocessor.apply(X)))
            return sp.hstack(transformed_data)
Does anyone have an idea of how to approach this issue?
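As suggested in a few comments, this error occurs because `self.preprocessor` can't be deep-copied. Cross-validation clones the estimator for each fold, and sklearn's `clone` deep-copies every constructor argument that is not itself an estimator.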
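To illustrate how cloning can fail, here is a minimal sketch. `UncopyablePreprocessor` and `Wrapper` are hypothetical stand-ins, not the classes from the question; the preprocessor refuses to be deep-copied on purpose, so that `clone` fails in a similar way to the original error.

```python
from sklearn.base import BaseEstimator, clone


class UncopyablePreprocessor:
    """Hypothetical stand-in for a preprocessor that cannot be deep-copied."""

    def __deepcopy__(self, memo):
        # Simulate an object (such as one holding an uncopyable resource)
        # that refuses to be deep-copied.
        raise TypeError("cannot deepcopy this object")

    def apply(self, X):
        return X


class Wrapper(BaseEstimator):
    """Hypothetical estimator that stores the preprocessor as a parameter."""

    def __init__(self, preprocessor):
        self.preprocessor = preprocessor

    def fit(self, X, y=None):
        return self


try:
    # clone() copies constructor parameters; non-estimator parameters
    # are passed through copy.deepcopy, which triggers the error here.
    clone(Wrapper(UncopyablePreprocessor()))
    clone_error = None
except TypeError as exc:
    clone_error = str(exc)

print(clone_error)
```

Because cross-validation helpers such as `cross_val_score` clone the estimator before fitting each fold, any parameter that fails to deep-copy will surface as an error at that point, even if plain `fit` and `transform` calls work.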
So the workaround is to remove the preprocessing step from this class and either run it as an independent preprocessing step or move it inside the pipeline itself.
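One possible way to move the preprocessing inside the pipeline is sketched below. It assumes the preprocessing can be written as a plain function; `strip_digits`, the sample documents, and the choice of `CountVectorizer` and `LogisticRegression` are all hypothetical and only serve the illustration. `FeatureUnion` stands in for the manual `sp.hstack` over several pipelines.

```python
import re

import numpy as np
from sklearn.base import clone
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer


def strip_digits(texts):
    # Hypothetical preprocessing: the regex is compiled inside the function,
    # so no compiled pattern object is stored on any estimator.
    pattern = re.compile(r"\d+")
    return [pattern.sub(" ", text) for text in texts]


# FeatureUnion concatenates the outputs of its transformers column-wise.
features = FeatureUnion([
    ("words", CountVectorizer()),
    ("chars", CountVectorizer(analyzer="char")),
])

model = Pipeline([
    ("preprocess", FunctionTransformer(strip_digits, validate=False)),
    ("features", features),
    ("classify", LogisticRegression()),
])

# Tiny made-up dataset, only to show that cloning and CV run end to end.
docs = ["good movie 10", "bad movie 1", "great film 9", "awful film 2",
        "good film 8", "bad film 3", "great movie 10", "awful movie 1"]
labels = np.array([1, 0, 1, 0, 1, 0, 1, 0])

cloned = clone(model)  # succeeds: every parameter can be copied
scores = cross_val_score(model, docs, labels, cv=2)
print(scores)
```

Since every step is now either an sklearn estimator or a plain function, `clone` has nothing it cannot copy, and the whole pipeline can be passed to the cross-validation helpers directly.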