[Python Learning] [scikit-learn] Pipeline error: fit_transform() takes 2 positional arguments but 3 were given

Date: 2021-12-04 04:44:07

I have recently been reading Hands-On Machine Learning with Scikit-Learn & TensorFlow. When I got to the part on pipelines, I followed the book and wrote code like this:

# Imputer lives in sklearn.preprocessing in the older scikit-learn releases the book targets.
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import Imputer, LabelBinarizer, StandardScaler

# DataFrameSelector and CombinedAttributesAdder are the custom transformers defined in the book.
num_attribs = list(housing_numerical)
cat_attribs = ["ocean_proximity"]

num_pipeline = Pipeline([
    ("selector", DataFrameSelector(num_attribs)),
    ("imputer", Imputer(strategy="median")),
    ("attribs_adder", CombinedAttributesAdder()),
    ("std_scaler", StandardScaler()),
])

cat_pipeline = Pipeline([
    ("selector", DataFrameSelector(cat_attribs)),
    ("label_binarizer", LabelBinarizer()),  # this step triggers the error
])

full_pipeline = FeatureUnion(transformer_list=[
    ("num_pipeline", num_pipeline),
    ("cat_pipeline", cat_pipeline),
])

However, it fails with the following error:

TypeError: fit_transform() takes 2 positional arguments but 3 were given

I suspected this was caused by a version update, and sure enough, I found the answer. It is quoted below:

The pipeline is assuming LabelBinarizer's fit_transform method is defined to take three positional arguments:

def fit_transform(self, x, y):
    ...rest of the code

while it is defined to take only two:

def fit_transform(self, x):
    ...rest of the code
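The mismatch described in the quote can be reproduced without scikit-learn. The sketch below is only an illustration: `NarrowTransformer`, `WideTransformer`, and `run_steps` are hypothetical stand-ins, not scikit-learn classes. They mimic a pipeline that always passes both `x` and `y` to each step, so a step that accepts only `x` receives three positional arguments (`self`, `x`, `y`) instead of two.

```python
class NarrowTransformer:
    """Hypothetical stand-in for a transformer whose fit_transform accepts only x."""

    def fit_transform(self, x):
        return x


class WideTransformer:
    """Hypothetical stand-in whose fit_transform also accepts the optional y."""

    def fit_transform(self, x, y=None):
        return x


def run_steps(steps, x, y=None):
    """Hypothetical mini-pipeline: hands both x and y to every step."""
    for step in steps:
        x = step.fit_transform(x, y)  # self + x + y = 3 positional arguments
    return x


# The narrow signature fails as soon as the mini-pipeline passes y along.
try:
    run_steps([NarrowTransformer()], [1, 2, 3])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# The wide signature accepts the extra argument and simply ignores it.
result = run_steps([WideTransformer()], [1, 2, 3])
print(result)
```

The exact wording of the message varies slightly between Python versions (newer ones prefix the class name), but the "takes 2 positional arguments but 3 were given" part is the same.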

So the fix is to write your own MyLabelBinarizer on top of LabelBinarizer, with fit and transform methods that accept three parameters: self, X, y=None.

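from sklearn.base import TransformerMixin  # provides fit_transform for free
from sklearn.preprocessing import LabelBinarizer

class MyLabelBinarizer(TransformerMixin):
    """Wraps LabelBinarizer so that fit and transform accept the extra y a Pipeline passes."""
    def __init__(self, *args, **kwargs):
        self.encoder = LabelBinarizer(*args, **kwargs)
    def fit(self, x, y=None):
        # y is accepted only for Pipeline compatibility and is ignored
        self.encoder.fit(x)
        return self
    def transform(self, x, y=None):
        return self.encoder.transform(x)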
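
As a quick check, the wrapper can be used as the last step of a Pipeline. This is a minimal sketch, assuming scikit-learn and NumPy are installed. The class is repeated so the snippet runs on its own, and the category values are made up for illustration rather than taken from the housing data.

```python
import numpy as np
from sklearn.base import TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelBinarizer


class MyLabelBinarizer(TransformerMixin):
    """Wraps LabelBinarizer so fit/transform accept the y that Pipeline passes."""

    def __init__(self, *args, **kwargs):
        self.encoder = LabelBinarizer(*args, **kwargs)

    def fit(self, x, y=None):
        self.encoder.fit(x)  # y is ignored
        return self

    def transform(self, x, y=None):
        return self.encoder.transform(x)


# Illustrative sample values (an assumption, not the real dataset).
categories = np.array(["INLAND", "NEAR BAY", "INLAND", "ISLAND"])

cat_pipeline = Pipeline([
    ("label_binarizer", MyLabelBinarizer()),
])

# Pipeline calls fit_transform(X, y) on its last step; the wrapper tolerates y.
one_hot = cat_pipeline.fit_transform(categories)
print(one_hot.shape)
print(one_hot)
```

Each row of `one_hot` has a single 1 in the column of its category, with columns ordered by the sorted class labels stored in `encoder.classes_`.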