Custom TransformerMixin With FeatureUnion In Scikit-learn
I am writing custom transformers in scikit-learn to perform specific operations on an array. To do that, I inherit from the TransformerMixin class. It works fine when I deal on
Solution 1:
FeatureUnion will just concatenate whatever it gets from its internal transformers. In your internal transformers, you are returning the same columns from each one. It is up to each transformer to pass the correct data forward.
I would advise you to return only the new data from the internal transformers, and then concatenate the remaining columns either outside or inside the FeatureUnion.
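To make the duplication concrete, here is a minimal NumPy-only sketch. For dense outputs, FeatureUnion stacks the transformer results side by side, which `np.hstack` imitates here. The helper names `with_original` and `only_new` and the numeric input are hypothetical, chosen for illustration; the first helper assumes the question's transformers returned the input plus a new column.

```python
import numpy as np

# Made-up numeric input with two original columns.
X = np.array([[10, 20], [30, 40], [50, 60]])


def with_original(X, value):
    # Hypothetical transformer output: the input plus one new constant column.
    return np.hstack([X, np.full((X.shape[0], 1), value)])


def only_new(X, value):
    # Hypothetical transformer output: only the new constant column.
    return np.full((X.shape[0], 1), value)


# Each output already carries the original columns, so they appear twice.
duplicated = np.hstack([with_original(X, 1), with_original(X, 2)])

# Original columns included once, followed by the two new columns.
clean = np.hstack([X, only_new(X, 1), only_new(X, 2)])

print(duplicated.shape)  # (3, 6)
print(clean.shape)       # (3, 4)
```

The second layout is the one the advice above aims for: the original data contributes once, and every other transformer adds only what is new.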
Look at this example if you haven't already:
You can do this, for example:
import numpy as np
from sklearn.base import TransformerMixin

# This doesn't do anything; it just passes the data through as it is
class DataPasser(TransformerMixin):
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X

# Your transformer
class DummyTransformer(TransformerMixin):
    def __init__(self, value=None):
        TransformerMixin.__init__(self)
        self.value = value

    def fit(self, *_):
        return self

    # Changed this to only return the new column after some operation on X
    def transform(self, X):
        s = np.full(X.shape[0], self.value)
        return s.reshape(-1, 1)
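The `reshape(-1, 1)` at the end matters because the new values must form a two-dimensional column before they can be stacked next to the other outputs. A small sketch of the difference, using `np.hstack` directly and a made-up numeric input:

```python
import numpy as np

X = np.array([[10, 20], [30, 40], [50, 60]])  # made-up input, shape (3, 2)

flat = np.full(X.shape[0], 7)  # one-dimensional, shape (3,)
column = flat.reshape(-1, 1)   # two-dimensional column, shape (3, 1)

# A (3, 1) column stacks cleanly next to a (3, 2) array.
stacked = np.hstack([X, column])
print(stacked.shape)  # (3, 3)

# A one-dimensional array cannot be stacked next to a two-dimensional one.
try:
    np.hstack([X, flat])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```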
After that, further down in your code, change this:
stages = []
# Append our DataPasser here, so the original data is at the beginning
stages.append(('no_change', DataPasser()))
for i in range(2):
    transfo = DummyTransformer(value=i+1)
    stages.append(('step'+str(i+1), transfo))
pipeunion = FeatureUnion(stages)
Running this new code gives the following result:
('Given result of the Feature union pipeline: \n',
array([['foo', 'a', '1', '2'],
['bar', 'b', '1', '2'],
['baz', 'c', '1', '2']], dtype='|S21'), '\n')
('Expected result of the Feature Union pipeline: \n',
array([['foo', 'a', '1', '2'],
['bar', 'b', '1', '2'],
['baz', 'c', '1', '2']], dtype='|S21'), '\n')
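For reference, the pieces above can be assembled into one self-contained script along these lines. This is a sketch under a few assumptions: the input array is inferred from the output shown above, since the question's own data is not reproduced here, and both classes additionally inherit from `BaseEstimator`. That inheritance is not part of the original answer; it is added because recent scikit-learn versions generally expect it of custom estimators.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import FeatureUnion


class DataPasser(BaseEstimator, TransformerMixin):
    """Return the input unchanged so the original columns come first."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class DummyTransformer(BaseEstimator, TransformerMixin):
    """Return a single new column filled with `value`."""

    def __init__(self, value=None):
        self.value = value

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Build the (n_samples, 1) column directly instead of reshaping.
        return np.full((X.shape[0], 1), self.value)


# Assumed input, inferred from the output printed in the answer.
X = np.array([["foo", "a"], ["bar", "b"], ["baz", "c"]])

# The pass-through stage goes first so the original data leads the result.
stages = [("no_change", DataPasser())]
for i in range(2):
    stages.append(("step" + str(i + 1), DummyTransformer(value=i + 1)))

pipeunion = FeatureUnion(stages)
result = pipeunion.fit_transform(X)
print(result)
```

Because the input holds strings, NumPy is expected to convert the numeric columns to strings when the outputs are stacked, which matches the quoted values in the output above.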