This reviewer's identity has been verified by our review moderation team. They have asked not to show their name, job title, or picture.
sklearn provides a consistent interface, and the documentation is thorough. It is also highly extensible. Review collected by and hosted on G2.com.
I would prefer that cross_val_score provide a mechanism for out-of-sample evaluation. Assuming your sample is rebalanced, you may want the nth fold used for evaluation to be an unbalanced, out-of-sample dataset so as to measure the true performance of your model in the wild. cross_val_score does not provide this functionality. The Pipeline class should also provide a mechanism to chain many transformations and allow a grid search for the best parameters across all of them. This is particularly useful in an NLP pipeline, where stemming, removing stop words, n-gramming, etc. could each be a separate transformation and you want to know which transformations and parameters (e.g. the n in n-gram) produced the best result.
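To make the evaluation scenario above concrete, here is a minimal sketch of a manual cross-validation loop that rebalances only the training folds and scores on the untouched, still-unbalanced held-out fold. The synthetic dataset, the choice of LogisticRegression and F1, and the helper downsample_majority are all assumptions made for illustration; none of them comes from the review or is a scikit-learn built-in.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold


def downsample_majority(X, y, rng):
    """Hypothetical helper: keep an equal number of rows from every class."""
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_keep, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]


# Synthetic, imbalanced data (roughly 90% / 10%); a stand-in for real data.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

rng = np.random.default_rng(0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in cv.split(X, y):
    # Rebalance the training folds only.
    X_train, y_train = downsample_majority(X[train_idx], y[train_idx], rng)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    # Score on the held-out fold, which keeps the original class mix.
    scores.append(f1_score(y[test_idx], model.predict(X[test_idx])))

print(np.mean(scores))
```

Writing the loop by hand keeps the resampling step out of the evaluation fold, which is the behaviour the paragraph above asks for.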
Validated through LinkedIn
Invitation from G2. This reviewer was offered a nominal gift card as thank you for completing this review.





