Kabelsalat

http://philmassie.github.com

PU Learning

Positive/unknown class machine learning approaches

A challenge that keeps presenting itself at work is one of not having a labelled negative class in the context of needing to train a binary classifier. Typically, the issue is paired with horribly imbalanced data sets and pressed for time, I have often taken the simplistic route of sub-sampling the unknown set and treating them as unknowns. Obviously this isn’t ideal as the unknown set is contaminated and as a result the classifiers dont train that well.