machine learning - How different does each output unit have to be?
I'll use the example of classifying pumpkins. Take the example of a Cinderella pumpkin versus a gourd pumpkin.
Intuitively, it may seem wise to classify these images as two different outputs, cinderella-pumpkin and gourd-pumpkin, due to how different they look.
My question is: if I take a training set of images that includes both Cinderella pumpkins and gourd pumpkins, and classify both of them under the single category pumpkin, will the performance of the network be worse than if I had instead separated them into two categories? Is there a threshold for when two objects are different enough that they should be put into separate categories?
Or, to take a more extreme example for the sake of clarity: if I took pictures of cats and pictures of pineapples and classified them under the same category, how would the network's ability to classify each respective object be affected, compared with creating one cat output and one pineapple output?
It depends on the inherent similarity of the training observations. I don't set a threshold: I use power iteration clustering (or another unsupervised classification method) to guide me on whether there are significant divisions in the training data. K-means is a popular choice, since it's widely implemented and relatively easy to comprehend.
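As a rough illustration of that check, here is a minimal sketch using plain k-means (not power iteration clustering) in standard-library Python. Everything in it is an assumption made for illustration: the 2-D points stand in for image feature vectors, the two synthetic groups stand in for Cinderella and gourd images, and the farthest-first initialisation is just one deterministic way to seed the centroids. The idea is to compare the within-cluster spread for one cluster against two; a large drop hints that the single "pumpkin" label covers two distinct groups.

```python
# Sketch: does a single "pumpkin" class hide a significant division?
# All data below is synthetic; real inputs would be image feature vectors.
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def farthest_first(points, k):
    """Deterministic init: start at points[0], then repeatedly add the
    point farthest from the centroids chosen so far."""
    chosen = [points[0]]
    while len(chosen) < k:
        chosen.append(max(points, key=lambda p: min(sq_dist(p, c) for c in chosen)))
    return chosen

def kmeans(points, k, max_iters=100):
    """Plain Lloyd's algorithm; returns (centroids, clusters)."""
    centroids = farthest_first(points, k)
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        updated = [centroid(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:  # assignments are stable, so stop
            break
        centroids = updated
    return centroids, clusters

def inertia(centroids, clusters):
    """Total squared distance from each point to its cluster centroid."""
    return sum(sq_dist(p, c) for c, cl in zip(centroids, clusters) for p in cl)

# Hypothetical features: two well-separated groups, both labelled "pumpkin".
rng = random.Random(0)
cinderella = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20)]
gourd = [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(20)]
pumpkins = cinderella + gourd

one_centroids, one_clusters = kmeans(pumpkins, 1)
two_centroids, two_clusters = kmeans(pumpkins, 2)

# Ratio well below 1 means two clusters explain the data much better than one.
ratio = inertia(two_centroids, two_clusters) / inertia(one_centroids, one_clusters)
cluster_sizes = sorted(len(c) for c in two_clusters)
print("cluster sizes:", cluster_sizes)
print("inertia ratio (k=2 / k=1):", round(ratio, 3))
```

On real images the features would come from whatever preprocessing or embedding step is already in use, and the split is rarely this clean, so the ratio is a guide rather than a hard threshold.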
The other consideration is the similarity of "non-pumpkin" data, such as a basketball (compared with the Cinderella). Again, I take an unsupervised learning approach. In this case, I'd expect the basketball to plot closer to the Cinderella than either does to the gourd. That suggests separate classes for the pumpkin types, or perhaps more feature detection in the image processing, to find similarities across pumpkin varieties.
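To make the basketball comparison concrete, here is a small sketch that compares class centroids in feature space. The feature values are invented for illustration (a hypothetical "roundness" and "orangeness" score per image), as are the variable names; only the shape of the check matters. If a non-pumpkin class sits closer to one pumpkin variety than the two varieties sit to each other, a single pumpkin class is likely to be hard to separate from it.

```python
# Sketch: compare class centroids to see which classes sit close together.
# Feature values are made up: (roundness, orangeness) scores in [0, 1].
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

features = {
    "cinderella": [(0.80, 0.90), (0.85, 0.85), (0.75, 0.95)],
    "gourd":      [(0.30, 0.40), (0.25, 0.35), (0.35, 0.30)],
    "basketball": [(1.00, 0.80), (0.95, 0.85), (1.00, 0.75)],
}

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

centroids = {name: centroid(pts) for name, pts in features.items()}

# Pairwise centroid distances, keyed by alphabetically ordered class pairs.
distances = {
    (a, b): dist(centroids[a], centroids[b])
    for a, b in combinations(sorted(centroids), 2)
}

closest = min(distances, key=distances.get)
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(d, 3))
print("closest pair:", closest)
```

With these invented numbers the basketball and the Cinderella form the closest pair, which is the situation described above: merging both pumpkin varieties into one class would put a non-pumpkin object inside that class's spread.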
Does that help?