Links: ppt
Words that appear next to images are correlated. We use this information to train correlated word predictors, and discover scene types.
Predictions
It was there and we didn't predict it
Scene Types
A scene is a collection of objects that are likely to be seen together. Knowing the scene helps us know what could be in the picture. But what kinds of scenes are there? (kitchens, mountains, then what?) Clustering on latent variables used for word prediction tells us automatically, without supervised data.
Each row is a different type of scene