Abstract: Models of vision have come far in the past 10 years. Deep neural networks can recognise objects with near-human accuracy and predict brain activity in high-level visual regions. However, most networks require supervised training using ground-truth labels for millions of images, whereas brains must somehow learn from sensory experience alone. We have been using




