What are the best synthetic models for neurogenesis in highly complex continuous learning settings?
Research Unit 3, SCIoI Project 45
Most current supervised deep learning approaches require a predefined neural network topology. A data scientist must therefore commit up front to the size and capacity of the network, for instance by specifying the number of layers and the number of neurons in each layer. The principal disadvantage is that the network size does not follow directly from the problem to be solved. This has two consequences: (1) human data scientists must guess the complexity of a problem and spend much time manually tuning network parameters, and (2) such fixed networks have been shown to be prone to catastrophic forgetting in continuous learning (lifelong learning) scenarios, in which the same network is trained on several tasks in sequence and later tasks overwrite the knowledge acquired from earlier tasks.
Looking to nature, we find a very different situation: the animal brain does not have a fixed size throughout its lifespan. Especially in early developmental stages, the brain undergoes a process termed neurogenesis, in which new neurons are progressively added to the growing brain. This is not limited to early stages: even in adult brains, neurogenesis is believed to take place and to contribute to the formation of new memories, although its extent in adults is still a matter of investigation. Most importantly, newly formed neurons are hypothesized to have higher plasticity than older neurons. This means that older neurons are more stable over time, preserving existing knowledge, while newer ones are more changeable and better at absorbing new information. Neurogenesis is thus a crucial component in addressing the “stability-plasticity dilemma”, allowing us to learn new information while also preserving long-term knowledge.
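To make the hypothesis above concrete, the following is a minimal sketch of how age-dependent plasticity could be expressed in an artificial layer that grows over time. It is an illustration only, not the project's model: the class `GrowingLayer`, the parameters `base_lr` and `decay`, and the specific rule that a neuron's learning rate shrinks as `base_lr / (1 + decay * age)` are all assumptions chosen for simplicity.

```python
import numpy as np


class GrowingLayer:
    """Hypothetical hidden layer that can add neurons at any time.

    Each neuron has its own learning rate, which decreases with the
    number of updates it has received (its "age").
    """

    def __init__(self, n_in, n_hidden, base_lr=0.1, decay=0.5, seed=0):
        self.rng = np.random.default_rng(seed)
        # One weight row per neuron.
        self.W = self.rng.normal(0.0, 0.1, size=(n_hidden, n_in))
        self.age = np.zeros(n_hidden)
        self.base_lr = base_lr
        self.decay = decay

    def add_neurons(self, k):
        """Append k freshly initialised neurons with age 0."""
        new_W = self.rng.normal(0.0, 0.1, size=(k, self.W.shape[1]))
        self.W = np.vstack([self.W, new_W])
        self.age = np.concatenate([self.age, np.zeros(k)])

    def plasticity(self):
        """Per-neuron learning rate: young neurons change faster."""
        return self.base_lr / (1.0 + self.decay * self.age)

    def forward(self, x):
        return np.tanh(self.W @ x)

    def update(self, grad_W):
        """Gradient step scaled row-wise by plasticity, then age all neurons."""
        self.W -= self.plasticity()[:, None] * grad_W
        self.age += 1.0


# Usage: train the initial neurons for a while, then grow the layer.
layer = GrowingLayer(n_in=3, n_hidden=2)
for _ in range(10):
    layer.update(np.ones_like(layer.W))  # placeholder gradient
layer.add_neurons(1)
rates = layer.plasticity()  # the new neuron has the largest rate
```

In this sketch the older neurons keep changing, only more slowly, so earlier knowledge is protected rather than frozen. When and how many neurons to add is left open here; it is exactly the kind of design question the hypothesis raises.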
In this project, we propose to derive and test synthetic models for neurogenesis in highly complex continuous learning settings. Our main goals are to derive models that (1) reduce the need to manually predefine network topologies, (2) are better at learning continuously in scenarios in which new types of information may be encountered at any time, and (3) are able to restructure themselves in a modular and adaptive way.