Artificial intelligence systems learn very differently than people do. Humans update their knowledge continuously as they encounter new information, but an AI system learns only during its training phase. After that, its knowledge is frozen; it cannot keep learning, and teaching it something new means training it all over again. If such a system met a new person, the only way it could learn her name would be to be retrained. Retraining to take in new knowledge, however, can trigger a failure known as catastrophic forgetting, in which a machine incorporates the new knowledge at the cost of forgetting nearly everything it had already learned. The problem arises from the way neural networks learn: learning means changing the strengths of the connections between their artificial neurons, and this process is tricky. Change the connections too much and the network forgets what they previously encoded. Biological neural networks evolved over hundreds of millions of years to keep important information stable while still absorbing new experiences. Artificial neural networks struggle to strike that balance between new and old knowledge; their connections are too easily overwritten when the network trains on new data, which can produce a sudden and severe failure to recognize past information. Christopher Kanan, a computer scientist at the University of Rochester, has helped establish a field of artificial intelligence research called continual learning, whose goal is to build systems that keep learning new things from continuous streams of data without forgetting what came before.
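To make the failure mode concrete, here is a tiny, self-contained sketch in PyTorch, a generic illustration rather than anything from Kanan's work: a small classifier masters one task, then trains on a second task with no access to the first, and because nothing protects the old connection strengths, its accuracy on the first task collapses.

```python
# Toy demonstration of catastrophic forgetting (illustrative only): a small
# classifier learns Task A, then trains only on Task B, and loses nearly all
# of its Task A accuracy.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(center0, center1, label0, label1, n=500):
    """Two Gaussian blobs with the given class labels."""
    x = torch.cat([torch.randn(n, 2) + torch.tensor(center0),
                   torch.randn(n, 2) + torch.tensor(center1)])
    y = torch.cat([torch.full((n,), label0), torch.full((n,), label1)])
    return x, y

def train(model, x, y, epochs=300):
    opt = torch.optim.Adam(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 4))
task_a = make_task((0.0, 0.0), (0.0, 4.0), 0, 1)   # classes 0 and 1
task_b = make_task((8.0, 0.0), (8.0, 4.0), 2, 3)   # classes 2 and 3

train(model, *task_a)
print("Task A accuracy after learning A:", accuracy(model, *task_a))

train(model, *task_b)   # no replay of Task A while learning Task B
print("Task A accuracy after learning B:", accuracy(model, *task_a))
print("Task B accuracy after learning B:", accuracy(model, *task_b))
```

Replay and the other continual learning techniques discussed below are attempts to prevent exactly this kind of collapse.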
Kanan has been interested in machine intelligence for a long time. As a kid he taught bots to play early multiplayer computer games, which got him wondering about the possibility of a machine that could think like a human. To better understand how minds work, he majored in philosophy and computer science at Oklahoma State University. Today Kanan finds inspiration both in video games and in watching his daughter learn about the world, with each new experience building on the last. Thanks to his work and that of others, catastrophic forgetting is no longer as severe as it used to be. The conversation below touches on machine memories, breaking the rules of training neural networks, and whether artificial intelligence will ever achieve human-level learning. The interview has been edited for clarity.

How does your training in philosophy impact the way you think about your work?

I have found it very useful as an academic. Philosophy teaches you how to make reasoned arguments and how to analyze the arguments of others, and that is a lot of what you do in science. I still have essays from that period on the failures of the Turing test and related questions, and I keep thinking about those things. I keep coming back to questions like: if we can't do X, how are we going to do Y? Neural networks, as they stand, don't learn over time. You train them once, and after that the model is a fixed entity. That is a fundamental problem you would have to solve if you wanted to build artificial general intelligence one day. If a system can't keep learning on its own, you're not going to get there. To me, that is a prerequisite capability.
How have researchers dealt with catastrophic forgetting so far?

The most successful approach, called replay, stores past experiences and then replays them during training alongside new examples so they are not lost. It is inspired by memory consolidation in the brain, where during sleep the high-level encodings of the day's activities are "replayed" as the neurons reactivate. Because past experiences are mixed back in, new learning can't completely eradicate past learning. There are different ways to do this. The most common style is "veridical replay," where researchers store a subset of the raw inputs, such as the original images, and mix those stored examples from the past in with the new images to be learned. A second approach replays compressed representations of the images rather than the raw inputs. A less common method is generative replay, in which an artificial neural network creates a synthetic version of a previous experience and mixes it with the new examples. My lab has focused on the latter two methods.

Unfortunately, replay isn't a very satisfying solution. To use it, the network has to keep some information about every concept it has learned in the past. From a neuroscience perspective, the hypothesis is that you and I replay a relatively recent experience to prevent forgetting it; that is not how we do it in deep learning. The network doesn't have to store everything it has seen, but it does have to store something about everything it has ever learned, and it is not clear what it should hold. So replay, as it is done today, still doesn't seem like it is all the way there.
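As a rough illustration of the veridical replay idea Kanan describes, here is a minimal sketch, assuming a generic PyTorch classifier; the buffer size, the reservoir sampling policy, and names such as ReplayBuffer and train_step are illustrative choices, not his lab's actual code.

```python
# Sketch of veridical replay: keep a buffer of raw past examples and mix a
# few of them into every new batch so old concepts keep receiving gradient.
import random
import torch
import torch.nn as nn

class ReplayBuffer:
    """Reservoir-style buffer holding raw (input, label) pairs."""
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, xs, ys):
        for x, y in zip(xs, ys):
            self.seen += 1
            if len(self.data) < self.capacity:
                self.data.append((x, y))
            else:
                # Reservoir sampling keeps a uniform sample of the stream.
                j = random.randrange(self.seen)
                if j < self.capacity:
                    self.data[j] = (x, y)

    def sample(self, k):
        batch = random.sample(self.data, min(k, len(self.data)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)

def train_step(model, opt, loss_fn, x_new, y_new, buffer, replay_k=32):
    """One update on new data, with replayed past examples mixed in."""
    if buffer.data:
        x_old, y_old = buffer.sample(replay_k)
        x, y = torch.cat([x_new, x_old]), torch.cat([y_new, y_old])
    else:
        x, y = x_new, y_new
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    buffer.add(x_new, y_new)   # remember some of what was just seen
    return loss.item()
```

Generative replay would follow the same loop, except that the buffer of stored inputs would be replaced by samples drawn from a generative model trained alongside the classifier, and replaying compressed representations would store intermediate-layer features instead of raw images.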
If we could completely solve catastrophic forgetting, would that mean AI could learn new things continuously over time?

Not quite, because the big open questions in continual learning are not really about catastrophic forgetting. What I want to know is how past learning makes future learning more efficient, and how learning something in the future corrects the mistakes of the past. Those are things that not a lot of people are measuring, and I think measuring them is important for pushing the field forward, because it is not just about not forgetting stuff; the goal is to become a better learner. This is where I think the field is missing the forest for the trees. A lot of the community is setting up the problem in ways that don't match real applications. We can't have everyone playing with the same toy problems forever. You have to ask what our gauntlet task is, and how we move forward.
Then why do you think most people are focusing on those simple problems?

I can't say for certain. Most of the work is done by students, and they copy the setup of what others have done. Building new algorithms for those existing setups is more likely to lead to a publication, even if the algorithms aren't enabling us to make significant progress in learning continually. Large companies, which don't have the same incentives, end up producing the same type of work too, apart from intern-driven projects.
This work isn't trivial, either. To measure whether past learning helps future learning, you have to set up the right experiments, and we don't have good data sets for studying continual learning. Mostly we take existing data sets used in conventional machine learning and repurpose them. In conventional machine learning you have a training set and a test set: you train on one and evaluate on the other. Continual learning breaks those rules, because as you learn, your training set itself keeps changing. For now we are limited to existing data sets, and that needs to change. We need a good environment in which an agent can keep learning.
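To make concrete what repurposing an existing data set usually looks like in practice, here is a small sketch of the common pattern of splitting an ordinary labeled data set into a sequence of class-incremental tasks; this is a generic illustration with made-up names, not a benchmark from Kanan's lab.

```python
# Sketch: turn a standard labeled data set into a class-incremental stream.
# Each "task" introduces a new group of classes; a continual learner sees
# the tasks in order, so its training set keeps changing over time.
from collections import defaultdict

def make_class_incremental_tasks(examples, classes_per_task=2):
    """examples: list of (x, label) pairs with integer labels."""
    by_class = defaultdict(list)
    for x, y in examples:
        by_class[y].append((x, y))

    labels = sorted(by_class)
    tasks = []
    for i in range(0, len(labels), classes_per_task):
        group = labels[i:i + classes_per_task]
        tasks.append([pair for c in group for pair in by_class[c]])
    return tasks

# Toy usage: 6 classes become 3 tasks that arrive one after another.
toy = [([float(c), float(c) + 1.0], c) for c in range(6) for _ in range(10)]
for t, task in enumerate(make_class_incremental_tasks(toy)):
    classes = sorted({y for _, y in task})
    print(f"task {t}: {len(task)} examples, classes {classes}")
```

Even these splits are adaptations of static data sets; the environment Kanan is asking for would supply a genuinely evolving stream.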
What would the ideal continual learning environment look like?

It is easier to say what it isn't than what it is. I was on a panel where we identified this as a critical problem, but I don't think anyone has the answer right away. I can tell you some properties it ought to have. For now, let's assume the AI is not an embodied agent in a simulation. Then, at the very least, we hope to learn from videos or multimodal video streams and to do more than just classification of static images. There are many open questions here. I was in a continual learning workshop a few years ago and some people like me were saying, "We've got to stop using MNIST, it's too simple." Someone said, "All right, let's do incremental learning of StarCraft." For various reasons I am doing that as well, but I don't think it really gets at the problem. Life isn't about learning to play StarCraft better.

Your lab also focuses on training algorithms to learn continuously from one example at a time, or from very small sets of examples. How does that help?

With Tyler, I created a continual learning task built around transfer: skills acquired earlier have to be reused to solve progressively more complex problems, so how well you learned things in the past shapes how well you learn in the future. We wanted good evidence of transfer, not just object recognition. Beyond that, a lot of continual learning setups still use large batches of examples. They essentially tell the program, "Here are 100,000 things; learn them. Now here are the next 100,000 things; learn them." That doesn't match the real-world case I care about, which is, "Here's one new thing; learn it. Here's another new thing; learn it." I think that is an avenue where we can make real progress in this field.
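The contrast between "here are 100,000 things" and "here's one new thing" corresponds roughly to the difference between batch training and streaming updates. Below is a minimal sketch of a streaming learner that updates from one labeled example at a time; the PyTorch model, the StreamingLearner class, and the hyperparameters are illustrative assumptions, not the protocol used in Kanan's lab.

```python
# Sketch of streaming, one-example-at-a-time learning, in contrast to
# training on huge batches handed over all at once.
import torch
import torch.nn as nn

class StreamingLearner:
    """Updates immediately from each new labeled example it is shown."""
    def __init__(self, model, lr=0.01):
        self.model = model
        self.opt = torch.optim.SGD(model.parameters(), lr=lr)
        self.loss_fn = nn.CrossEntropyLoss()

    def learn_one(self, x, y):
        """x: (1, n_features) float tensor, y: (1,) long tensor."""
        self.opt.zero_grad()
        loss = self.loss_fn(self.model(x), y)
        loss.backward()
        self.opt.step()
        return loss.item()

    def predict(self, x):
        with torch.no_grad():
            return self.model(x).argmax(dim=1)

# Usage: examples arrive one at a time; the learner is always up to date
# and can be queried at any point in the middle of the stream.
learner = StreamingLearner(nn.Linear(4, 3))
stream = [(torch.randn(1, 4), torch.randint(0, 3, (1,))) for _ in range(100)]
for x, y in stream:
    learner.learn_one(x, y)
print(learner.predict(torch.randn(5, 4)))
```

On its own, a naive streaming learner like this is exactly the kind of system that suffers catastrophic forgetting, so in practice it would be combined with replay or other safeguards like those discussed earlier.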
If we want AI to learn more like us, should we also aim to replicate how humans learn different things at different ages, always refining our knowledge?

People tell me I am just obsessed with development now that I have a child, but I can see that my daughter is capable of one-shot learning: she sees me do something once and can copy it immediately. Machine learning algorithms can't do anything like that today. It really opened my eyes. There has to be a lot more going on in our heads than in our modern neural networks. That is why I think the field needs to move toward this idea of learning over time.
Do you think AI will ever really learn the same way humans do?

I believe they will. It definitely looks more promising now, because there are so many people working in the field, but we need more new ideas; a lot of the culture in the machine learning community is following the leader. When we figure out the right architectures and the right algorithms for learning over time, AI systems will have much more of our capabilities than they have today. There isn't an argument that says it's impossible.