Artificial intelligence (AI) chatbots are terrible at remembering things — both between separate conversations and even during the same conversation. But two recent breakthroughs might completely change this.
If you talk to a large language model (LLM) like OpenAI's ChatGPT for long enough, it will begin to forget crucial pieces of information — especially if the conversation stretches on for more than 4 million words of input. Its performance then begins to deteriorate rapidly.
Meanwhile, ChatGPT and other LLMs can't retain information between conversations. For example, if you finish one conversation and reboot ChatGPT a week later, the chatbot won't remember anything from the previous exchange.
But two separate teams have potentially found solutions to these memory issues. A team of scientists led by the Massachusetts Institute of Technology (MIT) has pinpointed the reason AI forgets things mid-conversation and come up with a method to fix it. Meanwhile, developers at OpenAI have begun testing long-term memory, in which you can tell ChatGPT to remember parts of conversations, ask it what it remembers, and later tell it to forget something or wipe its memory completely.
A chatbot's memory is limited, so it evicts the oldest tokens and replaces them with newer tokens as the conversation continues. But applying StreamingLLM, the MIT-led team's method, to an LLM means it always retains the first four tokens and evicts only from the fifth token onward. The chatbot will still forget things, because its memory remains limited, but it will hold on to the very first tokens of the conversation.
The order of the tokens (and whether they are labeled first, second, third, and so on) also matters because they feed into an "attention map" for the active conversation. This maps out how strongly each token relates to other tokens.
For example, if the fifth token is evicted, you may expect the sixth token to become the new fifth token. But for StreamingLLM to work, tokens must remain encoded as they were originally. In this example, the sixth token must not be encoded as the new "fifth" token just because it is now fifth in line — but remain encoded as the sixth token.
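The eviction rule and the fixed position labels described above can be illustrated with a toy cache. This is a minimal sketch that follows the description in this article, not the researchers' implementation: the class name `SinkCache`, the `window` parameter, and the use of single letters as stand-in tokens are all assumptions made for illustration. A real LLM caches numerical key and value tensors rather than words.

```python
from collections import deque


class SinkCache:
    """Toy rolling cache (hypothetical, for illustration only).

    Keeps the first `num_sinks` tokens forever plus the most recent
    `window` tokens. Each entry stores the token's original position,
    which is never renumbered when other tokens are evicted.
    """

    def __init__(self, num_sinks=4, window=4):
        self.num_sinks = num_sinks
        self.sinks = []                     # first tokens, never evicted
        self.recent = deque(maxlen=window)  # oldest entry drops out when full
        self.next_position = 1              # 1-based, matching the example in the text

    def add(self, token):
        entry = (self.next_position, token)
        self.next_position += 1
        if len(self.sinks) < self.num_sinks:
            self.sinks.append(entry)
        else:
            self.recent.append(entry)

    def contents(self):
        return self.sinks + list(self.recent)


cache = SinkCache(num_sinks=4, window=4)
for token in "a b c d e f g h i".split():
    cache.add(token)

# The fifth token ("e") has been evicted, but "f" is still labeled 6.
positions = [pos for pos, _ in cache.contents()]
print(positions)
```

With nine tokens and room for eight, the cache holds positions 1 to 4 and 6 to 9: the fifth token is gone, and the sixth keeps its original label even though it now sits fifth in line.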
These two changes mean a chatbot performs just as effectively beyond 4 million words as it did before, the scientists said in their paper. It's also 22 times faster than another short-term memory method that avoids a performance crash by constantly recomputing part of the earlier conversation.
"Now, with this method, we can persistently deploy these large language models. By making a chatbot that we can always chat with, and that can always respond to us based on our recent conversations, we could use these chatbots in some new applications," said study lead author Guangxuan Xiao, an electrical engineering and computer science graduate student at MIT, in a statement.
StreamingLLM has already been incorporated into TensorRT-LLM, Nvidia's open-source LLM optimization library, which developers use as a foundation for their own AI models. The researchers also plan to improve StreamingLLM by designing it to find and reincorporate tokens that have been evicted if they're needed again.
When conversing with ChatGPT, users can ask it to remember something specific or grant it autonomy to remember elements of the conversation that it deems appropriate to store for later. These memories are not linked to specific conversations, so deleting chats does not erase them; each memory must be deleted in a separate interface. Unless memories are manually deleted, starting a new chat will pre-load ChatGPT with everything previously saved.
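The separation between chats and memories can be pictured as two independent stores. The sketch below is purely hypothetical and is not OpenAI's implementation or API: the `Assistant` class and its `remember`, `forget`, `start_chat` and `delete_chat` methods are invented names that only mirror the behavior described above.

```python
class Assistant:
    """Hypothetical model of chat history and saved memories as separate stores."""

    def __init__(self):
        self.memories = {}  # memory_id -> remembered text
        self.chats = {}     # chat_id -> list of messages
        self._next_id = 1

    def remember(self, text):
        """Save a memory and return its id."""
        memory_id = self._next_id
        self._next_id += 1
        self.memories[memory_id] = text
        return memory_id

    def forget(self, memory_id):
        """Delete one memory; chats are unaffected."""
        self.memories.pop(memory_id, None)

    def start_chat(self, chat_id):
        """Open a new chat and return the memories pre-loaded into it."""
        self.chats[chat_id] = []
        return list(self.memories.values())

    def delete_chat(self, chat_id):
        """Delete a chat; memories are unaffected."""
        self.chats.pop(chat_id, None)


assistant = Assistant()
assistant.start_chat("chat-1")
jellyfish = assistant.remember("Toddler loves jellyfish")
assistant.delete_chat("chat-1")

preloaded = assistant.start_chat("chat-2")     # the memory survives the deleted chat
assistant.forget(jellyfish)
after_forget = assistant.start_chat("chat-3")  # nothing left to pre-load
```

Because the two stores share no keys, deleting a chat never touches a memory, which is why a saved memory has to be removed through its own step.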
OpenAI provided several examples of how this would be useful. In one example, the chatbot remembers that a kindergarten teacher with 25 students prefers 50-minute lessons with follow-up activities, and recalls this information when helping them create a lesson plan. In another, somebody tells ChatGPT their toddler loves jellyfish — and the AI tool remembers this when designing a birthday card for them.
The company has rolled out the new memory features to a small portion of ChatGPT users, representatives said in a statement on Feb. 13, ahead of a planned broader rollout to all users.
OpenAI will use information from memories to improve its models, company representatives said in the statement. They added, however, that scientists are taking steps to assess and mitigate biases and prevent ChatGPT from remembering sensitive information like health details unless a user explicitly asks it to. Users with memory access can also use a "temporary chat" in which memory is deactivated entirely.