AI chatbots need to be much better at remembering things. Have scientists just cracked their terrible memory problem?

Artificial intelligence (AI) chatbots are terrible at remembering things — both between separate conversations and even during the same conversation. But two recent breakthroughs might completely change this.

If you talk to a large language model (LLM) like OpenAI's ChatGPT for long enough, it will begin to forget crucial pieces of information — especially if the conversation stretches on for more than 4 million words of input. Its performance then begins to deteriorate rapidly.

Meanwhile, ChatGPT and other LLMs can't retain information between conversations. For example, if you finish one conversation and start a new one a week later, the chatbot won't remember anything from the previous exchange.

But two separate teams have potentially found solutions to these memory issues. A team of scientists led by the Massachusetts Institute of Technology (MIT) has pinpointed the reason AI forgets things mid-conversation and come up with a method to fix it. Separately, developers at OpenAI have begun testing long-term memory, in which you can tell ChatGPT to remember parts of conversations, ask it what it remembers and later tell it to forget something, or wipe its memory completely.

Improving mid-conversation performance

The scientists found that they could improve a chatbot's short-term memory, known as the key-value cache, by changing how it stores and replaces tokens, where one token is a chunk of input text. The scientists dubbed their new approach "StreamingLLM" and presented their findings in a paper published on Dec. 12, 2023 on the preprint server arXiv.

A chatbot's memory is limited, so it evicts the oldest tokens and replaces them with newer tokens as the conversation continues. With StreamingLLM applied, the model instead keeps the first four tokens permanently and begins evicting from the fifth token onward. It will still forget things, because its memory remains limited, but it always retains the very start of the conversation.
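The eviction rule described above can be sketched in a few lines of Python. This is a toy illustration, not the researchers' code: the class name `SinkCache`, the `window` parameter and the use of plain integers as tokens are all assumptions made for the example. A real key-value cache stores tensors for each token rather than the tokens themselves.

```python
from collections import deque


class SinkCache:
    """Toy cache (hypothetical): pins the first tokens, rolls over the rest."""

    def __init__(self, num_pinned=4, window=4):
        self.num_pinned = num_pinned
        self.pinned = []                    # the first tokens, never evicted
        self.recent = deque(maxlen=window)  # newest tokens; oldest falls off

    def add(self, token):
        if len(self.pinned) < self.num_pinned:
            self.pinned.append(token)
        else:
            # A full deque silently discards its oldest entry on append,
            # which plays the role of eviction here.
            self.recent.append(token)

    def tokens(self):
        return self.pinned + list(self.recent)


cache = SinkCache(num_pinned=4, window=4)
for token in range(1, 13):  # tokens numbered 1 to 12
    cache.add(token)

print(cache.tokens())  # [1, 2, 3, 4, 9, 10, 11, 12]
```

Tokens 5 through 8 have been evicted, while tokens 1 through 4 survive no matter how long the loop runs.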

The order of the tokens (and whether they are labeled first, second, third, and so on) also matters because they feed into an "attention map" for the active conversation. This maps out how strongly each token relates to other tokens.
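To give a rough sense of what an attention map is, the sketch below computes one for three made-up tokens using scaled dot-product attention, the standard formulation in transformer models. The function name `attention_map` and the two-dimensional toy vectors are assumptions for illustration; real models use learned vectors with many more dimensions.

```python
import math


def attention_map(queries, keys):
    """Row i holds how strongly token i attends to tokens 0..i (rows sum to 1)."""
    dim = len(keys[0])
    rows = []
    for i, query in enumerate(queries):
        # A token can only attend to itself and to earlier tokens.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in keys[: i + 1]
        ]
        # Softmax turns raw scores into weights that sum to 1.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Later tokens are invisible to token i, so their weight is zero.
        rows.append(weights + [0.0] * (len(keys) - i - 1))
    return rows


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up token vectors
for row in attention_map(vectors, vectors):
    print([round(w, 2) for w in row])
```

Each printed row is one token's share of attention across the tokens it can see. The first token can only attend to itself, so its row is always 1 followed by zeros.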

For example, if the fifth token is evicted, you may expect the sixth token to become the new fifth token. But for StreamingLLM to work, tokens must remain encoded as they were originally. In this example, the sixth token must not be encoded as the new "fifth" token just because it is now fifth in line — but remain encoded as the sixth token.

These two changes mean a chatbot performs just as effectively beyond 4 million words as it did before, the scientists said in their paper. The method is also up to 22 times faster than an alternative short-term memory method that avoids a performance crash by constantly recomputing part of the earlier conversation.

"Now, with this method, we can persistently deploy these large language models. By making a chatbot that we can always chat with, and that can always respond to us based on our recent conversations, we could use these chatbots in some new applications," said study lead author Guangxuan Xiao, an electrical engineering and computer science graduate student at MIT, in a statement.

StreamingLLM has already been incorporated into TensorRT-LLM, Nvidia's open-source LLM optimization library, which developers use as a foundation for their own AI models. The researchers also plan to improve StreamingLLM by designing it to find and reincorporate evicted tokens if they are needed again.

ChatGPT will never forget

OpenAI is also testing a method to improve ChatGPT's long-term memory, so that users can continue conversations and effectively build a working relationship with the AI chatbot.

When conversing with the LLM, users can ask ChatGPT to remember something specific or let it decide for itself which elements of the conversation are worth storing for later. These memories are not linked to specific conversations, so deleting a chat does not erase them; a memory must be deleted through a separate interface. Unless memories are manually deleted, starting a new chat will pre-load ChatGPT with those previously saved.
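The separation between chats and memories can be pictured with a small toy model. This is only a sketch of the behavior described above, not OpenAI's implementation or API: the `Assistant` class, its method names and the example memory are all hypothetical.

```python
class Assistant:
    """Toy model (hypothetical) of memories stored apart from chats."""

    def __init__(self):
        self.memories = []  # saved facts, not tied to any one chat
        self.chats = {}     # chat name -> list of messages

    def new_chat(self, name):
        # A new chat starts pre-loaded with whatever memories are still saved.
        self.chats[name] = [f"[memory] {m}" for m in self.memories]

    def remember(self, fact):
        if fact not in self.memories:
            self.memories.append(fact)

    def forget(self, fact):
        self.memories = [m for m in self.memories if m != fact]

    def wipe_memory(self):
        self.memories = []

    def delete_chat(self, name):
        # Deleting a chat leaves the saved memories untouched.
        self.chats.pop(name, None)


bot = Assistant()
bot.new_chat("monday")
bot.remember("prefers metric units")
bot.delete_chat("monday")

bot.new_chat("friday")
print(bot.chats["friday"])  # ['[memory] prefers metric units']

bot.forget("prefers metric units")
bot.new_chat("saturday")
print(bot.chats["saturday"])  # []
```

The memory survives the deletion of the chat in which it was created and disappears only once it is explicitly forgotten.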