ByteDance, the owner of TikTok, has developed a “Self-Controlled Memory system” intended to extend the capabilities of generative AI programs such as ChatGPT. The system lets the program draw on a large store of past dialogue text, so it can answer questions about earlier events more accurately. Traditional language models are limited in how much text they can process at once; ByteDance’s system works around that limit by adding an organized memory whose contents can be used to augment the model’s responses.

Separately, researchers from the University of California at Santa Barbara and Microsoft have published a paper titled “Augmenting Language Models with Long-Term Memory” that adds a memory component to language models. Existing models like ChatGPT have input length limits that prevent them from processing long-form information beyond a fixed session. OpenAI’s GPT-3, for example, can handle only about 2,000 tokens of input (its context window is 2,048 tokens). Some earlier attempts have been made to add memory to these models, but they often suffer from stale data and high computational cost.

To address these challenges, the researchers propose a solution called “Language Models Augmented with Long-Term Memory” (LongMem). LongMem combines a traditional large language model with a second neural network called the SideNet. The language model stores relevant information in a memory bank, while the SideNet compares the current prompt to the contents of that memory to find relevant matches. Unlike the retrieval components of other memory-based models, the SideNet can be trained separately from the language model, which improves its ability to retrieve information that is not stale.
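The store-then-match idea described above can be illustrated with a small sketch. This is not the LongMem implementation: the `MemoryBank` class, its `store` and `retrieve` methods, the hand-written three-number vectors, and the cosine-similarity ranking are all assumptions chosen for illustration, standing in for the learned representations a real model and SideNet would use.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryBank:
    """Hypothetical fixed-size store of (key vector, text) pairs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []

    def store(self, key, value):
        # Append the new entry and evict the oldest one when over capacity.
        self.entries.append((key, value))
        if len(self.entries) > self.capacity:
            self.entries.pop(0)

    def retrieve(self, query, top_k=1):
        # Rank stored entries by similarity to the query, best match first.
        ranked = sorted(self.entries, key=lambda e: cosine(query, e[0]), reverse=True)
        return [value for _, value in ranked[:top_k]]


bank = MemoryBank(capacity=3)
bank.store([1.0, 0.0, 0.0], "chapter 1: the heroine leaves home")
bank.store([0.0, 1.0, 0.0], "chapter 2: a storm at sea")
bank.store([0.0, 0.0, 1.0], "chapter 3: arrival in the city")

# A query vector that points mostly along the second stored key.
best = bank.retrieve([0.1, 0.9, 0.0], top_k=1)
print(best)
```

In this toy version the oldest entry is dropped when the bank is full; how a real system decides what to keep is a separate design question that the sketch does not try to answer.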

The researchers tested LongMem against the Memorizing Transformer and OpenAI’s GPT-2 language model, using datasets that involved summarizing long texts, including articles and textbooks. LongMem outperformed both and also beat the much larger GPT-3, achieving a state-of-the-art score of 40.5%. Despite having significantly fewer neural parameters than GPT-3, LongMem demonstrated a strong grasp of long-range dependencies on complex tasks.

This research parallels ByteDance’s “Self-Controlled Memory system”, which likewise extends the amount of input a large language model can draw on. The system stores long sequences of previous interactions in a memory stream, which the model can use to improve its contextual understanding and response generation. A memory controller evaluates each user prompt and decides whether the memory stream needs to be consulted. ByteDance’s system has shown promising results, surpassing ChatGPT in contextual understanding and response generation.
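The controller's role, deciding per prompt whether past interactions are needed at all, can be sketched as follows. This is a toy under stated assumptions, not ByteDance's method: the names `MemoryStream`, `needs_memory`, `build_model_input`, and `RECALL_CUES` are hypothetical, and the keyword test and word-overlap ranking are simple stand-ins for whatever decision and retrieval logic the real system uses.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class MemoryStream:
    """Hypothetical log of past (user prompt, assistant reply) turns."""

    def __init__(self):
        self.turns = []

    def add(self, user, reply):
        self.turns.append((user, reply))

    def most_relevant(self, prompt):
        # Pick the turn sharing the most words with the prompt, or None if no overlap.
        query = tokens(prompt)
        best, best_score = None, 0
        for user, reply in self.turns:
            score = len(query & tokens(user + " " + reply))
            if score > best_score:
                best, best_score = (user, reply), score
        return best


# Assumed cue words that suggest the user is referring back to earlier turns.
RECALL_CUES = {"earlier", "before", "previously", "remember", "again"}


def needs_memory(prompt):
    """Toy controller decision: consult memory only if a recall cue appears."""
    return bool(tokens(prompt) & RECALL_CUES)


def build_model_input(prompt, stream):
    """Prepend the most relevant past turn only when the controller asks for it."""
    if needs_memory(prompt):
        hit = stream.most_relevant(prompt)
        if hit is not None:
            return f"Relevant history: user said {hit[0]!r}; assistant said {hit[1]!r}\nUser: {prompt}"
    return f"User: {prompt}"


stream = MemoryStream()
stream.add("My dog is named Biscuit.", "Nice name!")
stream.add("I like hiking.", "Great!")

with_memory = build_model_input("What did I say earlier about my dog?", stream)
without_memory = build_model_input("What is 2 + 2?", stream)
print(with_memory)
print(without_memory)
```

Skipping the lookup for prompts that do not refer back keeps unrelated history out of the model's input, which is the point of having a controller rather than always retrieving.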

Together, these memory-based systems mark a notable advance in generative AI. With an organized memory, a model can retain and recall far more information than fits in a single input, which lets it generate more accurate and contextually appropriate responses. Applications that stand to benefit include chatbots, virtual assistants, and content generation.

