- Published on January 15, 2025
- In AI News
Titans are more effective than Transformers and modern linear RNNs.
Google has recently introduced ‘Titans’, a new architecture built around neural long-term memory modules, to improve how machines handle large amounts of information over time.
The architecture, created by researchers Ali Behrouz, Peilin Zhong, and Vahab Mirrokni, is designed to combine short-term and long-term memory to solve problems that traditional AI struggles with.
“Titans are implemented in PyTorch and JAX, and we intend to make the code we used to train and evaluate our models available soon,” the researchers mentioned in the official paper.
Better Than Transformers
The researchers tested Titans on tasks such as language modelling, long-term reasoning, and time series forecasting, where it outperformed existing architectures like Transformers and Recurrent Neural Networks (RNNs), demonstrating its ability to process long sequences more efficiently.
On the BABILong benchmark, the Memory as Context (MAC) variant achieved exceptional results. “Titans are more effective than Transformers and modern linear RNNs,” Behrouz announced on X.
“In the BABILong benchmark, Titans (MAC) shows outstanding performance, where it effectively scales to larger than 2M context window, outperforming large models like GPT-4, Llama3 + RAG, and Llama3-70B,” he wrote in a post on X dated January 13, 2025.
The development could benefit applications like document analysis, time series forecasting, and genomics. By combining long-term memory with current data, Titans may improve machine learning systems’ ability to solve complex, real-world problems.
Pablo Horneman, an AI strategist, explained that short-term memory uses standard attention for the current context, while the neural memory module efficiently manages distant dependencies.
This architecture ensures balanced processing of recent and historical data and overcomes limitations in handling long sequences.
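As a minimal sketch of this two-path design, assuming a sliding attention window for the short-term path and a single learnable summary vector standing in for the long-term memory (the class, gate, and parameter names below are illustrative, not taken from the paper), the idea could be expressed in PyTorch as:

```python
# Illustrative sketch only, not the authors' implementation: short-term attention
# over a recent window combined with a learned long-term summary via a gate.
import torch
import torch.nn as nn

class HybridMemoryBlock(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4, window: int = 128):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Stand-in for long-term memory: a single learnable summary vector.
        self.memory = nn.Parameter(torch.zeros(1, 1, dim))
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Short-term path: standard attention restricted to the most recent window.
        recent = x[:, -self.window:, :]
        short, _ = self.attn(recent, recent, recent)
        # Long-term path: broadcast the memory summary to every recent position.
        historical = self.memory.expand(x.size(0), short.size(1), -1)
        # Balance recent and historical information with a learned gate.
        return self.gate(torch.cat([short, historical], dim=-1))
```

In the actual Titans design, the long-term memory is not a fixed vector but a neural module whose weights are updated while the model runs, as described next.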
How Does it Work?
Horneman took to LinkedIn to give his insights on the key differences between traditional standard attention mechanisms and Titans.
Transformers are effective for short-term tasks but require significant computing power for longer contexts. Newer models are faster but often lose important details over time.
Titans combine attention mechanisms with a neural long-term memory module, enabling the model to memorise and utilise information during test time.
The Titans architecture introduces a neural memory module that learns what to remember and what to forget during real-time operations. This approach allows it to handle millions of data points without losing accuracy.
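A hedged, simplified reading of that idea, and not the official Titans code, is a small memory MLP whose weights are rewritten during inference by a gradient step on an associative recall loss, with a forgetting factor deciding how much of the old memory to erase; all names and hyperparameters here are assumptions:

```python
# Simplified sketch of test-time memorisation: the memory "learns" at inference
# by nudging its own weights towards reconstructing the values it is shown.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralMemory(nn.Module):
    def __init__(self, dim: int, lr: float = 0.1, forget: float = 0.05):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.lr, self.forget = lr, forget

    @torch.enable_grad()
    def write(self, key: torch.Tensor, value: torch.Tensor) -> None:
        # How badly the memory currently predicts `value` from `key` acts as the error signal.
        loss = F.mse_loss(self.mlp(key), value)
        grads = torch.autograd.grad(loss, list(self.mlp.parameters()))
        with torch.no_grad():
            for p, g in zip(self.mlp.parameters(), grads):
                # Forget a fraction of the old memory, then write in the correction.
                p.mul_(1.0 - self.forget).sub_(self.lr * g)

    def read(self, query: torch.Tensor) -> torch.Tensor:
        # Retrieval is just a forward pass through the memorising MLP.
        with torch.no_grad():
            return self.mlp(query)
```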
Titans introduce three architectural variants: Memory as Context (MAC), Memory as Gate (MAG), and Memory as a Layer (MAL).
In the MAC configuration, Titans segment the input, even sequences as long as the context windows of current LLMs, retrieve relevant historical memory for each segment, and update the memory based on the attention outputs. Each variant has its own strengths and is suited to different tasks.
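Under the simplified NeuralMemory sketched above, and with the segment length, retrieval, and write steps chosen purely for illustration, the MAC flow could be approximated as:

```python
# Rough MAC-style loop (illustrative, not the paper's implementation):
# segment the input, retrieve history, attend, then write the result back to memory.
import torch
import torch.nn as nn

def mac_forward(x: torch.Tensor, attn: nn.MultiheadAttention, memory: "NeuralMemory",
                segment_len: int = 256) -> torch.Tensor:
    """x: (batch, seq, dim); attn must be built with batch_first=True."""
    outputs = []
    for start in range(0, x.size(1), segment_len):
        seg = x[:, start:start + segment_len, :]
        # 1. Retrieve historical context from long-term memory for this segment.
        hist = memory.read(seg)
        # 2. Attend over the retrieved history concatenated with the current segment.
        ctx = torch.cat([hist, seg], dim=1)
        out, _ = attn(seg, ctx, ctx)
        # 3. Update long-term memory with the attention output for this segment.
        memory.write(seg, out)
        outputs.append(out)
    return torch.cat(outputs, dim=1)
```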
Behrouz explained that Titans’ innovation lies in mimicking human memory. While our short-term memory is highly accurate, it is limited to brief windows, and we rely on other memory systems to store information for the long term.
Similarly, Titans use attention as short-term memory for immediate dependencies and a neural memory module as long-term memory to capture distant dependencies. This design effectively balances recent and historical data.
Drawing further inspiration from how humans prioritise memorable events, Titans determine which tokens to store based on their ‘surprise’ value.
Events that violate expectations capture attention, and both the initial surprise and the gradual decay of relevance over time drive long-term memorisation.
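In code form, one hedged reading of this surprise-driven rule, with `eta`, `theta`, and `alpha` as illustrative stand-ins for the data-dependent gates the authors describe, looks like:

```python
# Illustrative update rule: surprise (the gradient of a memorisation loss) is
# accumulated with momentum, while older memory and older surprise gradually fade.
def update_memory(memory, grad, past_surprise, eta=0.9, theta=0.1, alpha=0.01):
    # Past surprise decays (eta) and is pushed by the new momentary surprise (theta * grad).
    surprise = eta * past_surprise - theta * grad
    # A fraction of the old memory is forgotten (alpha) before the surprise is written in.
    memory = (1.0 - alpha) * memory + surprise
    return memory, surprise
```

A token that produces a large gradient (a surprising event) moves the memory strongly, while the decay terms let stale information fade, which is how the module decides what to keep and what to forget.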
Sanjana Gupta
An information designer who loves to learn about and try new developments in the field of tech and AI. She likes to spend her spare time reading and exploring absurdism in literature.