Ilya Sutskever, often touted as the GOAT of AI, believes superintelligent AI will be ‘unpredictable’.
When the co-founder and former chief scientist of OpenAI, Ilya Sutskever, speaks, the world listens. At NeurIPS 2024, he touched upon the unpredictability of reasoning in AI, saying that the more a system reasons, the more unpredictable it becomes.
“The more it reasons, the more unpredictable it becomes,” he said. “All the deep learning that we’ve been used to is very predictable because we’ve been working on replicating human intuition.”
Sutskever pointed out that systems capable of reasoning, such as advanced chess-playing AI like AlphaZero, are already demonstrating unpredictability. “The best chess AIs are unpredictable to the best human chess players,” he said.
It is only a matter of time before these AI systems become smarter still, to the point of achieving superintelligence. He said that future artificial superintelligence (ASI) systems will be able to understand complex concepts from limited data and will no longer be prone to confusion.
Sutskever said that reasoning models will reduce errors like hallucinations by “autocorrecting” themselves in far more sophisticated ways. “AI systems that reason will be able to correct themselves, much like autocorrect—but far grander,” he added.
“They will understand things from limited data. They will not get confused,” he said, hinting at a possibility of ‘self-aware AI,’ which he views as a natural development. “Self-awareness is part of our own world models,” he said.
Sutskever believes artificial superintelligence will evolve into truly agentic systems. “Right now, the systems are not agents in any meaningful sense, just very slightly agentic,” he said. “But eventually, those systems are actually going to be agentic in real ways.”
In June 2024, Sutskever launched his new AI startup, Safe Superintelligence Inc. (SSI), alongside Daniel Gross (former head of AI at Apple) and Daniel Levy (investor and AI researcher). SSI is dedicated to developing safe and advanced AI systems, with a primary goal of achieving ‘safe superintelligence.’
Unlike many AI companies, it focuses on long-term safety and progress, avoiding the pressure of quick profits and product releases.
The End of the Pre-Training Era
Sutskever said that the age of pre-training is over.
“Pre-training as we know it will unquestionably end,” he said, citing the limitations of data availability. “We have but one internet. You could even go as far as to say that data is the fossil fuel of AI. It was created somehow, and now we use it.”
He acknowledged that AI’s current progress stems from scaling models and data, but noted that other scaling principles might emerge. “I want to highlight that what we are scaling now is just the first thing we figured out how to scale,” said Sutskever.
Citing OpenAI’s o1, he highlighted the growing focus on agents and synthetic data as pivotal to the future of AI, while acknowledging the challenges in defining synthetic data and optimising inference-time compute. “People feel like agents are the future, more concretely, but also a little bit vaguely, synthetic data,” he said.
Drawing parallels to biological systems, Sutskever spoke about how nature might inspire the next breakthroughs. He referenced brain-to-body size scaling in mammals as a potential model for rethinking AI’s architecture.
Instead of linear improvements through scaling datasets and models, future AI systems might adopt entirely new scaling principles, guided by biology’s efficiency and adaptability. “There’s a precedent in biology for different scaling,” he said, suggesting that AI could evolve in ways we have yet to fully understand.
Walking Down Memory Lane
Sutskever opened his talk at NeurIPS 2024 by revisiting a presentation from 10 years ago, where he and his colleagues introduced the concept of training large neural networks for tasks like translation. “If you have a large neural network with 10 layers, it can do anything that a human being can do in a fraction of a second,” he quipped.
This idea was rooted in the belief that artificial neurons could mimic biological neurons, with the assumption that the human brain’s ability to process information quickly could be replicated in a neural network.
Sutskever pointed out how early models, including LSTMs, relied on basic parallelisation techniques like pipelining. He shared how these models used one layer per GPU to speed up training, achieving a 3.5x speedup with eight GPUs.

Sutskever also touched on the origins of the scaling hypothesis, which posits that success in AI is guaranteed when larger datasets and neural networks are combined. He credited OpenAI’s Alec Radford, Anthropic’s Dario Amodei and Jared Kaplan for their roles in advancing this concept and laying the groundwork for the GPT models.
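For readers curious what that layer-per-GPU pipelining looks like in practice, here is a minimal, illustrative sketch in PyTorch. It is not code from the talk; the layer sizes and device assignment are assumptions for demonstration, and a real pipeline would also overlap micro-batches across devices to achieve the kind of speedup Sutskever described.

import torch
import torch.nn as nn

# Assign one layer to each available device (cuda:0 ... cuda:N-1 on a multi-GPU box,
# falling back to CPU when fewer GPUs are present).
n_layers = 4
devices = [
    torch.device(f"cuda:{i}") if torch.cuda.device_count() > i else torch.device("cpu")
    for i in range(n_layers)
]
layers = [nn.Linear(512, 512).to(d) for d in devices]

def forward(x: torch.Tensor) -> torch.Tensor:
    # Activations hop from device to device, one layer at a time.
    for layer, device in zip(layers, devices):
        x = torch.relu(layer(x.to(device)))
    return x

print(forward(torch.randn(32, 512)).shape)  # torch.Size([32, 512])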
Siddharth Jindal
Siddharth is a media graduate who loves to explore tech through journalism and putting forward ideas worth pondering about in the era of artificial intelligence.