Meta Drops COCONUT, Breaks Chain of Thought Reasoning 

  • Last updated December 10, 2024
  • In AI News

Is it as good as OpenAI's o1, though?


Meta’s FAIR (Fundamental AI Research) team on Monday unveiled a new research study that explores a ‘Chain of Continuous Thought’ technique, dubbed COCONUT. 

The technique addresses a limitation of Chain of Thought (CoT) reasoning, in which the model’s explicit reasoning process is generated as natural language tokens. 

“Chain-of-thought (CoT) reasoning involves prompting or training LLMs to generate solutions step-by-step using natural language. However, this is in stark contrast to certain human cognition results,” said the researchers. 

By way of analogy, Meta cites neuroimaging studies showing that the ‘language network’, a set of brain regions responsible for language comprehension and production, remains largely inactive during various reasoning tasks. 

This creates a mismatch: the amount of reasoning needed varies from token to token with the complexity of the problem, yet LLMs allocate ‘nearly the same computing budget for predicting every token’.

Meta instead explores reasoning in an abstract, continuous space by modifying the CoT process. Rather than making the model convert its internal thinking into words after each step, COCONUT feeds that internal state directly back as the starting point for the subsequent step. 

“This modification frees the reasoning from being within the language space, and the system can be optimised end-to-end by gradient descent, as continuous thoughts are fully differentiable,” mentioned the authors. 
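The core idea, replacing the decode-to-token step with a direct hidden-state feedback, can be illustrated with a toy sketch. This is plain NumPy and not Meta’s implementation; the random weight matrices and the `step` function are stand-ins for a real transformer forward pass:

```python
import numpy as np

rng = np.random.default_rng(0)
D_MODEL, VOCAB = 8, 16

# Stand-in weights for a tiny "model" (purely illustrative).
W_in = rng.normal(size=(D_MODEL, D_MODEL)) / np.sqrt(D_MODEL)
W_out = rng.normal(size=(D_MODEL, VOCAB)) / np.sqrt(D_MODEL)  # unembedding
E = rng.normal(size=(VOCAB, D_MODEL))                         # token embeddings

def step(x):
    """One forward step: input vector -> last hidden state."""
    return np.tanh(x @ W_in)

def cot_step(x):
    """Standard CoT: decode the hidden state to a discrete token, re-embed it."""
    h = step(x)
    token = int(np.argmax(h @ W_out))  # collapses the hidden state to one token
    return E[token], token

def coconut_step(x):
    """COCONUT-style step: feed the hidden state straight back as the next input."""
    return step(x)                     # no decode/re-embed; stays continuous

x = E[0]
for _ in range(3):
    x, tok = cot_step(x)        # information is lost at each discretization

x_cont = E[0]
for _ in range(3):
    x_cont = coconut_step(x_cont)  # continuous chain, differentiable end-to-end
```

Because the continuous chain never passes through an `argmax`, gradients can flow through every step, which is what the authors mean by the system being optimisable end-to-end by gradient descent.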

“Coconut outperforms CoT in certain logical reasoning tasks that require substantial backtracking during planning, with fewer thinking tokens during inference,” they added. 

A few days ago, OpenAI released the full version of the o1 model. A user on Reddit compared the $200 o1 Pro model against Claude 3.5 and said that the former is marginally better at reasoning and excels at PhD-level questions. 

OpenAI uses a combination of CoT and reinforcement learning techniques to help the model reason. 

Debarghya Das, a VC at Menlo Ventures, said on X, “Two weeks ago, research said no LLM could solve NYT Connections — a simple game where you group 16 words into 4 groups of 4.”

“o1-pro solves it consistently in one shot.”

Meta has also had its fair share of action over the last few days. It unveiled Llama 3.3 with 70 billion parameters, which is said to be as good as its flagship 405B-parameter model, yet optimised for efficiency. 

“As we continue to explore new post-training techniques, today we’re releasing Llama 3.3 — a new open source model that delivers leading performance and quality across text-based use cases such as synthetic data generation at a fraction of the inference cost,” said Meta in a post on X.

So, what’s next for Meta? The fourth iteration of the beloved Llama, it seems. 

Meta CEO Mark Zuckerberg said on the company’s Q3 2024 earnings call, “I expect that the smaller Llama 4 models will be ready first, and we expect [them] sometime early next year, and I think that they’re going to be a big deal on several fronts — new modalities, capabilities, stronger reasoning, and much faster,” confirming the release of Llama 4 next year. 

Ahmad Al-Dahle, VP of GenAI at Meta, said in a post on X, “Great to visit one of our data centres where we’re training Llama 4 models on a cluster bigger than 100K H100s! So proud of the incredible work we’re doing to advance our products, the AI field and the open-source community.”



Supreeth Koundinya

Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.
