- Published on February 24, 2025
The kernel supports BF16 and features a paged KV cache with a block size of 64.

DeepSeek, a Chinese artificial intelligence (AI) lab backed by High-Flyer, has kicked off its “Open Source Week” by releasing FlashMLA, a multi-head latent attention (MLA) decoding kernel designed for Hopper GPUs. It is optimised for processing variable-length sequences and is now in production.
The kernel supports BF16 and uses a paged KV cache with a block size of 64. On the H800 GPU, it reaches up to 3000 GB/s in memory-bound configurations and 580 TFLOPS in compute-bound configurations.
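To make the idea of a paged KV cache concrete, here is a minimal Python sketch of how variable-length sequences could be mapped onto fixed-size blocks of 64 tokens. It is an illustration of the general technique only, not FlashMLA's code or API: the function names (`blocks_needed`, `build_block_table`, `locate`) and the simple sequential allocator are assumptions made for this example.

```python
BLOCK_SIZE = 64  # block size reported for FlashMLA's paged KV cache


def blocks_needed(seq_len, block_size=BLOCK_SIZE):
    """Number of fixed-size cache blocks required to hold seq_len tokens."""
    return (seq_len + block_size - 1) // block_size  # ceiling division


def build_block_table(seq_lens, block_size=BLOCK_SIZE):
    """Give each sequence a list of physical block ids from a shared pool.

    Hypothetical allocator: blocks are handed out sequentially. A real
    serving system would recycle freed blocks instead.
    """
    table = []
    next_free = 0
    for n in seq_lens:
        count = blocks_needed(n, block_size)
        table.append(list(range(next_free, next_free + count)))
        next_free += count
    return table


def locate(table, seq_idx, token_pos, block_size=BLOCK_SIZE):
    """Map a token position in a sequence to (physical block id, offset)."""
    block_id = table[seq_idx][token_pos // block_size]
    return block_id, token_pos % block_size


# Three sequences of different lengths share one block pool.
seq_lens = [10, 64, 130]
table = build_block_table(seq_lens)
print(table)                  # [[0], [1], [2, 3, 4]]
print(locate(table, 2, 129))  # (4, 1)
```

Because every sequence only occupies as many 64-token blocks as it needs, short and long sequences can be batched together without padding each one to the longest length.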
DeepSeek says FlashMLA is inspired by projects such as FlashAttention 2 and 3 and CUTLASS. The kernel is available on GitHub for exploration and use.
“Honored to share FlashMLA – our efficient MLA decoding kernel for Hopper GPUs, optimised for variable-length sequences and now in production,” the company said in a post on X.
The release of FlashMLA is expected to improve computational efficiency in AI applications, and could potentially affect other compute-heavy sectors such as cryptocurrency trading algorithms.
DeepSeek had earlier announced that it would release five open-source repositories this week. “We’re a tiny team (at) DeepSeek exploring AGI (Artificial General Intelligence). Starting next week, we’ll be open-sourcing five repos, sharing our small but sincere progress with full transparency,” it said on X.
Currently, it has a collection of 14 open-source models and repositories on Hugging Face.
Recently, it released its DeepSeek-R1 and DeepSeek-V3 models. These AI models offer state-of-the-art performance while being trained and deployed at a fraction of the cost of their competitors.
Siddharth Jindal
Siddharth is a media graduate who loves to explore tech through journalism and putting forward ideas worth pondering about in the era of artificial intelligence.