Chain of Draft prompting lets LLMs think cheaper with fewer words


A new method called "Chain of Draft" (CoD) helps AI models complete complex tasks using significantly fewer words and at greater speed, while maintaining accuracy comparable to existing approaches.

CoD generates concise yet informative intermediate results, solving tasks with up to 92.4% fewer words compared to the established Chain of Thought (CoT) method—without any loss in accuracy. The inspiration for CoD comes from human behavior: rather than detailing every thought, people often jot down only essential points in brief bullet form. CoD mimics this strategy.

Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?
A: 8

Example of input and output for a calculation task. | Image: Xu et al.

Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?
A: Let's think through this step by step:
1. Initially, Jason had 20 lollipops.
2. After giving some to Denny, Jason now has 12 lollipops.
3. To find out how many lollipops Jason gave to Denny, we need to calculate the difference between the initial number of lollipops and the remaining number.
4. We can set up a simple subtraction problem: Initial number of lollipops - Remaining number of lollipops = Lollipops given to Denny
5. Putting in the numbers: 20 - 12 = Lollipops given to Denny
6. Solving the subtraction: 20 - 12 = 8
Therefore, Jason gave 8 lollipops to Denny. #### 8 lollipops

Detailed reasoning steps using the Chain of Thought method. | Image: Xu et al.

Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?
A: 20 - x = 12; x = 20 - 12 = 8. #### 8

Significantly shorter reasoning steps using Chain of Draft. | Image: Xu et al.

While the test prompts remain identical across all three examples, the difference lies in the system prompt. For CoD, researchers modified a chain-of-thought (CoT) prompt to limit each step to a maximum of five words.

Instructions given to the AI model for Chain of Draft prompting. | Image: Xu et al.
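The difference between the three setups can be sketched as alternative system prompts for a chat-style model. The prompt texts below paraphrase the instructions described in the article and are assumptions about exact wording; the five-word cap per step is the CoD constraint stated above.

```python
# Sketch of the three prompting setups the researchers compared.
# Prompt wording is paraphrased (assumption: the paper's exact text differs).

SYSTEM_PROMPTS = {
    # Standard: answer directly, no explanatory steps.
    "standard": "Answer the question directly. Do not return any reasoning.",
    # Chain of Thought: full step-by-step reasoning.
    "cot": ("Think step by step to answer the question. Return the answer "
            "at the end of the response after a separator ####."),
    # Chain of Draft: terse drafts, at most five words per step.
    "cod": ("Think step by step, but only keep a minimum draft for each "
            "thinking step, with five words at most. Return the answer "
            "at the end of the response after a separator ####."),
}

def build_messages(question: str, mode: str) -> list:
    """Assemble a chat-style request for the chosen prompting method."""
    return [
        {"role": "system", "content": SYSTEM_PROMPTS[mode]},
        {"role": "user", "content": f"Q: {question}\nA:"},
    ]
```

Since the user prompt stays identical across modes, swapping the system prompt is the only change needed to move between the three methods.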

Short prompts deliver similar accuracy with fewer resources

The researchers compared CoD to detailed CoT prompts and standard prompts lacking explanatory steps. In arithmetic, comprehension, and symbolic reasoning tasks, CoD achieved similar accuracy to detailed CoT, but used 68 to 86 percent fewer words.
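The scale of the savings can be checked against the lollipop example above by simply counting words in the two answer styles (a rough proxy for tokens, not the paper's exact metric):

```python
# Word-count comparison of the CoT and CoD answers from the example above.
def word_reduction(cot_answer: str, cod_answer: str) -> float:
    """Fraction of words saved by the draft-style answer."""
    return 1 - len(cod_answer.split()) / len(cot_answer.split())

cot = ("Let's think through this step by step: 1. Initially, Jason had 20 "
       "lollipops. 2. After giving some to Denny, Jason now has 12 lollipops. "
       "3. To find out how many lollipops Jason gave to Denny, we need to "
       "calculate the difference between the initial number of lollipops and "
       "the remaining number. 4. We can set up a simple subtraction problem: "
       "Initial number of lollipops - Remaining number of lollipops = "
       "Lollipops given to Denny 5. Putting in the numbers: 20 - 12 = "
       "Lollipops given to Denny 6. Solving the subtraction: 20 - 12 = 8 "
       "Therefore, Jason gave 8 lollipops to Denny. #### 8 lollipops")
cod = "20 - x = 12; x = 20 - 12 = 8. #### 8"

print(f"{word_reduction(cot, cod):.0%} fewer words")
```

For this single example the reduction lands in the same ballpark as the ranges reported in the paper.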


For example, when solving comprehension tasks involving dates, CoD increased accuracy compared to standard prompts from 72.6 to 88.1 percent for GPT-4o and from 84.3 to 89.7 percent for Claude 3.5 Sonnet.

Performance metrics for GPT-4o and Claude 3.5 with different prompting methods (standard, CoT, CoD), including accuracy, tokens, and latency. Chain of Draft achieves comparable performance while using significantly fewer tokens. | Image: Xu et al.

CoD reduces computational costs and response times

Chain of Draft directly reduces the number of output tokens by generating shorter intermediate reasoning steps. Additionally, it indirectly lowers input token counts, especially in few-shot prompting scenarios, where multiple solved examples are included as part of the initial input prompt.

When these few-shot examples are created using the concise CoD format, each example becomes shorter, resulting in fewer tokens overall. This combined reduction in input and output tokens lowers computational costs, enables faster responses, and makes CoD particularly valuable for large-scale LLM implementations and cost-sensitive applications.
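This input-side effect can be sketched by assembling a few-shot prompt from solved examples in either style. The function and shot texts below are illustrative; the CoT answer is an abridged paraphrase of the full reasoning shown earlier:

```python
# Sketch: CoD-formatted few-shot examples shrink the input prompt too.
QUESTION = ("Jason had 20 lollipops. He gave Denny some lollipops. Now Jason "
            "has 12 lollipops. How many lollipops did Jason give to Denny?")

# Abridged CoT-style answer (assumption) vs. the terse CoD draft.
COT_SHOT = ("Let's think step by step: Jason started with 20 lollipops; "
            "after giving some away he has 12; the difference 20 - 12 = 8 "
            "is what he gave to Denny. #### 8")
COD_SHOT = "20 - x = 12; x = 20 - 12 = 8. #### 8"

def few_shot_prompt(examples, question):
    """Concatenate solved (question, answer) pairs ahead of the new question."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\n\nQ: {question}\nA:"

# Three shots in each style: the CoD prompt is markedly shorter overall.
cot_prompt = few_shot_prompt([(QUESTION, COT_SHOT)] * 3, QUESTION)
cod_prompt = few_shot_prompt([(QUESTION, COD_SHOT)] * 3, QUESTION)
print(len(cot_prompt), len(cod_prompt))
```

The more few-shot examples a prompt carries, the more this per-example saving compounds.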

However, compact prompts are not suitable for every task. Some scenarios require extended consideration, self-correction, or external knowledge retrieval. To address these limitations, researchers propose combining CoD with complementary approaches such as adaptive parallel reasoning or multi-level validation. Additionally, these findings could inform future AI model training by incorporating compact reasoning processes into training datasets.

The Chain of Draft method comes from Zoom Communications' research team, which has offered an "AI Companion" for meeting assistance since 2023. While response latency has often been overlooked in AI applications, CoD could prove especially valuable for real-time situations like video calls.
