OpenAI Thinks LLMs Can Earn $1M from Freelance Software Engineering Tasks

  • Published on February 19, 2025
  • In AI News

OpenAI has introduced SWE-Lancer, a new benchmark to test whether frontier large language models (LLMs) can successfully complete real-world freelance software engineering tasks and, in doing so, earn up to $1 million in total payouts. The evaluation is based on 1,488 freelance software engineering jobs from Upwork, collectively valued at $1 million.

SWE-Lancer's tasks range from $50 bug fixes to $32,000 feature implementations.

“Introducing SWE-Lancer: our most realistic coding benchmark to date. Still some limitations, but better than evals we had before,” said Tejal Patwardhan, who works on the benchmarks and preparedness team at OpenAI.

The tasks fall into two categories: independent engineering tasks, where models must complete technical work, and managerial decision-making tasks, where models evaluate and choose between implementation proposals.

By mapping AI model performance to real-world monetary value, SWE-Lancer provides a crucial tool for studying the economic impact of AI in software development.
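To make the idea of mapping performance to money concrete, here is a minimal Python sketch of how such a tally could work. It assumes that a task pays its full listed value when the model's submission is accepted and nothing otherwise; the article does not spell out the scoring rule, so treat that as an assumption. The `TaskResult` structure, the function names, and the sample values are hypothetical and are not taken from OpenAI's benchmark code.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Outcome of one benchmark task (hypothetical structure)."""
    task_id: str
    payout_usd: float  # price the freelance job was listed at
    passed: bool       # whether the model's submission was accepted


def total_earned(results: list[TaskResult]) -> float:
    """Sum payouts for tasks the model completed successfully.

    Assumption: a task pays its full value on success and nothing otherwise.
    """
    return sum(r.payout_usd for r in results if r.passed)


def earn_rate(results: list[TaskResult]) -> float:
    """Fraction of the total available payout that the model earned."""
    available = sum(r.payout_usd for r in results)
    return total_earned(results) / available if available else 0.0


# Illustrative values only; the $50 and $32,000 figures echo the range quoted above.
results = [
    TaskResult("bug-fix", 50.0, passed=True),
    TaskResult("feature", 32_000.0, passed=False),
    TaskResult("review", 1_000.0, passed=True),
]
print(total_earned(results))  # 1050.0
print(round(earn_rate(results), 4))
```

In this made-up example the model passes two of three tasks but earns only a small share of the available money, which is why a dollar-weighted score can tell a different story from a plain pass rate.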

Anthropic, the company behind the Claude model series, also released a study highlighting AI’s influence on the workplace.

The findings revealed that approximately 36% of all occupations incorporate AI for at least a quarter of their tasks. Moreover, 57% of AI applications enhance human capabilities, while 43% focus on automation. However, only 4% of occupations rely on AI for at least 75% of their tasks.

The study identified software development and technical writing as key areas where AI is utilised. In contrast, AI plays a minimal role in tasks that involve physical interaction with the environment.

Aditi Suresh

I hold a degree in political science and am interested in how AI and online culture intersect.
