The Shift in AI Data Work: From Gig Platforms to Professional Expertise


Posted on September 12, 2025

Handshake AI earned more in two years than its parent company did across its entire first decade. The surprise? The revenue came not from Handshake’s core career platform, but from an unexpected pivot to data annotation. Back in 2023, CEO Garrett Lord realized the company’s real asset was the deep expertise embedded in its vast network of 1,500+ universities and over a million employers. That same strength, he saw, could power something bigger: the data behind AI.

Handshake’s move into AI data work might have seemed like a bold experiment at the time; after all, data annotation isn’t exactly PhD material. Or is it?

The New Data Bottleneck

Despite the expectations set by OpenAI, the recent launch of GPT-5 did not impress. Instead of the big performance leap we’ve come to expect from a major release, it felt more like a sideways step. Ever since GPT-3, progress in large language models has been powered by scale: bigger models trained on more data kept raising the bar. GPT-5’s underwhelming debut hints that scaling alone won’t take us much further. We’ve scraped the whole internet, and it may be that we have simply run out of high-quality training data.

Synthetic data — the kind generated by LLMs — was a candidate fix for the data shortage. But research shows it’s not enough. While it can give a small performance boost, models still need fresh, high-quality real data to advance. The reason is simple: synthetic data is just remixing patterns from existing training data. It doesn’t add any new knowledge. And if you lean on it too heavily, you risk model collapse, where models gradually degrade as they train on their own outputs.
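The model-collapse dynamic is easy to see in miniature. Below is a toy sketch (my own illustration, not a result from any paper): each “generation” fits an empirical distribution to samples drawn from the previous generation’s model. With no fresh real data, any token that fails to show up in a sample vanishes for good, so the model’s diversity can only shrink.

```python
import random
from collections import Counter

def vocab_collapse(vocab_size=50, sample_size=30, generations=20, seed=42):
    """Toy model collapse: each generation 'trains' (fits an empirical
    distribution) on samples drawn from the previous generation's model.
    Tokens absent from a sample can never reappear, so diversity shrinks."""
    rng = random.Random(seed)
    # Generation 0: the "real" data, uniform over the full vocabulary.
    tokens = list(range(vocab_size))
    weights = [1.0] * vocab_size
    support_sizes = [len(tokens)]
    for _ in range(generations):
        # Draw synthetic data from the current model...
        sample = rng.choices(tokens, weights=weights, k=sample_size)
        counts = Counter(sample)
        # ...and refit the model to its own outputs.
        tokens = list(counts)
        weights = [counts[t] for t in tokens]
        support_sizes.append(len(tokens))
    return support_sizes

sizes = vocab_collapse()
print("distinct tokens per generation:", sizes)
```

With 30 samples drawn from a 50-token vocabulary, at least 20 tokens are lost in the very first generation, and the support keeps drifting downward from there; the same logic, scaled up, is why models trained on their own outputs gradually lose the tails of their distribution.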

So, the real bottleneck isn’t quantity anymore, it’s quality.

The Rise of a New Data Industry

Data has always been at the heart of machine learning: collecting, curating, and annotating it was always essential. But the work was often outsourced to low-skilled workers on crowdsourcing platforms because it was tedious, underpaid, and hardly seen as intellectually stimulating. Today, however, AI has ironically disrupted this very landscape. There is now unprecedented demand for highly trained experts, from bachelor’s and master’s graduates to PhDs, to create, evaluate, and refine new datasets.

Foundational LLMs are first trained on massive datasets scraped from the internet, then fine-tuned for specific tasks using methods like reinforcement learning from human feedback (RLHF) and supervised fine-tuning. The point of fine-tuning is to nudge the models to excel in targeted domains like math, coding, or medical diagnosis. That’s why domain-specific datasets are extremely valuable. AI labs and ML engineers are competing to get their hands on expertise-rich data. Core capabilities like object recognition, language translation, and speech-to-text are already widely adopted, but advanced reasoning in areas like calculus, scientific research, or legal analysis still lags behind.
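To make the fine-tuning data concrete: much of what domain experts produce for RLHF takes the shape of preference pairs, where an expert ranks a better response over a worse one for the same prompt. The field names and validation rules below are assumptions for the sketch, not any lab’s actual schema.

```python
import json

# One illustrative preference-pair record of the kind used to train RLHF
# reward models. The schema here is hypothetical, for illustration only.
record = {
    "prompt": "A patient presents with sudden chest pain radiating to the "
              "jaw. What should be ruled out first?",
    "chosen": "Acute coronary syndrome should be ruled out first: obtain "
              "an ECG and troponin levels immediately.",
    "rejected": "It is probably muscle strain; recommend rest and ibuprofen.",
    "annotator": {"credential": "MD", "domain": "cardiology"},
}

def validate(rec):
    """Minimal sanity checks a data pipeline might run on expert labels."""
    required = {"prompt", "chosen", "rejected"}
    missing = required - rec.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if rec["chosen"].strip() == rec["rejected"].strip():
        raise ValueError("chosen and rejected responses must differ")
    return True

assert validate(record)
print(json.dumps(record, indent=2))
```

The value is entirely in the judgment encoded in the pair: anyone can write two answers, but only a cardiologist can reliably say which one a model should prefer.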

So why does all this expert data work matter? Because the next leap in AI isn’t about recognizing cats in photos or transcribing a Zoom call. It’s about reasoning in domains where the cost of mistakes is high.

Take healthcare. A medical diagnostic model might already be able to spot common conditions on X-rays or MRIs. But without radiologists and specialists labeling edge-case scans and building balanced datasets, the model could miss critical diagnoses. The difference between 95% and 98% accuracy isn’t just a few percentage points: at scale, it could mean the difference between life and death.
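The arithmetic behind that claim is worth spelling out. Assuming a hypothetical screening program of one million scans a year (my number, purely for illustration), the jump from 95% to 98% accuracy is a 60% relative reduction in misreads:

```python
# Hypothetical screening volume, chosen only to make the percentages concrete.
scans_per_year = 1_000_000

errors_95 = scans_per_year * (1 - 0.95)  # misreads at 95% accuracy
errors_98 = scans_per_year * (1 - 0.98)  # misreads at 98% accuracy
reduction = (errors_95 - errors_98) / errors_95

print(f"{errors_95 - errors_98:,.0f} fewer misreads "
      f"({reduction:.0%} relative reduction)")
```

Thirty thousand fewer missed or wrong reads a year from three “mere” percentage points is exactly why expert-labeled edge cases command a premium.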

In response, data work has taken a new shape: specialized firms now connect networks of domain experts, from recent STEM graduates to experienced PhDs in medicine, law, and accounting, with AI companies hungry for better datasets. What do these experts actually do? Their jobs range from stress-testing existing LLMs to spot weak reasoning, to curating new multimodal datasets that fill those gaps, to benchmarking models before they’re released to the public.
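The benchmarking side of that work reduces to a simple loop: experts write test items with verified answers, and a harness scores a model against them. A minimal sketch, with a stand-in `toy_model` in place of a real LLM call:

```python
# Expert-written benchmark items with verified answers (illustrative only).
expert_benchmark = [
    {"question": "What is the derivative of x**3?", "answer": "3*x**2"},
    {"question": "Integral of 1/x dx (x > 0)?", "answer": "ln(x) + C"},
    {"question": "Limit of sin(x)/x as x -> 0?", "answer": "1"},
]

def toy_model(question):
    """Stand-in for an LLM call; it only 'knows' one fact."""
    return "3*x**2" if "derivative" in question else "unsure"

def score(model, benchmark):
    """Fraction of benchmark items the model answers exactly right."""
    correct = sum(model(item["question"]) == item["answer"]
                  for item in benchmark)
    return correct / len(benchmark)

acc = score(toy_model, expert_benchmark)
print(f"accuracy: {acc:.0%}")
```

Real benchmarks replace exact string matching with expert graders or rubric-based scoring, but the core idea is the same: the benchmark is only as trustworthy as the expertise behind its answer key.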

The business model is already proving lucrative. It’s not just Handshake AI. Companies like Scale and Snorkel have built their names on this expertise-driven data economy. Scale AI is the standout: it reportedly pulled in $870 million in revenue in 2024, is on track to top $2 billion in 2025, and as of May 2024 held a valuation of about $13.8 billion.

Implications for Work and Professions

The biggest shift in AI data work is who’s actually shaping the datasets. Historically, annotation work was outsourced to low-wage labor markets, often in Africa and South or East Asia, via gig platforms. Work was inconsistent, underpaid, and offered little to no benefits or career progression. Business Process Outsourcing (BPO) firms like Sama and Connected Women introduced more structure and oversight, offering steadier employment and better quality control.

Fast forward to today, and the picture looks vastly different. What started as gig work at the margins of the AI industry has now become highly specialized, high-paying work at its core. Today’s AI data workers are often full-time employees with competitive salaries, better conditions, and clear contracts. Geography has shifted, too. What was once outsourced to low-wage contractors in the Global South is now increasingly handled in high-wage markets where domain expertise is abundant. But all this leaves one open question. What’s next for the experts?

Here’s the irony: today’s experts are training the very systems that aim to one day automate their jobs. Every dataset captures their reasoning patterns, methods, and problem-solving strategies, distilled into something a model can learn. What happens to those experts next? Hard to say. Some roles will shrink, sure. But history suggests that professions adapt. A few years ago, no one had heard of prompt engineers or AI safety specialists. Now they’re real careers. In the same way, AI may not replace experts outright, but reshape their roles, expanding their scope instead of erasing them.

Another possibility is that expert-trained models will lower barriers to entry across specialized fields. With today’s LLMs trained on massive amounts of code, even non-programmers can create simple apps. Vibe coding — using prompts to guide generative AI to build software — is fast becoming a legitimate career path. While sophisticated systems are still best left to professional engineers, the floor for participation has clearly dropped. Similarly, expert-trained AI could open new entry-level roles, even for non-experts, in fields like medicine, law, finance, and other STEM domains.

From Data Work to Better AI

Looking further ahead, a real breakthrough would come if AI systems learned not only to mimic expertise but also to generate original research. Right now, LLMs excel at pattern matching and predicting the next token based on the training data they’ve been exposed to, but they have yet to show they can make genuine discoveries. That may be due to inherent structural limits, or simply because they still lack the prerequisite skills and knowledge. But that’s changing. Data development efforts now extend to modelling expert workflows, from hypothesis generation to experiment design and result validation. In time, AI may autonomously explore research gaps, run surveys, execute simulations, and iterate on findings.

By Amanda Ariyaratne (Data Science Researcher, LIRNEasia)
Amanda is a member of the Data, Algorithms and Policy (DAP) team at LIRNEasia, which participates in the policy dialogue around our algorithmically-inclined society with critical research and technical expertise.


