Unicorn Data Science
jen@unicornds.org

The Future of Data Science is Agentic Ops

AI agents are increasingly functioning like workers in our organizations, yet most companies have no system to manage, evaluate, or course-correct them. That gap is data science's next big opportunity.

Ask data scientists how the job market feels right now, and many will describe a quiet tension. On paper, the outlook is strong: the World Economic Forum's Future of Jobs Report 2025 ranked data roles at the very top of its fastest-growing jobs list. On the ground, though, the field is changing in ways that are hard to ignore.

Over the past decade, machine learning work (building, deploying, and maintaining models) has gradually migrated toward engineering. MLOps and ML engineering became their own disciplines. Meanwhile, routine analytics work is increasingly assisted by AI tools. The data scientist who once owned the entire stack now operates in a more fragmented and more competitive landscape.

So where should we go from here?

Agents Are Becoming Workers

The most significant shift happening right now is the rise of autonomous AI agents. Many of us rely on agents every day to get work done. Many of us already have agents running in our products and internal systems. In a very real sense, they are beginning to function like colleagues.

The data backs this up. Gartner predicts that 40% of enterprise applications will integrate task-specific agents by the end of 2026, up from less than 5% in 2025. These agents are being embedded into internal workflows as well as customer-facing systems. In many organizations, they are autonomously executing tasks, making decisions, and producing outputs that real stakeholders depend on. Effectively, agents are operating like workers.

Cue the "what could possibly go wrong" meme. We are now deploying fleets of agentic workers we have yet to learn how to manage effectively.

From People Ops to Agentic Ops

In 2015, Laszlo Bock published "Work Rules!", documenting how Google had rebranded HR into "People Ops." The insight was that managing people well goes far beyond administrative work: it is strategic work requiring data, feedback loops, and continuous improvement.

The same logic applies to agents, perhaps even more urgently. A confused employee will ask questions, push back, or visibly stall. An agent will not; it will confidently produce bad output at lightning speed. Agents hallucinate, drift, and respond in subtly wrong ways when prompts shift. Without oversight, failures accumulate quietly in production.

Therefore, we need to start taking the concept of "Agentic Ops" seriously. "Agentic Ops" is the rising organizational function that keeps agents reliable over time: defining what good looks like, measuring it systematically, and closing the feedback loop continuously.

What Agent Evaluation Actually Looks Like

The core of Agentic Ops is agent evaluation, and the key insight is that the evaluation method must match the task type. Not every agent task can, or should, be evaluated the same way.

[Figure: The Agentic Ops Framework for evaluating AI performance across different task types.]

For deterministic tasks, unit-test-like benchmarking works well. For structured tasks, "diffing" the output against a reference with an NLP metric gives a measurable signal. For judgment-heavy tasks, LLM-as-judge is the emerging standard, scoring outputs across dimensions like prompt adherence, hallucination, and safety. For open-ended tasks, head-to-head human comparison is currently the best approach.
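To make the task-type mapping concrete, here is a minimal Python sketch of routing an agent output to an evaluator. Everything in it is an assumption for illustration: the task-type labels, the function names, and the use of the standard library's `difflib.SequenceMatcher` as a stand-in for a proper NLP metric such as ROUGE or BLEU. Judgment-heavy and open-ended tasks are deliberately left out of the automated path, since they call for an LLM judge or human comparison.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict


def exact_match(output: str, reference: str) -> float:
    """Deterministic tasks: pass/fail, like a unit test (ignores outer whitespace)."""
    return 1.0 if output.strip() == reference.strip() else 0.0


def diff_similarity(output: str, reference: str) -> float:
    """Structured tasks: character-level similarity ratio in [0, 1].

    Stand-in for an NLP metric; swap in ROUGE, BLEU, or similar as needed.
    """
    return SequenceMatcher(None, output, reference).ratio()


# Hypothetical task-type labels mapped to scoring functions.
EVALUATORS: Dict[str, Callable[[str, str], float]] = {
    "deterministic": exact_match,
    "structured": diff_similarity,
}


def evaluate(task_type: str, output: str, reference: str) -> float:
    """Score an agent output with the evaluator registered for its task type."""
    if task_type not in EVALUATORS:
        raise ValueError(
            f"No automated evaluator for task type {task_type!r}; "
            "route it to an LLM judge or human review instead."
        )
    return EVALUATORS[task_type](output, reference)


if __name__ == "__main__":
    print(evaluate("deterministic", "42", " 42\n"))
    print(evaluate("structured", "the cat sat", "the cat sits"))
```

The explicit error for unregistered task types is a design choice: it keeps judgment-heavy work from being silently scored by a metric that does not fit it.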

Across all task types, human-in-the-loop review is a must: periodic expert spot-checks, plus continuous product signals such as thumbs-up / thumbs-down user feedback.
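Both human signals can be handled with very little code. The sketch below is one possible approach, not a prescribed method: it summarizes thumbs-up feedback with a Wilson score interval, so that a small sample is not over-read, and draws a reproducible random sample of outputs for expert spot-checks. The function names, the 95% z-value default, and the idea of identifying outputs by string ID are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple


def wilson_interval(thumbs_up: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for the thumbs-up rate (z=1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= thumbs_up <= total:
        raise ValueError("thumbs_up must be between 0 and total")
    p = thumbs_up / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    margin = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - margin, center + margin


def sample_for_spot_check(output_ids: Sequence[str], k: int, seed: int = 0) -> List[str]:
    """Pick up to k outputs uniformly at random for expert review."""
    rng = random.Random(seed)  # seeded so the review batch is reproducible
    return rng.sample(list(output_ids), min(k, len(output_ids)))


if __name__ == "__main__":
    low, high = wilson_interval(thumbs_up=90, total=100)
    print(f"thumbs-up rate is plausibly between {low:.3f} and {high:.3f}")
    print(sample_for_spot_check([f"run-{i}" for i in range(50)], k=5))
```

The same 90% thumbs-up rate gives a much wider interval on 10 votes than on 100, which is the kind of statistical caution that keeps feedback dashboards honest.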

Data Science is the Right Home for Agentic Ops

Agentic Ops requires metric definition, experimental design, data gathering, and statistical reasoning — exactly the skills data science has spent years developing.

For data science leaders, the opportunity is real and the window is open: the agents are being built, but who governs them is still undecided. Placing data science here early is a high-leverage move. According to a recent Deloitte report, "Enterprises struggle to establish appropriate oversight mechanisms for systems designed to operate autonomously." This is an emerging organizational function where data science is primed to step in and contribute.

For individual contributors: go beyond building the agent, and stay for the long-run evaluation. Partner with engineering and product teams to share your expertise, instrument systems, define metrics, and build the feedback loops that keep agents honest.

People Ops emerged because organizations realized that managing people well was a strategic advantage. The same realization is coming for agents. Data science is well positioned to lead it.