
Building Real-Time Multi-Agent AI With Confluent

How Agent Taskflow uses Confluent's data streaming platform as the foundation of its event-driven multi-agent orchestration system.

Author Saul Sparber
Published Apr 23, 2025
Read Time 7 min

This post was originally published on the Confluent blog (confluent.io).

We're entering a new era of artificial intelligence, where intelligence isn't just reactive; it's orchestrated. At Agent Taskflow, we're pioneering a new class of systems: multi-agent orchestration platforms. These systems empower teams of AI agents to coordinate, think, reason, and act in concert -- just like human teams.

But building these systems at scale requires something most AI platforms overlook: real-time, observable, fault-tolerant communication. That's why we've built Agent Taskflow on the Confluent data streaming platform, unlocking the power of cloud-native Apache Kafka, connectors, Stream Governance, and more.

In this post, I'll share why we chose Confluent, how it powers our multi-agent platform, and the real-world impact it's already delivering for our team and customers.

What Is Agent Taskflow?

Agent Taskflow is an AI orchestration platform designed to make multi-agent systems (MASs) accessible and usable by anyone. With a drag-and-drop builder, real-time messaging backbone, and native memory graph, it provides users with:

Our vision is simple but powerful: Make useful, affordable, and fun AI agents accessible to everyone. But we're thinking far beyond single agents or even agent groups. We believe the entire future of software is agent-native.

Agent Taskflow is positioned to own this transition with an entire suite of agent-native apps and agent developer tools, including SDKs and public APIs. We want to become the default operating system for multi-agent orchestration -- a system where any individual or enterprise can deploy intelligent agent teams to handle repetitive work, make decisions, and deliver insights.

Why Multi-Agent Systems Matter for Enterprises

Multi-agent systems are networks of intelligent agents that interact, share context, and collaborate to solve complex problems. Agents will drive a new era of automation, which can deliver greater cost savings, improve customer experiences through faster response times, and unlock new revenue opportunities.

In the enterprise, multi-agent systems enable use cases such as:

MASs let organizations move from isolated AI tools to end-to-end AI workflows that are autonomous, real-time, and accountable.

These aren't hypothetical scenarios. We've already built flows like these with real clients, helping them replace clunky, multi-tool handoffs with seamless, agent-led automation. For example, one healthcare client now uses an agent pod to sanitize medical transcripts in real time, personalize content by audience, and pass final assets to marketing -- all without human handoffs.

The Enterprise Risk Factor: Why Multi-Agent Systems Need Governance

While the benefits of multi-agent systems are substantial, they also carry far more risk than single-agent deployments. Human error already introduces compliance and security challenges, and autonomous AI agents acting on each other's output can multiply them.

Enterprises adopting multi-agent systems face several critical risks:

This is why enterprises need a comprehensive platform for real-time agent orchestration, observation, and governance. Without these safeguards, enterprises risk creating "shadow AI" that operates outside of established governance frameworks.

Technical Challenges in Building Multi-Agent Systems

To help our customers build effective multi-agent systems, we had to address four key technical challenges:

Multi-Agent Communication

Agents must share state, pass messages, and coordinate execution. Without a consistent stream of structured events, agents act out of order, context is lost, and failures cascade across the system. What makes this particularly challenging is the need for real-time interactivity. Users want to see agents thinking, reasoning, and working -- not just the final output.
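One way to picture the ordering problem is a per-agent sequence number carried on every event. The sketch below is illustrative only: the AgentEvent fields and the InOrderBuffer helper are hypothetical names, not Agent Taskflow's actual schema. In Kafka, the same effect is usually obtained by keying messages so that one agent's events land on one partition, where order is preserved.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentEvent:
    """Hypothetical event envelope; the field names are assumptions."""
    agent_id: str
    seq: int       # per-agent sequence number, starting at 0
    payload: str


@dataclass
class InOrderBuffer:
    """Releases each agent's events strictly in sequence order."""
    next_seq: dict = field(default_factory=dict)  # agent_id -> next expected seq
    pending: dict = field(default_factory=dict)   # (agent_id, seq) -> event

    def accept(self, event: AgentEvent) -> list:
        """Store the event and return every event that is now deliverable."""
        self.pending[(event.agent_id, event.seq)] = event
        ready = []
        expected = self.next_seq.get(event.agent_id, 0)
        # Drain consecutive events starting at the expected sequence number.
        while (event.agent_id, expected) in self.pending:
            ready.append(self.pending.pop((event.agent_id, expected)))
            expected += 1
        self.next_seq[event.agent_id] = expected
        return ready


buffer = InOrderBuffer()
late = buffer.accept(AgentEvent("planner", 1, "call tool"))       # held back
released = buffer.accept(AgentEvent("planner", 0, "start task"))  # releases both
```

Here the event that arrives early is held until its predecessor shows up, so a downstream agent never sees step 1 before step 0. A production version would also need to drop duplicates and time out on gaps, which this sketch does not attempt.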

Observability

We don't just want to know if something failed -- we want to know why. That requires:

Each agent action generates events across multiple planes. Without a unified event backbone, tracking and debugging become nearly impossible.

We built our entire system event-first because of these challenges. Every action, thought, and decision is an event first.
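To make the "why did it fail" question concrete, here is a minimal sketch of correlating events across planes by a shared trace identifier. The trace_id, ts, and status fields are assumptions made for illustration; only the schema names in the sample data are taken from this post.

```python
from collections import defaultdict

# Hypothetical events from different planes; the field names are assumptions.
events = [
    {"trace_id": "t-1", "plane": "data", "ts": 2, "type": "ExecutionEvent", "status": "failed"},
    {"trace_id": "t-1", "plane": "control", "ts": 1, "type": "ControlEvent", "status": "ok"},
    {"trace_id": "t-2", "plane": "data", "ts": 3, "type": "ChatEvent", "status": "ok"},
    {"trace_id": "t-1", "plane": "aggregate", "ts": 4, "type": "AuditLogEntry", "status": "ok"},
]


def timelines(all_events):
    """Group events by trace_id and sort each group by timestamp."""
    grouped = defaultdict(list)
    for event in all_events:
        grouped[event["trace_id"]].append(event)
    return {tid: sorted(evts, key=lambda e: e["ts"]) for tid, evts in grouped.items()}


def first_failure(timeline):
    """Return the earliest failed event in a timeline, or None if there is none."""
    return next((e for e in timeline if e["status"] == "failed"), None)


by_trace = timelines(events)
failure = first_failure(by_trace["t-1"])
```

With every event carrying the same identifier, a single pass over the stream yields an ordered timeline per request, and the first failed event points at the plane and step where things went wrong.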

Fault Tolerance and Scalability

Multi-agent orchestration is compute-heavy and stateful. Our system must:

Identity and Permissioning

Each agent must be aware of:

Why We Chose Confluent

Let me be candid: I've been a data engineer for over a decade. I've scaled Kafka clusters myself. I know how to do it. But that doesn't mean I want to spend my time doing it -- especially as a startup founder.

We evaluated multiple data streaming and messaging platforms. Confluent stood out because it let us:

We chose Confluent not just because it was easier but because it was the only platform that matched our velocity and standards for safety at scale.

The team at Confluent has been first-rate. Through the AI Accelerator Program, they helped us rearchitect our entire event schema -- reducing costs, improving scalability, and delivering unmatched observability for agentic activity. Their expertise and hands-on feedback validated our architecture and accelerated our development.

Agent Taskflow's Streaming Architecture

Using the Confluent data streaming platform, our architecture is structured into three major planes, each represented in our Kafka-based data architecture:

1. Control Plane

  • Responsible for CRUD operations, permissions, licensing, metadata
  • Agent and flow configurations
  • Tasks, control events, marketplace events
  • Schema: ControlEvent, AgentConfig, FlowConfig, BillingEvent

2. Data Plane

  • The runtime core: what agents do, what flows run, how state gets updated
  • Tracks execution events, chat events, embedding events, orchestration events
  • Schema: ExecutionEvent, ChatEvent, EmbeddingEvent, FlowEvent

3. Aggregate Plane

  • High-level derived events for streaming, notification, and UI sync
  • Notifications, audit log
  • Schema: AuditLogEntry, NotificationPayload, DashboardMetric

Each event is typed, traceable, and replayable, providing robust observability and fault tolerance out of the box.
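As a rough illustration of what "replayable" buys you, the sketch below rebuilds flow state by folding over an append-only log. It is a simplification under stated assumptions: FlowStepEvent is a hypothetical type (not one of the schemas listed above), and a Python list stands in for a Kafka topic partition read from a given offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowStepEvent:
    """Hypothetical typed event; not one of the schemas named in this post."""
    flow_id: str
    step: str
    status: str  # "started" or "completed"


def replay(log, from_offset=0):
    """Fold the log into a {flow_id: {step: status}} view, starting at an offset."""
    state = {}
    for event in log[from_offset:]:
        state.setdefault(event.flow_id, {})[event.step] = event.status
    return state


# An append-only list stands in for a Kafka topic partition.
log = [
    FlowStepEvent("flow-1", "sanitize", "started"),
    FlowStepEvent("flow-1", "sanitize", "completed"),
    FlowStepEvent("flow-1", "personalize", "started"),
]

state = replay(log)
```

Because state is derived purely from the log, a crashed consumer can recover by replaying from its last known offset instead of relying on whatever it held in memory.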

This architecture -- where each plane corresponds to a Kafka topic namespace -- enables the real-time responsiveness that makes Agent Taskflow feel alive. This decoupled, event-driven approach allows us to scale teams and observability independently. When you chat with an agent, you can see it thinking in real time, watch flow steps running, get notified when it's awaiting feedback, and observe as it dynamically renames the chat based on the conversation.
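The plane-to-namespace mapping can be sketched as a small routing table. The schema names below are a subset of those listed for each plane; the topic naming convention (plane name, a dot, then the schema name in kebab case) is purely an assumption for illustration and may not match the real topic names.

```python
import re

# A subset of the schema names listed above, grouped by plane.
PLANE_SCHEMAS = {
    "control": ["ControlEvent", "AgentConfig", "FlowConfig"],
    "data": ["ExecutionEvent", "ChatEvent", "EmbeddingEvent", "FlowEvent"],
    "aggregate": ["AuditLogEntry", "NotificationPayload", "DashboardMetric"],
}

# Reverse index: schema name -> plane.
SCHEMA_TO_PLANE = {
    schema: plane for plane, schemas in PLANE_SCHEMAS.items() for schema in schemas
}


def kebab(name):
    """Convert CamelCase to kebab-case, e.g. AuditLogEntry -> audit-log-entry."""
    return re.sub(r"(?<!^)(?=[A-Z])", "-", name).lower()


def topic_for(schema):
    """Build a namespaced topic name (hypothetical convention) for a schema."""
    return f"{SCHEMA_TO_PLANE[schema]}.{kebab(schema)}"


def topics_in_plane(plane, topics):
    """Select the topics that belong to one plane's namespace."""
    prefix = plane + "."
    return [t for t in topics if t.startswith(prefix)]


all_topics = [topic_for(schema) for schema in SCHEMA_TO_PLANE]
control_topics = topics_in_plane("control", all_topics)
```

A prefix per plane keeps access rules and consumer subscriptions simple: a dashboard service, for instance, could subscribe only to the aggregate namespace without knowing about individual schemas.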

All of this is powered by structured events flowing through Confluent. We've even implemented RAG, where events in topics are vectorized and stored in Qdrant. During agent conversations or flows, we run similarity search and inject relevant "memories" or documents into the agent's context window.
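The retrieval step can be sketched without any vector database at all. In the example below, an in-memory list stands in for Qdrant, the three-dimensional vectors are toy values rather than real embeddings, and the memory texts and function names are invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy (text, vector) pairs; a real system would store model-generated embeddings.
memories = [
    ("Client prefers formal tone", [0.9, 0.1, 0.0]),
    ("Transcript must omit patient names", [0.1, 0.9, 0.2]),
    ("Marketing assets due Friday", [0.0, 0.2, 0.9]),
]


def top_k(query_vec, store, k=2):
    """Return the k stored texts most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_context(prompt, query_vec, store, k=2):
    """Prepend the recalled memories to the prompt sent to the agent."""
    recalled = "\n".join(f"- {text}" for text in top_k(query_vec, store, k))
    return f"Relevant memories:\n{recalled}\n\n{prompt}"


query = [0.2, 0.8, 0.1]
context = build_context("Sanitize this transcript.", query, memories, k=1)
```

The design point is that retrieval happens at conversation time: the event stream feeds the vector store continuously, and the agent only pays for the few memories that score highest against the current query.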

How We Use the Confluent Data Streaming Platform Today

Every use case on our platform runs on Confluent because our entire runtime is event-driven. Confluent enables our agents to:

Each of these agents subscribes to real-time event streams and coordinates through shared Kafka topics -- data streaming is the shared language of agents.
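A toy publish/subscribe bus shows the coordination pattern in miniature. This is not Kafka and not our production code: InMemoryBus is a synchronous, single-process stand-in, and the topic names and agent functions are hypothetical, loosely modeled on the healthcare example earlier in this post.

```python
from collections import defaultdict


class InMemoryBus:
    """Toy stand-in for Kafka topics: synchronous, single process, no persistence."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of handlers
        self.history = []                     # every (topic, message) published

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        self.history.append((topic, message))
        for handler in list(self.subscribers[topic]):
            handler(message)


bus = InMemoryBus()


def sanitizer(message):
    """Hypothetical agent: redacts a name, then publishes the cleaned text."""
    bus.publish("transcripts.sanitized", message.replace("Jane Doe", "[REDACTED]"))


def personalizer(message):
    """Hypothetical agent: tailors sanitized text for one audience."""
    bus.publish("content.personalized", f"For clinicians: {message}")


bus.subscribe("transcripts.raw", sanitizer)
bus.subscribe("transcripts.sanitized", personalizer)

bus.publish("transcripts.raw", "Jane Doe reported mild symptoms.")
```

Neither agent calls the other directly. Each one only knows which topic it reads and which topic it writes, so adding a third agent means adding a subscription rather than changing existing code.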

We've integrated Confluent products deeply into our platform:

Connectors

Stream Governance

Benefits We've Seen

What's Next

Using Confluent, we're building an agent marketplace for users to share and monetize flows, agents, and data assets. We're building a local model interface for running local LLMs, a suite of agent-native apps, an identity layer for policy enforcement, and a lightweight SIEM product for auditing agent behavior through stream analytics.

Streaming will remain our backbone -- every action and insight starts as an event.

If you're building enterprise AI, real time isn't optional -- it's foundational. At Agent Taskflow, we believe agents are collaborators, not tools. Building multi-agent systems is hard -- but Confluent makes it possible.

Read more

Check out the original post on Confluent, or visit Agent Taskflow to see the platform in action.
