AIResearch

AI Agents Can Now Safely Manage Enterprise Data

A new infrastructure design allows AI agents to work on sensitive data without risking errors or security breaches, solving trust issues that have blocked adoption in industries.

AI Research
March 26, 2026
4 min read

AI agents have advanced rapidly in recent years, but most companies still hesitate to let them handle production data due to concerns about trust and governance. In a new position paper, researchers argue that the key to making AI agents trustworthy in enterprise settings lies not in improving the agents themselves, but in redesigning the underlying data infrastructure. They propose a system called Bauplan, which reimagines the lakehouse—a standard cloud platform for data and AI workloads—to safely accommodate the unique access patterns of AI agents. This approach addresses a critical bottleneck: as AI capabilities grow, infrastructure limitations have prevented agents from being deployed in sensitive data environments, where errors like dropping tables or polluting data with hallucinations could have costly consequences.

The researchers found that by focusing on infrastructure first, they could enable correct concurrent workloads for multiple AI agents while ensuring data integrity. Their core insight is that if infrastructure guarantees safe isolation of data and compute through a unified application programming interface (API), governance becomes straightforward through API-based access control. This mirrors successful patterns from database systems, specifically multi-version concurrency control (MVCC), which manages multiple users by giving each the illusion of being alone through transactions. However, directly applying MVCC to lakehouses fails because lakehouses are distributed, multi-language systems with decoupled storage and compute, unlike monolithic databases. Bauplan adapts these principles by introducing novel abstractions for data isolation, compute isolation, and programming, tailored to the agentic lakehouse environment.
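To make the MVCC idea concrete, here is a toy sketch (all names are hypothetical, not Bauplan's actual API) of the "illusion of being alone": each reader pins an immutable snapshot of the catalog at transaction start, so concurrent writers can publish new versions without changing what that reader sees.

```python
from types import MappingProxyType

class VersionedCatalog:
    """Toy MVCC-style catalog: readers get an immutable snapshot,
    writers publish a new version without disturbing open readers."""

    def __init__(self):
        self._tables = {}   # current version: table name -> rows
        self._version = 0

    def snapshot(self):
        """Return a read-only view frozen at the current version."""
        return self._version, MappingProxyType(dict(self._tables))

    def commit(self, table, rows):
        """Publish a new version; existing snapshots are unaffected."""
        self._tables = {**self._tables, table: rows}
        self._version += 1

catalog = VersionedCatalog()
catalog.commit("orders", [1, 2, 3])
version, snap = catalog.snapshot()      # reader pins version 1
catalog.commit("orders", [1, 2, 3, 4])  # a concurrent write lands
assert snap["orders"] == [1, 2, 3]      # reader still sees its snapshot
```

The point of the paper is that this single-node trick does not transfer directly to a lakehouse, where storage and compute are decoupled and many engines touch the same tables; the sketch only illustrates the isolation guarantee Bauplan sets out to preserve.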

To achieve this, the methodology involves three main components. First, for data isolation, Bauplan uses immutable snapshots and a branching mechanism similar to Git, where each write is recorded as a commit with a parent, allowing efficient modeling of concurrent changes across multiple tables. This ensures that pipelines—sequences of data transformations—can be handled atomically, preventing inconsistent states if an agent fails mid-process. Second, for compute isolation, Bauplan adopts a Function-as-a-Service (FaaS) model, where each function runs in a containerized environment with its own Python or SQL engine, isolated from other functions and the internet. This sandboxes agent code to prevent malicious actions. Third, for programming abstractions, Bauplan provides declarative APIs that agents can use to chain functions together, with I/O mediated by the platform rather than direct file access, simplifying correctness and enabling performance optimizations.
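The Git-like data model described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions—`Commit`, `DataCatalog`, and the branch names are invented for this example and do not reflect Bauplan's real interfaces—showing why branching is cheap (a branch is just a pointer) and why an agent's writes stay invisible to `main` until a merge.

```python
import uuid

class Commit:
    """An immutable write: a full table->data mapping plus a parent
    link, mirroring 'each write is a commit with a parent'."""
    def __init__(self, tables, parent=None):
        self.id = uuid.uuid4().hex[:8]
        self.tables = dict(tables)  # snapshot of all tables at this commit
        self.parent = parent

class DataCatalog:
    """Toy Git-like catalog: branches are named pointers to commits."""
    def __init__(self):
        self.branches = {"main": Commit({})}

    def create_branch(self, name, from_branch="main"):
        # A branch is just a pointer copy: zero data movement, O(1).
        self.branches[name] = self.branches[from_branch]

    def write(self, branch, table, data):
        head = self.branches[branch]
        self.branches[branch] = Commit({**head.tables, table: data}, parent=head)

    def merge(self, source, target="main"):
        # Fast-forward merge: target now points at source's head commit.
        self.branches[target] = self.branches[source]

catalog = DataCatalog()
catalog.create_branch("agent-run-1")
catalog.write("agent-run-1", "clean_orders", [10, 20])
assert "clean_orders" not in catalog.branches["main"].tables  # isolated
catalog.merge("agent-run-1")
assert catalog.branches["main"].tables["clean_orders"] == [10, 20]
```

Because commits are immutable and carry parent links, changes spanning several tables land on `main` as one pointer swap, which is what makes multi-table pipeline writes atomic.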

The results show that Bauplan successfully enables transactional pipelines across heterogeneous compute environments, as illustrated in Figure 2. Without temporary branches, a two-node pipeline failing after one node leaves the main data branch in an inconsistent state, even if individual table transactions succeed. With Bauplan's run API, which opens a temporary branch for each pipeline execution, atomic writes are guaranteed on success, and isolation is maintained on failure. This ensures that downstream readers are protected from inconsistent data. Additionally, the system supports self-healing pipelines, as shown in Figure 3, where an agent can attempt to fix a failed run within a ReAct loop, with all writes occurring on a data branch and human verification required before merging to production. The declarative APIs also facilitate governance by allowing platform-level checks, such as whitelisting packages in decorators, reducing attack vectors.
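The run-on-a-temporary-branch pattern can be condensed into a short sketch. Everything here is hypothetical (`run_pipeline` and the node functions are made up for illustration, and the "branch" is just a scratch copy of a dict), but it captures the guarantee from Figure 2: either every node's output reaches main together, or none does—and it shows how a self-healing retry replays the fixed pipeline on a fresh branch.

```python
def run_pipeline(main_tables, nodes):
    """Run all nodes on a scratch copy (the 'temporary branch');
    publish atomically on success, leave main untouched on failure."""
    scratch = dict(main_tables)       # temporary branch: isolated copy
    try:
        for table, fn in nodes:
            scratch[table] = fn()     # each node writes on the branch only
    except Exception:
        return main_tables, False     # discard the branch; main stays consistent
    return scratch, True              # atomic "merge": all writes land at once

def clean():
    return [1, 2, 3]

def enrich():
    raise RuntimeError("agent bug: hallucinated column")

main = {"raw": [0]}
main, ok = run_pipeline(main, [("clean", clean), ("enrich", enrich)])
assert not ok and "clean" not in main  # partial writes never reached main

# Self-healing sketch: the agent retries with a fixed node, still on a
# branch; only the final (human-approved) merge publishes the result.
def enrich_fixed():
    return [4, 5, 6]

main, ok = run_pipeline(main, [("clean", clean), ("enrich", enrich_fixed)])
assert ok and main["enrich"] == [4, 5, 6]
```

The key design choice mirrored here is that failure handling is structural, not defensive: the agent never needs cleanup logic, because an abandoned branch simply is never merged.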

The implications of this work are significant for industries relying on data engineering and AI. By solving concurrency and correctness at the infrastructure level, Bauplan provides a principled path to governance, making it easier for enterprises to deploy AI agents safely. This could accelerate adoption in areas like data pipeline management, where agents can automate complex tasks without risking data corruption or security breaches. The researchers emphasize that the bottleneck to scaling trustworthy data engineering is no longer AI intelligence but infrastructure, and Bauplan offers a concrete solution that bridges the gap between agentic capabilities and production readiness.

However, the paper acknowledges limitations. Bauplan is a position paper with a reference implementation, and its principles are tool-agnostic, but real-world deployment may face challenges such as integration with existing lakehouse systems or performance overhead from branching mechanisms. The researchers note that the landscape is evolving quickly, and further work is needed to validate the system at scale and across diverse use cases. Additionally, while Bauplan addresses governance through API-based control, it assumes agents operate within defined boundaries; unforeseen agent behaviors or adversarial attacks might require additional safeguards. The paper concludes by advocating for correctness by construction in agent-first lakehouses, suggesting that future research should explore extensions to other AI workloads and broader ecosystem compatibility.

Original Source

Read the complete research paper

View on arXiv

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.

Connect on LinkedIn