Cosmos DB in Microsoft Fabric: Enterprise Architect’s Guide

Cosmos DB in Microsoft Fabric: What Enterprise Architects Need to Know

Cosmos DB in Microsoft Fabric brings a managed NoSQL database with vector search, automatic OneLake integration, and unified Fabric billing to the same platform as your Lakehouses and Power BI semantic models.

Microsoft Fabric launched as an analytics-first platform: Lakehouses for data engineering, SQL databases for structured operational data, KQL Eventhouses for real-time streaming analytics, Notebooks for data science, and Power BI for reporting, all sharing OneLake storage and a single Fabric capacity billing model. What Fabric conspicuously lacked, until now, was a first-class NoSQL document database. The preview launch of Cosmos DB in Microsoft Fabric fills that gap. For enterprise data architects already operating on Fabric, it changes the calculus for a specific and increasingly important category of workload: AI-native applications that need to combine document storage, vector search, and analytical access to enterprise data in a single platform, without stitching together Azure Cosmos DB, Azure AI Search, and Fabric as three separately managed services.

What Cosmos DB in Fabric Actually Is

Cosmos DB in Microsoft Fabric is a managed NoSQL database workload that runs within the Fabric platform — not a connector to Azure Cosmos DB, but a Fabric-native implementation built on the same underlying Azure Cosmos DB infrastructure. It supports the NoSQL API (the JSON document model with SQL-like query syntax), provides the same global distribution, high availability, and low-latency guarantees as Azure Cosmos DB, and adds a Fabric-specific integration: the database's data is automatically mirrored into OneLake in Delta Parquet format, making it instantly queryable from Fabric Lakehouses, Spark Notebooks, and Power BI without any ETL pipeline.

This positions Cosmos DB in Fabric not as a standalone database but as a transactional serving layer that is natively integrated into the broader Fabric analytics estate. An agentic AI application can write its session state, document records, and vector embeddings to Cosmos DB in Fabric; those same records are simultaneously available for batch analytics in a Lakehouse and for Power BI reporting through the semantic model — without a single pipeline job moving data between systems.

"Cosmos DB in Fabric is not just another database option — it is the operational serving layer that enterprise AI applications have been missing from the Fabric platform. The automatic OneLake mirroring is what makes it architecturally significant: transactional writes become analytical reads without any intervening pipeline."

Three Design Pillars: Simplified, Autonomous, AI-Optimised

Microsoft has described Cosmos DB in Fabric around three design principles that are relevant to enterprise adoption decisions.

Simplified setup. Creating a Cosmos DB database in Fabric requires only a name — no network configuration, no backup policy decisions, no throughput provisioning at creation time. The database starts within seconds with sensible defaults, and data can be loaded immediately via the Fabric data explorer UI (for JSON files) or via the Cosmos DB SDK (for programmatic inserts). For developers building AI applications who need a document store quickly, this is a materially lower barrier than provisioning a standalone Azure Cosmos DB account with its associated security, networking, and configuration decisions.

Autonomous scaling. During the preview period, Cosmos DB in Fabric automatically manages throughput scaling without developer intervention. Consumption is metered in Capacity Units (CUs) but is not billed during the preview — GA billing will align with Fabric's CU-based model, providing unified capacity reporting alongside Lakehouse and Spark workloads rather than a separate Azure Cosmos DB billing line.

AI-optimised capabilities. Cosmos DB in Fabric inherits Azure Cosmos DB's vector indexing, full-text search, and hybrid search capabilities. These are the specific features that make Cosmos DB a natural fit for retrieval-augmented generation (RAG) workloads — AI applications that need to retrieve contextually relevant documents from a knowledge base to ground language model responses in accurate, up-to-date enterprise data.
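As a rough illustration of what enabling vector indexing involves, the sketch below builds the two policy documents that Azure Cosmos DB for NoSQL uses for vector search. It assumes the Fabric workload accepts the same policy shapes; the `/embedding` path, the 1536 dimensions, and the helper name `vector_policies` are hypothetical choices for the example.

```python
# Sketch only: the path, dimensions, and index type below are illustrative
# assumptions, not values taken from the article or from Fabric defaults.

def vector_policies(path="/embedding", dimensions=1536,
                    distance="cosine", index_type="quantizedFlat"):
    """Build the two policy dicts Azure Cosmos DB for NoSQL uses for vector search."""
    if distance not in {"cosine", "dotproduct", "euclidean"}:
        raise ValueError(f"unsupported distance function: {distance}")
    if index_type not in {"flat", "quantizedFlat", "diskANN"}:
        raise ValueError(f"unsupported vector index type: {index_type}")

    # Declares where the embedding lives and how similarity is measured.
    embedding_policy = {
        "vectorEmbeddings": [{
            "path": path,
            "dataType": "float32",
            "distanceFunction": distance,
            "dimensions": dimensions,
        }]
    }
    # Adds a vector index and keeps the raw vector out of the regular index.
    indexing_policy = {
        "indexingMode": "consistent",
        "includedPaths": [{"path": "/*"}],
        "excludedPaths": [{"path": f"{path}/*"}],
        "vectorIndexes": [{"path": path, "type": index_type}],
    }
    return embedding_policy, indexing_policy
```

In the azure-cosmos Python SDK these dicts are passed as the `vector_embedding_policy` and `indexing_policy` arguments when a container is created. Whether the Fabric preview allows container creation through the SDK in the same way is worth confirming during a proof of concept.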

The OneLake Integration: Why It Changes the Architecture Equation

The most architecturally significant feature of Cosmos DB in Microsoft Fabric is not the database itself; it is the automatic, zero-pipeline mirroring of Cosmos DB data into OneLake in Delta Parquet format. This single integration changes the design constraints for a large category of enterprise application architectures.

Before Cosmos DB in Fabric, a common enterprise data architecture pattern involved an operational layer (Azure Cosmos DB for application data, document storage, real-time serving) and a separate analytical layer (Fabric Lakehouse for batch analytics and BI), with a data pipeline between them — typically Azure Data Factory or an event-driven Cosmos DB change feed processor — moving data from the operational store into the Lakehouse on a scheduled or streaming basis. This pipeline adds latency, cost, and operational complexity: it is another component to monitor, version, and recover when it fails.

With Cosmos DB in Fabric, that pipeline disappears. The database writes are automatically reflected in OneLake as Delta Parquet, queryable through a Fabric Lakehouse shortcut, joinable with other OneLake data in Spark Notebooks, and available to Power BI through the semantic model. An application that writes a customer interaction record to Cosmos DB at 2pm has that record available for an analytics query in a Fabric Lakehouse shortly afterwards, as soon as mirroring has replicated it, rather than after the next nightly ETL run.
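To make the analytical side concrete, here is a minimal sketch of reading the mirrored data from a Spark Notebook. It assumes the mirrored table has been exposed through a Lakehouse shortcut under `Tables/`; the workspace, Lakehouse, and table names are hypothetical, and `spark` is the session a Fabric Notebook provides.

```python
def onelake_table_uri(workspace: str, item: str, table: str) -> str:
    """Build an abfss URI for a Delta table in OneLake.

    `item` carries its type suffix, for example "Sales.Lakehouse".
    """
    for part in (workspace, item, table):
        if not part or "/" in part:
            raise ValueError(f"invalid path segment: {part!r}")
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{item}/Tables/{table}"
    )


def load_mirrored_table(spark, table_uri: str):
    """Read the mirrored Delta table; no copy job or pipeline is involved."""
    return spark.read.format("delta").load(table_uri)


# Hypothetical names, shown for illustration only:
# uri = onelake_table_uri("SalesWorkspace", "Sales.Lakehouse", "interactions")
# interactions = load_mirrored_table(spark, uri)
```

The resulting DataFrame can be joined with any other OneLake table in the same Notebook, which is the practical meaning of "no pipeline" here.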

The reverse ETL pattern is also supported: analysed or enriched data from Spark Notebooks can be written back into Cosmos DB for low-latency application serving. A recommendation model that runs in a Fabric Notebook at midnight can write its output directly to Cosmos DB, where the application layer reads it with millisecond latency the following morning — with no intermediate serving database required.
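The write-back step might look like the sketch below. It assumes `container` is an azure-cosmos `ContainerProxy` (or anything exposing `upsert_item`) and that the container is partitioned on `/userId`; the field names and the `rec-` id prefix are hypothetical.

```python
from typing import Iterable, Mapping


def write_recommendations(container, rows: Iterable[Mapping],
                          model_version: str) -> int:
    """Upsert one recommendation document per user and return the count."""
    written = 0
    for row in rows:
        doc = {
            # Stable id, so a rerun overwrites the previous night's output.
            "id": f"rec-{row['user_id']}",
            "userId": str(row["user_id"]),  # assumed partition key
            "items": list(row["items"]),
            "modelVersion": model_version,
        }
        container.upsert_item(body=doc)
        written += 1
    return written
```

Using an upsert with a deterministic id keeps the job idempotent: the application always reads exactly one current recommendation document per user.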

AI and Agentic Application Use Cases

The vector indexing and hybrid search capabilities of Cosmos DB in Fabric make it specifically valuable for three AI application patterns that enterprise data teams are actively building.

Retrieval-Augmented Generation (RAG). RAG applications combine a language model with a knowledge retrieval mechanism — the model is grounded in relevant context retrieved from a document store rather than relying on its training data alone. Cosmos DB in Fabric's vector indexing stores the embedding representations of documents and retrieves the most semantically relevant records for a given user query. Combined with Fabric's integration with Azure OpenAI Service, the result is an enterprise RAG application where the knowledge base is a Cosmos DB database in Fabric, the embeddings are maintained in the same Cosmos DB container, and the analytical view of which documents are being retrieved (and whether the retrieval quality is sufficient) is available through OneLake to the BI and data science teams.
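The retrieval step of such a RAG application can be sketched as follows. It uses the `VectorDistance` function from the Cosmos DB NoSQL query language and the `query_items` method of the azure-cosmos Python SDK; the `embedding` and `text` field names are hypothetical, and the query vector is assumed to come from whichever embedding model the application uses.

```python
from typing import Any, Dict, List, Sequence


def build_vector_query(query_vector: Sequence[float], top_k: int = 5,
                       vector_field: str = "embedding") -> Dict[str, Any]:
    """Return a parameterised nearest-neighbour query for the NoSQL API."""
    if not isinstance(top_k, int) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    if not vector_field.isidentifier():
        raise ValueError("vector_field must be a plain property name")
    query = (
        f"SELECT TOP {top_k} c.id, c.text, "
        f"VectorDistance(c.{vector_field}, @queryVector) AS score "
        f"FROM c ORDER BY VectorDistance(c.{vector_field}, @queryVector)"
    )
    parameters = [{"name": "@queryVector", "value": list(query_vector)}]
    return {"query": query, "parameters": parameters}


def retrieve_context(container, query_vector: Sequence[float],
                     top_k: int = 5) -> List[dict]:
    """Fetch the documents most similar to the query vector."""
    spec = build_vector_query(query_vector, top_k)
    return list(container.query_items(
        query=spec["query"],
        parameters=spec["parameters"],
        enable_cross_partition_query=True,
    ))
```

The returned `text` values would then be placed into the language model prompt as grounding context.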

Agentic AI session state and memory. AI agent frameworks — including those using Microsoft's Semantic Kernel or AutoGen — require a persistent store for agent session state, conversation history, tool call logs, and multi-agent coordination records. Cosmos DB's document model (schema-flexible JSON) and low-latency read/write performance make it the natural session state store for agentic applications. Storing this state in Cosmos DB in Fabric rather than a standalone Cosmos DB account means the agent's operational data is automatically available for analytics — monitoring agent behaviour patterns, identifying failure modes, and analysing the quality of agent task completions — without a separate logging pipeline.
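One possible document shape for agent session state is sketched below. The field names, the `AgentTurn` class, and the choice of `/sessionId` as partition key are assumptions made for illustration, not a schema prescribed by Semantic Kernel, AutoGen, or Fabric.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class AgentTurn:
    session_id: str
    role: str          # for example "user", "assistant", or "tool"
    content: str
    tool_calls: List[dict] = field(default_factory=list)

    def to_document(self) -> dict:
        """Serialise the turn as a schema-flexible JSON document."""
        return {
            "id": str(uuid.uuid4()),
            "sessionId": self.session_id,  # assumed partition key
            "type": "turn",
            "role": self.role,
            "content": self.content,
            "toolCalls": self.tool_calls,
            "createdAt": datetime.now(timezone.utc).isoformat(),
        }


def load_history(container, session_id: str) -> List[dict]:
    """Read one session's turns in order from a single partition."""
    return list(container.query_items(
        query=("SELECT * FROM c WHERE c.sessionId = @sid "
               "AND c.type = 'turn' ORDER BY c.createdAt"),
        parameters=[{"name": "@sid", "value": session_id}],
        partition_key=session_id,
    ))
```

Keeping every turn of a session in one partition means the agent reloads its history with a single-partition query, while the same documents remain available in OneLake for behaviour analysis.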

Real-time personalisation. Applications that serve personalised content, recommendations, or user-specific configurations at low latency read from Cosmos DB's millisecond-response serving layer. The analytics that inform those personalisation models (user behaviour analysis, content performance metrics, cohort segmentation) run in the Fabric Lakehouse and Notebook environment. With Cosmos DB in Fabric, the serving layer and the analytics layer share the same data in OneLake, closing the feedback loop between analytical model output and real-time application behaviour.

Security Architecture: Entra ID and Workspace Permissions

Cosmos DB in Fabric uses Microsoft Entra ID as its primary authentication mechanism, consistent with the rest of the Fabric platform. Workspace permissions defined in Fabric are automatically enforced on data plane requests to the Cosmos DB database — a user with Viewer permissions in the workspace has read-only access to the Cosmos DB data; a Contributor has read/write access. This workspace-level permission inheritance is a significant simplification compared to managing separate Cosmos DB account-level access policies and Fabric workspace permissions independently.

Service principal authentication is supported for programmatic access — applications connecting to Cosmos DB in Fabric from outside the browser use a service principal registered in Entra ID with appropriate workspace permissions, rather than requiring a dedicated Cosmos DB connection string or access key. Tenant-level Private Links are supported for network security, with workspace-level Private Link support planned for a future release. The database endpoint follows the standard Cosmos DB endpoint format, and existing Cosmos DB SDKs connect to it without modification.
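A minimal connection sketch using a service principal is shown below. `ClientSecretCredential` and `CosmosClient` are the standard azure-identity and azure-cosmos classes; the `COSMOS_ENDPOINT` variable name and the helper functions are hypothetical, and the endpoint value is assumed to be the one shown for the database in the Fabric portal.

```python
import os

REQUIRED = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID",
            "AZURE_CLIENT_SECRET", "COSMOS_ENDPOINT")


def load_settings(env=os.environ) -> dict:
    """Collect connection settings and fail early if any are missing."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED}


def connect(settings: dict, database: str, container: str):
    """Return a container client authenticated through Entra ID."""
    # Imported lazily so this module loads even without the SDKs installed.
    from azure.identity import ClientSecretCredential
    from azure.cosmos import CosmosClient

    credential = ClientSecretCredential(
        tenant_id=settings["AZURE_TENANT_ID"],
        client_id=settings["AZURE_CLIENT_ID"],
        client_secret=settings["AZURE_CLIENT_SECRET"],
    )
    # No connection string or account key: access is governed by the
    # service principal's workspace permissions.
    client = CosmosClient(settings["COSMOS_ENDPOINT"], credential=credential)
    return client.get_database_client(database).get_container_client(container)
```

Because no key is involved, revoking the service principal's workspace role is enough to cut off the application's access.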

Billing During Preview and What to Expect at GA

During the current preview period, Cosmos DB in Fabric usage is not billed — organisations can evaluate the feature and build against it without CU charges for the Cosmos DB workload itself. This is consistent with Microsoft's approach to other Fabric preview features and provides a meaningful window for enterprise adoption evaluation before budget commitments are required.

At general availability, Cosmos DB usage will be reported as Capacity Units and billed against the organisation's Fabric capacity — the same unified billing model that covers Lakehouse, Spark, SQL database, Eventhouse, and Power BI workloads. This means Cosmos DB in Fabric will consume from the same CU pool as other Fabric workloads and will be visible in the Fabric Capacity Metrics App alongside those workloads. For capacity governance — particularly for organisations already managing CU utilisation closely on existing workloads — planning for the Cosmos DB CU contribution before GA is a prudent preparation step. Our post on limiting capacity utilisation in Microsoft Fabric covers the controls available for managing that aggregate consumption.

When to Use Cosmos DB in Fabric vs Lakehouse, SQL Database, and Eventhouse

Adding Cosmos DB in Fabric to the platform creates a four-database-workload landscape within Fabric. Understanding when each workload is the right choice — rather than treating them as interchangeable — is the core architectural decision that data teams need to make.

Use Cosmos DB in Fabric when your workload requires low-latency (millisecond) read/write access to semi-structured or document data from an application layer, when you need vector search or hybrid search for AI/RAG workloads, when the data model is schema-flexible and evolves rapidly (e.g. AI agent state, session records, event payloads), or when you need the operational serving layer and the analytics layer to share data without a pipeline between them.

Use Fabric Lakehouse when your workload is primarily analytical — batch processing, historical analysis, data science, ETL — and does not require low-latency application serving. The Lakehouse is the right home for large-volume historical data, complex transformations, and the governed Gold layer that Power BI reads from. It is not the right home for operational data that applications need to write to and read from at millisecond latency.

Use Fabric SQL Database when your workload requires relational data with ACID transactions, complex joins, and structured schema enforcement. The SQL database in Fabric is the right choice for operational relational workloads — order management, financial transaction processing, inventory — where the relational model provides integrity guarantees that Cosmos DB's document model does not.

Use Fabric Eventhouse (KQL) when your workload involves high-throughput time-series data, IoT telemetry, log analytics, or any scenario where data arrives as a continuous stream and needs to be queried with time-based and pattern-based KQL queries. Cosmos DB is not designed for high-throughput stream ingestion; Eventhouse is.
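The four rules above can be condensed into a small decision helper. This is a deliberately simplified sketch: the attribute names and the order in which the checks run are assumptions, and real workloads often need more than one of these stores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    low_latency_serving: bool = False    # millisecond reads/writes from an app
    needs_vector_search: bool = False    # RAG, hybrid search
    relational_acid: bool = False        # joins, schema enforcement, ACID
    streaming_time_series: bool = False  # telemetry, logs, event streams


def recommend(workload: Workload) -> str:
    """Map workload traits to the Fabric data store the guidance points to."""
    if workload.streaming_time_series:
        return "Eventhouse (KQL)"
    if workload.relational_acid:
        return "SQL Database"
    if workload.needs_vector_search or workload.low_latency_serving:
        return "Cosmos DB in Fabric"
    return "Lakehouse"  # primarily analytical, batch-oriented work
```

For example, `recommend(Workload(needs_vector_search=True))` returns "Cosmos DB in Fabric", while a workload with no operational traits falls through to the Lakehouse.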

Fabric Database Workload Comparison

| Workload | Data Model | Latency Profile | Primary Use Case | AI Capabilities |
|---|---|---|---|---|
| Cosmos DB in Fabric | JSON documents (NoSQL API) | Milliseconds (operational) | AI apps, RAG, agent state, real-time serving | Vector indexing, full-text, hybrid search |
| Fabric Lakehouse | Delta Parquet (files + tables) | Seconds–minutes (analytical) | Batch ETL, historical analytics, ML training | Spark ML, Notebooks, model training datasets |
| Fabric SQL Database | Relational (tables, schemas) | Milliseconds–seconds | Operational relational workloads, ACID transactions | Limited (relational model only) |
| Fabric Eventhouse (KQL) | Time-series / event streams | Milliseconds (query) | IoT, telemetry, log analytics, real-time intelligence | Anomaly detection, pattern queries |

Enterprise Adoption Considerations and Next Steps

For enterprise data architects evaluating Cosmos DB in Microsoft Fabric, the preview period is the right time to build a proof of concept against a representative AI application use case — specifically a RAG application or an agent session state store — and validate the OneLake mirroring latency, the vector search query performance, and the SDK connectivity behaviour in the organisation's network environment.
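One way to validate mirroring latency in a proof of concept is to write a marker document and poll the analytical side until it appears. The sketch below is generic: `write` and `is_visible` are hypothetical callables that would wrap a Cosmos DB upsert and a Lakehouse query respectively, and the measured value is only accurate to within one polling interval.

```python
import time
from typing import Callable


def measure_visibility_lag(write: Callable[[str], None],
                           is_visible: Callable[[str], bool],
                           marker_id: str,
                           timeout_s: float = 300.0,
                           poll_s: float = 1.0,
                           clock: Callable[[], float] = time.monotonic,
                           sleep: Callable[[float], None] = time.sleep) -> float:
    """Return the seconds between a write and its first analytical sighting."""
    start = clock()
    write(marker_id)
    while True:
        if is_visible(marker_id):
            return clock() - start
        if clock() - start > timeout_s:
            raise TimeoutError(f"{marker_id} not visible after {timeout_s}s")
        sleep(poll_s)
```

Running this repeatedly at different times of day, and under load, gives a distribution rather than a single number, which is what an architecture decision should rest on.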

The preview's no-billing window also provides an opportunity to assess the CU impact of the Cosmos DB workload on the existing Fabric capacity before GA billing is activated. Running a representative workload load test during preview and monitoring the Fabric Capacity Metrics App for CU consumption patterns gives the data engineering team the baseline they need to plan capacity for GA without surprises.
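The baseline arithmetic is simple enough to sketch. A Fabric F-SKU's number is its Capacity Unit count (an F64 provides 64 CUs), so a 24-hour window on F64 offers 64 × 86,400 = 5,529,600 CU-seconds. The workload figures and the 80% planning ceiling below are hypothetical.

```python
def capacity_share(workload_cu_seconds: float, capacity_cus: int,
                   window_seconds: int = 86_400) -> float:
    """Fraction of the capacity's CU-seconds a workload used in the window."""
    if capacity_cus <= 0 or window_seconds <= 0:
        raise ValueError("capacity_cus and window_seconds must be positive")
    return workload_cu_seconds / (capacity_cus * window_seconds)


def headroom_after(existing_share: float, new_share: float,
                   ceiling: float = 0.8) -> float:
    """Remaining share below a planning ceiling once the new workload lands."""
    return ceiling - (existing_share + new_share)


# Hypothetical load test: 276,480 CU-seconds observed over 24 hours on F64.
cosmos_share = capacity_share(276_480, capacity_cus=64)   # 5% of the capacity
remaining = headroom_after(existing_share=0.60, new_share=cosmos_share)
```

A negative `remaining` value would signal that the capacity needs resizing, or other workloads need throttling, before GA billing begins.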

For organisations that are not yet on Microsoft Fabric but are evaluating the platform, Cosmos DB in Fabric strengthens the case for Fabric as a complete data platform rather than a BI-and-analytics overlay — it now covers operational document workloads that previously required a separate Azure service. If your team is assessing the Fabric platform for enterprise adoption or designing the architecture for an AI application programme that needs to combine document storage, vector search, and analytical BI, speak with a certified Microsoft Fabric consultant at Numlytics. For the capacity management context relevant to adding new Fabric workloads, see our posts on limiting capacity utilisation in Microsoft Fabric and Microsoft Fabric capacity overage.