How to Set Up ClickHouse and Metabase with Docker Compose
ClickHouse + Metabase via Docker Compose — an open-source OLAP analytics stack that handles billions of rows and gives every business team self-serve dashboards, deployable in under 30 minutes.
Most analytics infrastructure conversations in the enterprise Microsoft ecosystem start and end with Power BI and Fabric — and for good reason. But there is a category of use case where a lightweight, open-source OLAP stack is the faster, simpler, and more cost-effective path: a startup building its first analytics capability, a data engineering team prototyping an analytics layer before committing to a managed cloud platform, or an organisation that needs high-performance columnar analytics on a constrained infrastructure budget. The ClickHouse and Metabase Docker combination addresses exactly this scenario — a columnar OLAP database that queries billions of rows in seconds, paired with a self-serve BI tool that non-technical business users can operate without analyst support, deployable on any machine with Docker installed in under 30 minutes.
Why ClickHouse and Metabase Together
ClickHouse and Metabase solve two complementary problems. ClickHouse solves the query performance problem — it is a column-oriented OLAP database designed specifically for analytical workloads, capable of scanning and aggregating billions of rows in seconds on modest hardware through its columnar storage, vectorised execution engine, and aggressive compression. Metabase solves the distribution problem — it is an open-source BI tool with a question-and-dashboard interface that business users can operate without SQL knowledge, an admin interface that data teams can manage, and a clean connection layer to ClickHouse and dozens of other data sources.
Neither tool requires a cloud account, a managed service subscription, or per-seat licensing in their open-source form. Running both on a single server or on a developer's laptop via Docker takes minutes and produces a working analytics stack — one that can handle event analytics, log analysis, product analytics, and operational dashboards at scales that would require significantly larger infrastructure on traditional row-oriented databases.
"ClickHouse with Metabase is the analytics stack that eliminates the backlog. Business teams stop waiting for dashboards to be built because they can build their own. Data engineers stop maintaining bespoke report queries because ClickHouse makes ad hoc aggregation fast enough to run live."
What ClickHouse Is and Why It's Fast
ClickHouse is an open-source column-oriented database management system originally developed at Yandex and now maintained by ClickHouse Inc. Its architecture is specifically designed for analytical workloads that require scanning large amounts of data and aggregating it — the type of query that runs slowly on row-oriented databases like PostgreSQL or MySQL regardless of how well the query is optimised.
The performance advantage comes from four architectural properties working together. Columnar storage means that a query reading only three columns from a 50-column table reads only those three columns' data from disk, not the full row width — dramatically reducing I/O for analytical queries. Vectorised execution processes batches of column values together using CPU SIMD instructions rather than processing one row at a time. Aggressive compression (ClickHouse achieves 6–10× compression ratios on typical analytical data) means more data fits in memory and cache. And native parallelism across CPU cores means that a complex aggregation query automatically uses all available CPU cores without configuration.
The result is a database where analytical queries that take minutes on PostgreSQL often run in seconds or milliseconds on ClickHouse for the same data, even on modest hardware. As an illustrative figure, a 1 billion row event table that takes 45 seconds to aggregate on a well-indexed PostgreSQL instance will typically aggregate in under 1 second on ClickHouse on the same hardware. Actual timings depend on schema design, query shape, and the hardware in use.
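The I/O argument behind columnar storage can be checked with simple arithmetic. The sketch below is a deliberately simplified model, not a description of ClickHouse internals: it assumes fixed-width 8-byte values, no compression, and no indexes, and the table dimensions are made-up illustration values.

```python
# Simplified I/O model: how many bytes must be read to answer a query
# that touches 3 columns of a 50-column table.
# Assumptions (illustrative only): fixed 8-byte values, no compression, no indexes.

NUM_ROWS = 1_000
NUM_COLUMNS = 50
COLUMNS_NEEDED = 3
VALUE_SIZE = 8  # bytes per value, e.g. a 64-bit integer


def bytes_read_row_store(num_rows: int, num_columns: int, value_size: int) -> int:
    """A row layout stores whole rows together, so a scan reads every column."""
    return num_rows * num_columns * value_size


def bytes_read_column_store(num_rows: int, columns_needed: int, value_size: int) -> int:
    """A column layout stores each column separately, so a scan reads only what it needs."""
    return num_rows * columns_needed * value_size


row_bytes = bytes_read_row_store(NUM_ROWS, NUM_COLUMNS, VALUE_SIZE)
col_bytes = bytes_read_column_store(NUM_ROWS, COLUMNS_NEEDED, VALUE_SIZE)

print(f"row layout:    {row_bytes:,} bytes")
print(f"column layout: {col_bytes:,} bytes")
print(f"reduction:     {row_bytes / col_bytes:.1f}x")
```

With these assumed numbers the row layout reads 400,000 bytes against 24,000 for the column layout, before compression is taken into account at all.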
What Metabase Is and Who It Serves
Metabase is an open-source business intelligence tool with two key design priorities: making it easy for non-technical business users to explore data without writing SQL, and making it quick for data teams to deploy and maintain without a dedicated BI platform engineering investment. Its Question builder interface lets business users select a table, apply filters, choose a visualisation type, and save the result as a dashboard card — without SQL. Its SQL editor gives data analysts the option to write raw SQL queries when needed. Its dashboard builder assembles Question results and SQL-based charts into shareable, refreshable dashboards.
Metabase's connection to ClickHouse is handled through a dedicated ClickHouse driver, which began as a community project and talks to ClickHouse over its HTTP interface using JDBC. The combination is well-tested for production use. Metabase exposes ClickHouse tables, runs user queries against ClickHouse, caches results where configured, and serves dashboard results to any authenticated user, with a permission model that controls which users can see which data and which databases.
Prerequisites Before You Begin
The setup requires Docker and Docker Compose installed on the host machine. Docker Desktop (for macOS and Windows) includes both; Linux installations require Docker Engine and the Docker Compose plugin installed separately. Verify the installation by running docker --version and docker compose version in a terminal — both should return version numbers without errors. A minimum of 4 GB of RAM available to Docker is recommended; 8 GB allows comfortable operation with sample datasets of tens of millions of rows.
Step 1 — The Docker Compose File
Create a project directory and save the following docker-compose.yml file in it. This configuration starts ClickHouse and Metabase as networked services with persistent data volumes, so data survives container restarts.
version: "3.8"

services:
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    container_name: clickhouse
    ports:
      - "8123:8123"   # HTTP interface (used by Metabase JDBC)
      - "9000:9000"   # Native TCP interface (clickhouse-client)
    volumes:
      - clickhouse_data:/var/lib/clickhouse
    environment:
      CLICKHOUSE_DB: analytics
      CLICKHOUSE_USER: default
      CLICKHOUSE_PASSWORD: ""   # Empty for local dev; set in prod
      CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT: 1
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    networks:
      - analytics_net

  metabase:
    image: metabase/metabase:latest
    container_name: metabase
    ports:
      - "3000:3000"
    environment:
      MB_DB_TYPE: h2   # Built-in H2 for dev; use Postgres in prod
      JAVA_TIMEZONE: UTC
    depends_on:
      - clickhouse
    networks:
      - analytics_net

volumes:
  clickhouse_data:

networks:
  analytics_net:
    driver: bridge
Two notes on this configuration. First, the ClickHouse password is left empty for local development convenience — in any non-local environment, set a strong password in both the CLICKHOUSE_PASSWORD environment variable and in the Metabase connection settings below. Second, Metabase uses its built-in H2 database to store its own metadata (questions, dashboards, user accounts) in this configuration — for production use, replace H2 with a PostgreSQL instance as Metabase's application database to avoid data loss if the Metabase container is rebuilt.
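For non-local environments, one possible way to avoid hard-coding the password is to generate it and keep it in a `.env` file next to docker-compose.yml, which Docker Compose reads for variable substitution. The sketch below only builds the line; the helper name `make_env_line` is hypothetical, and writing the file is left to the reader.

```python
import secrets


def make_env_line(var_name: str = "CLICKHOUSE_PASSWORD", num_bytes: int = 24) -> str:
    """Return a 'NAME=value' line with a random URL-safe password.

    token_urlsafe only emits letters, digits, '-' and '_', so the value
    needs no quoting in a .env file.
    """
    password = secrets.token_urlsafe(num_bytes)
    return f"{var_name}={password}"


env_line = make_env_line()
print(env_line)  # append this line to a .env file in the project directory
```

The Compose file can then reference the value as `CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}`, and the same password goes into the Metabase connection settings.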
Step 2 — Start the Services
From the project directory containing the docker-compose.yml file, run the following command to start both services in detached mode:
docker compose up -d
Docker pulls the ClickHouse and Metabase images on the first run (this takes 2–5 minutes depending on connection speed) and starts both containers. On subsequent runs, the images are already local and both services start in seconds. Monitor the startup logs with docker compose logs -f — ClickHouse is ready when the logs show "Application: Ready for connections" and Metabase is ready when the logs show "Metabase Initialization COMPLETE".
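As an alternative to watching the logs, readiness can be polled over HTTP. The following is a minimal sketch, assuming the ports from the Compose file above, ClickHouse's `/ping` endpoint, and Metabase's `/api/health` endpoint. The function names `http_ok` and `wait_until_ready` are hypothetical helpers, not part of either tool.

```python
import http.client
import time
import urllib.request


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers connection refused, timeouts, and non-2xx responses
        # (urllib raises HTTPError, a subclass of OSError, for those).
        return False


def wait_until_ready(probe, attempts: int = 30, delay: float = 2.0) -> bool:
    """Call probe() until it returns True or the attempts run out."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False


# Example usage (requires the stack to be starting or running):
# wait_until_ready(lambda: http_ok("http://localhost:8123/ping"))
# wait_until_ready(lambda: http_ok("http://localhost:3000/api/health"))
```

Passing the probe in as a callable keeps the retry loop independent of the endpoint being checked, so the same loop serves both services.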
Step 3 — Verify ClickHouse Is Running
Verify the ClickHouse HTTP interface is responding by opening http://localhost:8123 in a browser — a plain text response of "Ok." confirms ClickHouse is up. To run queries interactively, connect via the ClickHouse HTTP interface using curl or via the ClickHouse client built into the container:
# HTTP interface check
curl http://localhost:8123

# Connect via ClickHouse client in the container
docker exec -it clickhouse clickhouse-client

# Run a test query inside the client
SELECT version();
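The same HTTP interface can be queried from a script. This sketch assumes the local development settings above (port 8123, the `analytics` database, the default user with an empty password). `build_query_url` and `run_query` are hypothetical helper names, and `run_query` is defined but not called here because it needs the container to be running.

```python
import urllib.parse
import urllib.request


def build_query_url(sql: str, host: str = "localhost", port: int = 8123,
                    database: str = "analytics") -> str:
    """Build a GET URL that passes the SQL text in the 'query' parameter."""
    params = urllib.parse.urlencode(
        {"database": database, "query": sql},
        quote_via=urllib.parse.quote,  # encode spaces as %20 rather than '+'
    )
    return f"http://{host}:{port}/?{params}"


def run_query(sql: str, **kwargs) -> str:
    """Send a read-only query and return the plain-text response body."""
    with urllib.request.urlopen(build_query_url(sql, **kwargs), timeout=10) as resp:
        return resp.read().decode("utf-8").strip()


print(build_query_url("SELECT version()"))
# With the stack running: print(run_query("SELECT version()"))
```

GET requests suit read-only queries such as this one; statements that modify data are normally sent in the body of a POST request instead.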
Step 4 — Load Sample Data into ClickHouse
Create a sample events table and load some data to verify the stack end-to-end. The following SQL creates a simple clickstream events table using ClickHouse's MergeTree engine — the standard table engine for most analytical use cases — and inserts sample rows:
-- Connect via: docker exec -it clickhouse clickhouse-client
CREATE TABLE IF NOT EXISTS analytics.events
(
    event_id    UInt64,
    event_date  Date,
    event_type  LowCardinality(String),
    user_id     UInt32,
    session_id  UInt64,
    country     LowCardinality(String),
    revenue     Decimal(10, 2)
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, event_type, user_id);
-- Insert sample rows
INSERT INTO analytics.events VALUES
(1, '2026-01-01', 'page_view', 1001, 10001, 'GB', 0.00),
(2, '2026-01-01', 'purchase', 1001, 10001, 'GB', 49.99),
(3, '2026-01-02', 'page_view', 1002, 10002, 'US', 0.00),
(4, '2026-01-02', 'sign_up', 1002, 10002, 'US', 0.00),
(5, '2026-01-03', 'purchase', 1003, 10003, 'AU', 99.00);
-- Verify
SELECT event_type, COUNT(*) as cnt, SUM(revenue) as total_revenue
FROM analytics.events
GROUP BY event_type
ORDER BY cnt DESC;
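Five rows are enough to verify the pipeline, but dashboards are easier to evaluate with more data. The sketch below generates synthetic rows in tab-separated form, in the same column order as the `analytics.events` table above. The value ranges, the 90-day window, and the helper names are arbitrary assumptions chosen for illustration.

```python
import random
from datetime import date, timedelta

EVENT_TYPES = ["page_view", "sign_up", "purchase"]
COUNTRIES = ["GB", "US", "AU"]


def generate_rows(n: int, start: date = date(2026, 1, 1), days: int = 90, seed: int = 42):
    """Yield n synthetic event rows matching the analytics.events column order."""
    rng = random.Random(seed)  # fixed seed so repeated runs produce the same data
    for event_id in range(1, n + 1):
        event_type = rng.choice(EVENT_TYPES)
        # Only purchases carry revenue; other events are recorded as 0.00.
        revenue = rng.uniform(5, 200) if event_type == "purchase" else 0.0
        yield (
            event_id,
            (start + timedelta(days=rng.randrange(days))).isoformat(),
            event_type,
            rng.randint(1000, 9999),      # user_id
            rng.randint(10_000, 99_999),  # session_id
            rng.choice(COUNTRIES),
            f"{revenue:.2f}",
        )


def to_tsv(rows) -> str:
    """Serialise rows as tab-separated text, one row per line."""
    return "".join("\t".join(str(value) for value in row) + "\n" for row in rows)


tsv = to_tsv(generate_rows(1000))
print(tsv.splitlines()[0])  # show the first generated row
```

Saved to a file such as events.tsv, the output could be loaded with `docker exec -i clickhouse clickhouse-client --query "INSERT INTO analytics.events FORMAT TabSeparated" < events.tsv`. Because the event IDs restart at 1, truncate the table first if the five sample rows are already present.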
Step 5 — Connect Metabase to ClickHouse
Open Metabase at http://localhost:3000. On first access, Metabase runs a setup wizard — create an admin account with your email and a password, then proceed to the database connection step.
To add ClickHouse as a database in Metabase, navigate to Settings → Admin → Databases → Add a database. Select ClickHouse from the database type dropdown (the ClickHouse driver is included in recent Metabase versions; if not listed, it can be added as a plugin JAR from the ClickHouse Metabase driver releases). Enter the connection details:
Database type: ClickHouse
Display name: ClickHouse Analytics
Host: clickhouse (Docker service name, not localhost)
Port: 8123
Database name: analytics
Username: default
Password: (leave blank for local dev config above)
Use a secure connection (SSL): OFF (for local dev)
Click Save. Metabase runs a connection test — on success, the ClickHouse database appears in the left panel when creating a new Question. Navigate to New → Question, select the ClickHouse Analytics database, pick the events table, and the question builder loads the table's columns for exploration. Business users can now filter by country, group by event_type, visualise revenue as a bar chart, and save the result to a dashboard — no SQL required.
When to Use This Stack vs Microsoft Fabric and Power BI
| Consideration | ClickHouse + Metabase (Docker) | Microsoft Fabric + Power BI |
|---|---|---|
| Licence cost | Open source — no licence cost for core stack | Fabric capacity + Power BI licences required |
| Setup time | Under 30 minutes with Docker Compose | Hours to days for full enterprise setup |
| Query performance at scale | Excellent — sub-second on billions of rows | Excellent — Fabric Lakehouse + DirectLake |
| Enterprise governance | Manual — no built-in sensitivity labels, RLS governance tooling | Full — Purview integration, RLS, certifications, audit logs |
| Self-serve analytics | Good — Metabase question builder for non-SQL users | Excellent — Power BI Copilot, Explore, certified datasets |
| Microsoft ecosystem integration | Custom connectors required | Native — Teams, SharePoint, Azure AD, Office 365 |
| Best for | Startups, prototypes, event analytics, cost-constrained deployments | Enterprise BI, regulated industries, multi-source cross-functional analytics |
Key Takeaways
- ClickHouse is a columnar OLAP database that achieves sub-second query performance on billions of rows through columnar storage, vectorised execution, aggressive compression, and native parallelism — all on modest hardware.
- Metabase is an open-source BI tool with a no-SQL question builder for business users, a SQL editor for analysts, and a dashboard builder — making the ClickHouse query performance accessible to non-technical audiences.
- The Docker Compose configuration starts both services networked together with persistent volumes in a single command — Metabase connects to ClickHouse using the Docker service name as the host, not localhost.
- ClickHouse's MergeTree table engine is the correct engine for most analytical use cases — the PARTITION BY and ORDER BY clauses determine query performance for time-series and filtered analytical queries.
- For production deployments, replace Metabase's H2 application database with PostgreSQL and set a strong ClickHouse password — the Docker Compose configuration above is calibrated for local development and evaluation.
- This open-source stack is best suited for startups, prototypes, event analytics, and cost-constrained deployments — for enterprise analytics requiring governance, compliance, and Microsoft ecosystem integration, Power BI and Microsoft Fabric remain the appropriate choice.
Next Steps and Production Considerations
Once the ClickHouse and Metabase Docker stack is running locally and validated with sample data, three steps make it production-ready for team use. First, add a PostgreSQL service to the Docker Compose file as Metabase's application database — Metabase's H2 built-in database is not suitable for production because it does not support concurrent connections well and loses data if the container is rebuilt without a volume. Second, set a strong ClickHouse password and configure Metabase's connection to use it — ClickHouse's HTTP interface is accessible on port 8123 and should not be exposed publicly without authentication. Third, consider reverse proxy configuration (Nginx or Traefik in Docker) to serve Metabase at a domain name with HTTPS rather than directly on port 3000.
For teams evaluating whether to build on this open-source stack long-term or to migrate to a managed analytics platform as the organisation scales, the decision point is typically governance complexity and cross-system data integration requirements. When those requirements grow beyond what ClickHouse and Metabase manage natively — sensitivity label enforcement, Power BI Copilot for business users, integration with Microsoft 365, or enterprise audit logging — Microsoft Fabric and Power BI become the more appropriate foundation. For guidance on that transition, see our post on the SAP to Microsoft Fabric integration and our overview of Fabric Data Pipelines and Fast Copy.