Part 2 - Getting Started: Setting Up ClickHouse for Observability

Introduction

In our previous post, we explored why ClickHouse is a powerful choice for observability. Now it's time to get hands-on. This post guides you through setting up ClickHouse from scratch using Docker, with a focus on preparing it for high-throughput log ingestion.

Whether you're experimenting locally or laying the foundation for a multi-node deployment, this setup provides a reliable starting point.


Deployment Options

ClickHouse supports several deployment options depending on your environment and goals:

Method | Best Use Case
Binary (.tar.gz) | Bare-metal servers with full control
Docker | Quick local setup and dev testing
Kubernetes | Scalable production environments via the Altinity Kubernetes Operator
Cloud Services | Fully managed setup via ClickHouse Cloud or Altinity.Cloud

For this guide, we’ll focus on Docker, which is fast to set up and ideal for development and prototyping.
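If you just want to try a single node without Compose, you can also run the official image directly. A minimal sketch (image name and ports as published on Docker Hub):

docker run -d --name clickhouse-server \
  -p 8123:8123 -p 9000:9000 \
  clickhouse/clickhouse-server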


Running ClickHouse via Docker Compose

Inspired by ClickHouse's examples repo, I created a companion repository with the full end-to-end setup.

Assuming you have Docker installed, clone the repository and run:

docker compose up -d

This brings up two services (a sketch of the compose file follows the list):

  • ClickHouse server with 1 shard and 1 replica
  • ClickHouse Keeper (for coordination)
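For reference, here is a minimal sketch of what such a compose file could look like. The service names and single-shard wiring are illustrative assumptions; the actual file in the repository may differ:

# docker-compose.yml -- illustrative sketch only
services:
  clickhouse:
    image: clickhouse/clickhouse-server
    ports:
      - "8123:8123"   # HTTP interface
      - "9000:9000"   # native protocol
    depends_on:
      - keeper
  keeper:
    image: clickhouse/clickhouse-keeper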

Once the containers are up and running, you can access the ClickHouse Play interface at http://localhost:8123/play.

  • 8123: HTTP interface (you can verify it with the curl check below)
  • 9000: ClickHouse native protocol port
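As a quick check of the HTTP interface from the command line, send a query over the same port the Play interface uses:

curl 'http://localhost:8123/' --data-binary 'SELECT version()'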

Sanity Check: Insert and Query Logs

Load up the Play interface and try this:

CREATE TABLE test_logs (
  timestamp DateTime,
  message String
) ENGINE = MergeTree
ORDER BY timestamp;

INSERT INTO test_logs VALUES (now(), 'Service started');

SELECT * FROM test_logs ORDER BY timestamp DESC;

This validates that ingestion and querying are working as expected.
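The same sanity check also works through the native client inside the container. A sketch, assuming the server container is named clickhouse-server (use whatever name your compose file assigns):

docker exec -it clickhouse-server clickhouse-client --query "SELECT * FROM test_logs ORDER BY timestamp DESC"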

Monitoring ClickHouse Health

ClickHouse exposes system-level observability via:

  • System tables such as system.metrics, system.events, and system.parts
  • The /ping HTTP endpoint for liveness checks
  • A Prometheus-compatible /metrics endpoint (when enabled in the server config; see the sketch below)

Example:

SELECT * FROM system.metrics WHERE value > 0;
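On a self-managed server, the /metrics endpoint becomes available once the server config enables it. A minimal sketch of the relevant config section (dropped into config.d/, for example; port 9363 is the conventional choice, adjust as needed):

<clickhouse>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>9363</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
</clickhouse>

After restarting the server, Prometheus can scrape http://localhost:9363/metrics.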

In later posts, we’ll cover how to integrate these metrics into dashboards.

What's Next?

You now have a ClickHouse instance running and ready to ingest logs. In the next post, we'll build a production-grade ingestion pipeline using:

  • Fluent Bit / Splunk Universal Forwarder
  • OpenTelemetry Collector
  • Kafka (optional but recommended)

Our architecture and workflow will look like this:

[Image: ClickHouse workflow diagram]

We’ll also explore schema design for log data that balances performance, compression, and query speed.

Stay tuned for the next part in the series, and share your feedback or questions in the comments below.