ClickHouse

ClickHouse is a fast, open-source column-oriented database management system. Configuring d8a to use ClickHouse is straightforward and requires no external setup.

Configuration

Tip

Full configuration reference is available here.

Add the following to your config.yaml file:

warehouse:
  driver: clickhouse
clickhouse:
  host: localhost
  port: "9000"
  database: d8a
  username: default
  password: "your-password"

Important notes

  • Engine Support: d8a currently supports only the MergeTree engine. Tables are created with ENGINE = MergeTree().
  • Distributed/Replicated Setups: Distributed and Replicated table setups are not supported at the moment. Use a single ClickHouse instance or request this feature in GitHub issues.
  • Nullability: Columns that are nullable in the schema are stored in ClickHouse as NOT NULL with a DEFAULT. This avoids Nullable(T) storage overhead while preserving semantic nullability. Missing or nil values are automatically converted to type-specific defaults (e.g., '' for strings, 0 for numbers, '1970-01-01' for dates).
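
The nullability rule above can be pictured with a short Go sketch. This is not d8a's implementation: the helper names `orDefault` and `dateOrDefault`, and the use of nil pointers to model missing values, are assumptions made purely for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// orDefault returns the value v points to, or the zero value of T when v is nil.
// Hypothetical helper: it mimics the "missing value becomes a type-specific
// default" rule described above, not d8a's real code.
func orDefault[T any](v *T) T {
	if v == nil {
		var zero T
		return zero
	}
	return *v
}

// dateOrDefault maps a missing date to 1970-01-01, the date default named above.
func dateOrDefault(v *time.Time) time.Time {
	if v == nil {
		return time.Unix(0, 0).UTC()
	}
	return *v
}

func main() {
	var name *string   // missing string
	var count *int64   // missing number
	var day *time.Time // missing date

	fmt.Printf("%q\n", orDefault(name))                  // ""
	fmt.Println(orDefault(count))                        // 0
	fmt.Println(dateOrDefault(day).Format("2006-01-02")) // 1970-01-01
}
```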

Metadata

ClickHouse-specific optimizations can be applied via Arrow field metadata. This is useful for developers implementing new columns; it cannot currently be controlled from configuration. The supported keys are:

Metadata Key | Value | Description
clickhouse.low_cardinality | "true" | Wraps the column type as LowCardinality(T) for improved storage efficiency and query performance on columns with low cardinality. Applies to any type supported by ClickHouse.
clickhouse.codec | "CODEC(Delta, ZSTD)" | Appends a CODEC clause to the column definition in ClickHouse DDL. The value should be a complete CODEC clause (e.g., CODEC(Delta, ZSTD) or CODEC(Delta)). Use the meta.Codec(codec, compressionAlg) helper function to generate the value. Codec metadata does not affect semantic schema compatibility checks.
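
To make the effect of the two keys concrete, here is a minimal Go sketch of how a column definition might be rendered from a metadata map. It is an illustration only: `columnDDL` is a hypothetical function, `codecClause` is a stand-in for the real `meta.Codec` helper (whose exact signature and behavior are not shown in this page), and the column names `country` and `ts` are invented.

```go
package main

import (
	"fmt"
	"strings"
)

// Metadata keys taken from the table above.
const (
	keyLowCardinality = "clickhouse.low_cardinality"
	keyCodec          = "clickhouse.codec"
)

// codecClause is a hypothetical stand-in for the meta.Codec helper: it joins
// the non-empty parts into a complete CODEC clause.
func codecClause(codec, compressionAlg string) string {
	var parts []string
	for _, p := range []string{codec, compressionAlg} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return "CODEC(" + strings.Join(parts, ", ") + ")"
}

// columnDDL renders one column definition, applying the two metadata keys.
// Hypothetical: d8a's real DDL generation may differ.
func columnDDL(name, chType string, md map[string]string) string {
	if md[keyLowCardinality] == "true" {
		chType = "LowCardinality(" + chType + ")"
	}
	ddl := name + " " + chType
	if c := md[keyCodec]; c != "" {
		ddl += " " + c
	}
	return ddl
}

func main() {
	fmt.Println(columnDDL("country", "String", map[string]string{keyLowCardinality: "true"}))
	fmt.Println(columnDDL("ts", "DateTime", map[string]string{keyCodec: codecClause("Delta", "ZSTD")}))
}
```

Under these assumptions the two calls print `country LowCardinality(String)` and `ts DateTime CODEC(Delta, ZSTD)`.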

Verifying your setup

After configuring ClickHouse, start d8a and check the logs. You should see messages indicating successful connection to ClickHouse.