How ClickHouse Query Profiler works ?

chistadata
Posted on February 27, 2022

How does ClickHouse Query Profiler work?


The core objective of ClickHouse Query Profiler is to measure the performance metrics of SQLs and understand where the application is spending most of the time, This helps to successfully complete ClickHouse performance audit/diagnostics/forensics by tracing CPU time against wall-clock time including idle time. At ChistaDATA we spend a considerable amount of time troubleshooting (of course lots of research included) ClickHouse performance and we are committed to building optimal & reliable ClickHouse infrastructure for customers worldwide.

Configuring ClickHouse Query Profiler

ClickHouse trace_log system table records the profiler operations, This setting is configured by default. But, The data on this table is valid only for a running server so after the ClickHouse server restarts, ClickHouse does not clean up this table and so all the persistent virtual memory addresses will become invalid. We have copied below the default server configuration file (config.xml) for your reference:

<trace_log>
    <database>system</database>
    <table>trace_log</table>
    <partition_by>toYYYYMM(event_date)</partition_by>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
</trace_log>

[table id=9 /]

How to configure query_log and other _log tables for set up TTL, and/or some other cleanup procedures:

cat /etc/clickhouse-server/config.d/query_log.xml
<yandex>
    <query_log replace="1">
        <database>system</database>
        <table>query_log</table>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
        <engine>
ENGINE = MergeTree
PARTITION BY event_date
ORDER BY (event_time)
TTL event_date + interval 90 day
SETTINGS ttl_only_drop_parts=1
        </engine>
    </query_log>
</yandex>

The configuration parameters query_profiler_real_time_period_ns and query_profiler_cpu_time_period_ns dump stacktraces all the threads which execute the query. By default, it collects information only about queries when runs longer than 1 sec (and collects stacktraces every second).

To record how RAM is used by ClickHouse queries you can configure the setting memory_profiler_sample_probability