This report presents the results of the 2026-05-12 Bank HA runner as a public engineering note.

The run used a three-node Kafka 4.2 KRaft cluster with replication factor 3 and one partition per benchmark topic. It is a measurement for this hardware, Incus placement, workload, and configuration. It is not a general product throughput ceiling.

The downloadable result package contains CSV summaries, tx sequence integrity data, the original report, and generated charts.

Test setup

The run used an Incus-based environment on one physical host:

  • CPU: AMD Ryzen 9 9950X3D, 16 cores / 32 threads.
  • Memory: 249 GiB visible to Linux.
  • Coordination: 3 etcd containers.
  • Log and publication path: 3 Kafka 4.2 KRaft brokers, RF=3, one partition per benchmark topic.
  • StateVec Bank runtime: one leader container and one standby container.
  • Workload: mixed_bank_flow from bank_ha_host_runner.

The storage was ordinary consumer NVMe, split by role:

  • infra containers on Lexar NM1090 PRO;
  • Bank leader on Crucial T710;
  • Bank standby on WD_BLACK SN8100.

The container sizing was intentionally modest:

  • Kafka brokers: 3 containers, each 2C / 8GiB.
  • etcd: 3 containers, each 1C / 1GiB.
  • Bank leader and standby: each 4C / 16GiB.

The benchmark path was:

host runner
    |
    | commands
    v
Kafka RF=3 command topic
    |
    v
Bank leader
    |
    | committed TxResult / TxLog
    v
Kafka RF=3 txlog topic
    |
    v
Bank standby durable + apply

Bank leader -> Kafka RF=3 publication topic -> host publication observer

Each benchmark case used a distinct run id and distinct command, replication, and publication topics.

What was measured

The report keeps four signals separate:

  • Measured TPS: average command throughput over measured windows.
  • Publication latency: latency from a sampled event being sent to its publication being observed.
  • Tx sequence completeness: leader committed tx, standby durable tx, and standby applied tx all reach the expected sequence.
  • Standby lag p99: sampled p99 standby replication lag in transactions.

The low-latency target for this run was:

  • p95 publication latency < 7 ms;
  • p99 publication latency < 10 ms;
  • tx sequence completeness must pass.
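The three conditions above combine into a single pass/fail gate per case. A minimal sketch of that check, assuming a hypothetical CaseResult record; the class and field names are illustrative and are not taken from bank_ha_host_runner:

```python
from dataclasses import dataclass

# Limits taken from the target list in this report.
P95_LIMIT_MS = 7.0
P99_LIMIT_MS = 10.0


@dataclass(frozen=True)
class CaseResult:
    """One benchmark case. Field names are hypothetical, not the runner's schema."""
    p95_ms: float
    p99_ms: float
    tx_seq_ok: bool


def meets_low_latency_target(case: CaseResult) -> bool:
    """True only when both latency limits hold and tx sequence completeness passed."""
    return (
        case.p95_ms < P95_LIMIT_MS
        and case.p99_ms < P99_LIMIT_MS
        and case.tx_seq_ok
    )


# Values from the 100,000 target rows of the snappy and none tables.
snappy_100k = CaseResult(p95_ms=6.78, p99_ms=7.33, tx_seq_ok=True)
none_100k = CaseResult(p95_ms=6.85, p99_ms=58.15, tx_seq_ok=True)
```

With these inputs the snappy case passes the gate, while the uncompressed case fails on p99 alone even though its p95 is under the limit.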

Each case used one warmup window and five measured windows. Warmup is excluded from the summaries.
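To make the warmup exclusion concrete, here is a small sketch of how an average over measured windows can be computed. The window length is not stated in this report, so window_seconds is a hypothetical parameter, and the per-window command counts below are made-up illustration values, not data from the run:

```python
def measured_tps(window_counts, window_seconds, warmup_windows=1):
    """Average commands per second over the measured windows only.

    window_counts: commands completed in each window, warmup first.
    window_seconds: length of every window (assumed equal; hypothetical value).
    warmup_windows: number of leading windows to discard.
    """
    measured = window_counts[warmup_windows:]
    if not measured:
        raise ValueError("no measured windows left after dropping warmup")
    return sum(measured) / (len(measured) * window_seconds)


# Illustration only: one warmup window followed by five measured windows.
counts = [90_000, 100_000, 99_000, 101_000, 100_000, 100_000]
tps = measured_tps(counts, window_seconds=10)
```

The slower warmup window in this example does not pull the average down, because it never enters the sum.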

Throughput and latency

[Figure: Measured TPS]

[Figure: Publication latency]

Compression none

Target   Avg TPS  p50 ms  p95 ms  p99 ms  p95<7  p99<10  tx_seq
5,000    4,975    5.17    6.42    6.74    yes    yes     yes
10,000   9,956    5.28    6.21    6.43    yes    yes     yes
20,000   19,910   5.39    6.31    6.55    yes    yes     yes
50,000   49,787   5.74    66.80   84.01   no     no      yes
100,000  99,105   5.83    6.85    58.15   yes    no      yes
150,000  148,033  4.82    5.78    74.20   yes    no      yes

Compression snappy

Target   Avg TPS  p50 ms  p95 ms  p99 ms  p95<7  p99<10  tx_seq
5,000    4,975    5.05    6.35    6.68    yes    yes     yes
10,000   9,956    5.38    6.33    6.62    yes    yes     yes
20,000   19,901   5.53    6.45    6.78    yes    yes     yes
50,000   49,782   5.60    6.50    7.81    yes    yes     yes
100,000  99,299   5.92    6.78    7.33    yes    yes     yes
150,000  148,360  4.95    47.72   76.31   no     no      yes

Compression lz4

Target   Avg TPS  p50 ms  p95 ms  p99 ms  p95<7  p99<10  tx_seq
5,000    4,973    5.31    6.49    6.77    yes    yes     yes
10,000   9,958    5.37    6.30    6.60    yes    yes     yes
20,000   19,911   5.58    6.49    6.85    yes    yes     yes
50,000   49,790   5.66    6.60    134.46  yes    no      yes
100,000  99,244   5.92    6.77    7.37    yes    yes     yes
150,000  147,966  4.97    5.83    8.19    yes    yes     yes

Tx sequence completeness

All 18 Kafka matrix runs passed the tx sequence completeness gate. The gate checks:

  • leader_committed_tx_seq >= expected_tx_seq
  • standby_durable_tx_seq >= expected_tx_seq
  • standby_applied_tx_seq >= expected_tx_seq

Compression  Target   Expected    Leader      Standby durable  Standby applied  Result
none         150,000  18,000,100  18,000,100  18,000,100       18,000,100       yes
snappy       150,000  18,000,100  18,000,100  18,000,100       18,000,100       yes
lz4          150,000  18,000,100  18,000,100  18,000,100       18,000,100       yes

The full tx sequence table is included in the downloadable package.
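The gate itself is three comparisons against the expected sequence. A minimal sketch using the variable names from the checks listed in this section; the function name and signature are illustrative, not the runner's actual code:

```python
def tx_seq_complete(
    expected_tx_seq: int,
    leader_committed_tx_seq: int,
    standby_durable_tx_seq: int,
    standby_applied_tx_seq: int,
) -> bool:
    """Pass only if leader, standby durable, and standby applied all reach expected."""
    observed = (
        leader_committed_tx_seq,
        standby_durable_tx_seq,
        standby_applied_tx_seq,
    )
    return all(seq >= expected_tx_seq for seq in observed)


# Final sequences reported for the 150,000 target rows.
ok = tx_seq_complete(18_000_100, 18_000_100, 18_000_100, 18_000_100)

# Hypothetical failure: standby applied state is one transaction short.
short = tx_seq_complete(18_000_100, 18_000_100, 18_000_100, 18_000_099)
```

Because the comparison is >= rather than ==, a sequence that runs past the expected value still passes; the gate only rejects a component that stopped short.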

Resource profile

The resource data uses process CPU percent, process memory, and disk IO time sampled from the Incus containers. Kafka CPU and IO are averaged across the three broker containers.

[Figure: Snappy CPU]

For snappy, the most consistent compression mode in this run:

Target   Leader CPU  Standby CPU  Kafka CPU  Leader IO  Standby IO  Kafka IO  Lag p99 tx
5,000    22.1        63.6         23.5       83.8       629.3       187.3     0
10,000   24.9        82.4         24.0       119.0      198.4       156.3     82
20,000   30.6        73.4         24.6       130.0      415.5       180.9     887
50,000   42.0        75.3         26.3       133.5      287.1       260.1     2,005
100,000  67.0        44.5         28.5       133.1      562.1       413.8     24,368
150,000  98.1        29.3         39.8       180.2      838.9       475.4     4,910,043

Lag p99 tx is sampled standby replication lag in transactions. It is a runtime pressure signal, not a tx sequence failure. Final leader, standby durable, and standby applied sequences all reached the expected sequence.
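As an illustration of what a sampled p99 lag in transactions can mean, the sketch below computes a nearest-rank 99th percentile over paired sequence samples. This report does not describe how the runner samples lag or which standby sequence it compares against, so the pairing of leader and standby sequence numbers, the nearest-rank method, and every name here are assumptions:

```python
def lag_p99_tx(samples):
    """Nearest-rank p99 of standby lag, in transactions.

    samples: iterable of (leader_tx_seq, standby_tx_seq) pairs taken at the
    same moment. This pairing is an assumed sampling model, not the runner's.
    """
    # Clamp at zero so a standby sample read slightly later never yields negative lag.
    lags = sorted(max(0, leader - standby) for leader, standby in samples)
    if not lags:
        raise ValueError("no lag samples")
    # Nearest-rank percentile: ceil(0.99 * n), computed with integer arithmetic.
    rank = (99 * len(lags) + 99) // 100
    return lags[rank - 1]


# Illustration only: 100 samples whose lag grows from 0 to 99 transactions.
example = [(1_000 + i, 1_000) for i in range(100)]
p99 = lag_p99_tx(example)
```

A percentile like this describes pressure during the run. It says nothing about the final state, which is why it is reported separately from the tx sequence gate.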

[Figure: Standby lag p99]

What this result says

The useful result is not a single headline TPS number. The run shows where pressure appears while keeping state completeness explicit:

  • The matrix contains 18 completed Kafka runs: 6 target rates times 3 compression modes.
  • 13 of 18 runs met both p95 < 7 ms and p99 < 10 ms while passing tx sequence completeness.
  • Snappy was the most consistent compression mode through 100K TPS.
  • LZ4 produced the cleanest 150K TPS result in this matrix: 147,966 measured TPS, p95 5.83 ms, p99 8.19 ms, and tx sequence completeness passed.
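  • Kafka broker CPU was not saturated in the stable low-latency range: in the snappy runs it averaged 23.5 to 28.5 percent per broker through 100K TPS. At higher targets the pressure showed up elsewhere, with leader CPU rising from 22.1 percent at 5K to 98.1 percent at 150K and standby durable IO reaching its highest sampled value at 150K.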
  • All final state completeness checks passed across leader committed state, standby durable TxLog, and standby applied state.
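The 13 of 18 figure can be recomputed from the p95 and p99 columns of the three compression tables. A short sketch with the values transcribed from this report; the tuple layout and variable names are illustrative, and tx sequence completeness is left out of the filter because every run passed it:

```python
from collections import Counter

# (compression, target TPS, p95 ms, p99 ms), transcribed from the tables in this report.
RUNS = [
    ("none", 5_000, 6.42, 6.74), ("none", 10_000, 6.21, 6.43),
    ("none", 20_000, 6.31, 6.55), ("none", 50_000, 66.80, 84.01),
    ("none", 100_000, 6.85, 58.15), ("none", 150_000, 5.78, 74.20),
    ("snappy", 5_000, 6.35, 6.68), ("snappy", 10_000, 6.33, 6.62),
    ("snappy", 20_000, 6.45, 6.78), ("snappy", 50_000, 6.50, 7.81),
    ("snappy", 100_000, 6.78, 7.33), ("snappy", 150_000, 47.72, 76.31),
    ("lz4", 5_000, 6.49, 6.77), ("lz4", 10_000, 6.30, 6.60),
    ("lz4", 20_000, 6.49, 6.85), ("lz4", 50_000, 6.60, 134.46),
    ("lz4", 100_000, 6.77, 7.37), ("lz4", 150_000, 5.83, 8.19),
]

# Count the runs per compression mode that meet both latency limits.
passing = Counter(
    mode for mode, _target, p95, p99 in RUNS if p95 < 7.0 and p99 < 10.0
)
total_passing = sum(passing.values())
```

The per-mode counts come out as 3 for none, 5 for snappy, and 5 for lz4, which sums to the 13 passing runs cited in the list.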

Until a newer run changes the evidence, the LZ4 150K result is the clean public performance anchor for this Kafka HA setup.