This report turns the 2026-05-12 Bank HA runner into a public engineering note.
The run used a three-node Kafka 4.2 KRaft cluster with replication factor 3 and one partition per benchmark topic. It is a measurement for this hardware, Incus placement, workload, and configuration. It is not a general product throughput ceiling.
Download the result package: CSV summaries, tx sequence integrity data, the original report, and generated charts.
Test setup
The run used an Incus-based environment on one physical host:
- CPU: AMD Ryzen 9 9950X3D, 16 cores / 32 threads.
- Memory: 249 GiB visible to Linux.
- Coordination: 3 etcd containers.
- Log and publication path: 3 Kafka 4.2 KRaft brokers, RF=3, one partition per benchmark topic.
- StateVec Bank runtime: one leader container and one standby container.
- Workload:
mixed_bank_flowfrombank_ha_host_runner.
The storage was ordinary consumer NVMe, split by role:
- infra containers on Lexar NM1090 PRO;
- Bank leader on Crucial T710;
- Bank standby on WD_BLACK SN8100.
The container sizing was intentionally modest:
- Kafka brokers: 3 containers, each 2C / 8GiB.
- etcd: 3 containers, each 1C / 1GiB.
- Bank leader and standby: each 4C / 16GiB.
The benchmark path was:
host runner
|
| commands
v
Kafka RF=3 command topic
|
v
Bank leader
|
| committed TxResult / TxLog
v
Kafka RF=3 txlog topic
|
v
Bank standby durable + apply
Bank leader -> Kafka RF=3 publication topic -> host publication observerEach benchmark case used a distinct run id and distinct command, replication, and publication topics.
What was measured
The report keeps four signals separate:
- Measured TPS: average command throughput over measured windows.
- Publication latency: sampled
event sent -> publication observedlatency. - Tx sequence completeness: leader committed tx, standby durable tx, and standby applied tx all reach the expected sequence.
- Standby lag p99: sampled p99 standby replication lag in transactions.
The low-latency target for this run was:
- p95 publication latency < 7 ms;
- p99 publication latency < 10 ms;
- tx sequence completeness must pass.
Each case used one warmup window and five measured windows. Warmup is excluded from the summaries.
Throughput and latency
Compression none
| Target | Avg TPS | p50 ms | p95 ms | p99 ms | p95<7 | p99<10 | tx_seq |
|---|---|---|---|---|---|---|---|
| 5,000 | 4,975 | 5.17 | 6.42 | 6.74 | yes | yes | yes |
| 10,000 | 9,956 | 5.28 | 6.21 | 6.43 | yes | yes | yes |
| 20,000 | 19,910 | 5.39 | 6.31 | 6.55 | yes | yes | yes |
| 50,000 | 49,787 | 5.74 | 66.80 | 84.01 | no | no | yes |
| 100,000 | 99,105 | 5.83 | 6.85 | 58.15 | yes | no | yes |
| 150,000 | 148,033 | 4.82 | 5.78 | 74.20 | yes | no | yes |
Compression snappy
| Target | Avg TPS | p50 ms | p95 ms | p99 ms | p95<7 | p99<10 | tx_seq |
|---|---|---|---|---|---|---|---|
| 5,000 | 4,975 | 5.05 | 6.35 | 6.68 | yes | yes | yes |
| 10,000 | 9,956 | 5.38 | 6.33 | 6.62 | yes | yes | yes |
| 20,000 | 19,901 | 5.53 | 6.45 | 6.78 | yes | yes | yes |
| 50,000 | 49,782 | 5.60 | 6.50 | 7.81 | yes | yes | yes |
| 100,000 | 99,299 | 5.92 | 6.78 | 7.33 | yes | yes | yes |
| 150,000 | 148,360 | 4.95 | 47.72 | 76.31 | no | no | yes |
Compression lz4
| Target | Avg TPS | p50 ms | p95 ms | p99 ms | p95<7 | p99<10 | tx_seq |
|---|---|---|---|---|---|---|---|
| 5,000 | 4,973 | 5.31 | 6.49 | 6.77 | yes | yes | yes |
| 10,000 | 9,958 | 5.37 | 6.30 | 6.60 | yes | yes | yes |
| 20,000 | 19,911 | 5.58 | 6.49 | 6.85 | yes | yes | yes |
| 50,000 | 49,790 | 5.66 | 6.60 | 134.46 | yes | no | yes |
| 100,000 | 99,244 | 5.92 | 6.77 | 7.37 | yes | yes | yes |
| 150,000 | 147,966 | 4.97 | 5.83 | 8.19 | yes | yes | yes |
Tx sequence completeness
All 18 Kafka matrix runs passed the tx sequence completeness gate. The gate checks:
leader_committed_tx_seq >= expected_tx_seqstandby_durable_tx_seq >= expected_tx_seqstandby_applied_tx_seq >= expected_tx_seq
| Compression | Target | Expected | Leader | Standby durable | Standby applied | Result |
|---|---|---|---|---|---|---|
| none | 150,000 | 18,000,100 | 18,000,100 | 18,000,100 | 18,000,100 | yes |
| snappy | 150,000 | 18,000,100 | 18,000,100 | 18,000,100 | 18,000,100 | yes |
| lz4 | 150,000 | 18,000,100 | 18,000,100 | 18,000,100 | 18,000,100 | yes |
The full tx sequence table is included in the downloadable package.
Resource profile
The resource data uses process CPU percent, process memory, and disk IO time sampled from the Incus containers. Kafka CPU and IO are averaged across the three broker containers.
For snappy, the most consistent compression mode in this run:
| Target | Leader CPU | Standby CPU | Kafka CPU | Leader IO | Standby IO | Kafka IO | Lag p99 tx |
|---|---|---|---|---|---|---|---|
| 5,000 | 22.1 | 63.6 | 23.5 | 83.8 | 629.3 | 187.3 | 0 |
| 10,000 | 24.9 | 82.4 | 24.0 | 119.0 | 198.4 | 156.3 | 82 |
| 20,000 | 30.6 | 73.4 | 24.6 | 130.0 | 415.5 | 180.9 | 887 |
| 50,000 | 42.0 | 75.3 | 26.3 | 133.5 | 287.1 | 260.1 | 2,005 |
| 100,000 | 67.0 | 44.5 | 28.5 | 133.1 | 562.1 | 413.8 | 24,368 |
| 150,000 | 98.1 | 29.3 | 39.8 | 180.2 | 838.9 | 475.4 | 4,910,043 |
Lag p99 tx is sampled standby replication lag in transactions. It is a runtime pressure signal, not a tx sequence failure. Final leader, standby durable, and standby applied sequences all reached the expected sequence.
What this result says
The useful result is not a single headline TPS number. The run shows where pressure appears while keeping state completeness explicit:
- The matrix contains 18 completed Kafka runs: 6 target rates times 3 compression modes.
- 13 of 18 runs met both p95 < 7 ms and p99 < 10 ms while passing tx sequence completeness.
- Snappy was the most consistent compression mode through 100K TPS.
- LZ4 produced the cleanest 150K TPS result in this matrix: 147,966 measured TPS, p95 5.83 ms, p99 8.19 ms, and tx sequence completeness passed.
- Kafka broker CPU was not saturated in the stable low-latency range. At higher targets, leader CPU and standby durable IO became more visible.
- All final state completeness checks passed across leader committed state, standby durable TxLog, and standby applied state.
Until a newer run changes the evidence, 150K is the clean public performance anchor for this Kafka HA setup.