SysId, UK, and Indexes: The Exchange Test Case

The first design question was identity.

If an engine owns business state, every record needs a stable way to be identified. That sounds simple, but the choice affects storage, replay, lookup, and schema evolution. It also changes how domain code talks about state.

The first inspiration came from relational databases.

Relational systems separate internal row identity from business-facing access paths. A table may have a surrogate id, unique constraints, and indexes. Those ideas map surprisingly well to an engine-managed state model, as long as we keep the responsibilities separate.

The early design started with the simplest part: every record gets a SysId.

Why SysId was a good starting point

SysId is an engine-assigned identifier. Every record has one.

That gave the runtime a simple foundation:

the engine can allocate identity;
records have stable internal references;
lifecycle management is uniform;
storage can be organized without understanding every business key;
replay and recovery can refer to records consistently;
domain code can point to a specific record without copying a large key everywhere.

This was not a bad design. It was the right first layer.

record
  |
  +--> SysId          engine identity
  +--> schema fields  business data
  +--> revision       version / change tracking

The important point is that SysId gives the engine a stable handle for each record, separate from the business fields that may change or evolve.
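A minimal sketch of that first layer, in Python. The names Record and Store, and the choice of a plain integer counter for SysId, are assumptions made for illustration; the article does not specify how the engine allocates or stores identifiers.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Record:
    sys_id: int        # engine-assigned, never changes
    fields: dict       # business data, free to change or evolve
    revision: int = 0  # bumped on every update


class Store:
    """Hypothetical engine-side store keyed only by SysId."""

    def __init__(self):
        self._next_id = count(1)
        self._records = {}

    def insert(self, fields):
        # The engine, not the caller, picks the identifier.
        rec = Record(sys_id=next(self._next_id), fields=dict(fields))
        self._records[rec.sys_id] = rec
        return rec.sys_id

    def update(self, sys_id, **changes):
        rec = self._records[sys_id]
        rec.fields.update(changes)
        rec.revision += 1

    def get(self, sys_id):
        return self._records[sys_id]
```

Note that the only lookup this store offers is get(sys_id). Business fields can change freely, but nothing here can find a record by any of them.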

But a stable handle is not the same thing as a business access path.

Where SysId was not enough

Business systems rarely ask only one question.

They do not just ask:

Which record has this internal id?

They also ask:

find an account by account id;
find an order by external order id;
scan orders for one instrument;
list open orders for one account;
traverse one side of an order book by price and priority;
find workflows by tenant and business key;
enforce uniqueness on fields that matter to the domain.

With only SysId, those questions have to be answered somewhere else. The domain layer starts building maps, byte-array keys, side indexes, or derived structures around the engine.

That is the warning sign. If every serious business module has to rebuild its own lookup and ordering layer, then the engine has not captured enough of the state model.

At this point, the design needed a better test case.

The exchange test case

The test case I kept coming back to was a trading exchange.

Not because the engine should only be used for exchanges, and not because every business system has an order book. The exchange case is useful because it is unforgiving. A matching engine forces the model to answer hard questions about ordering, identity, recovery, and determinism.

An order book has:

instruments;
bid and ask sides;
price levels;
time priority within a price;
partial fills;
cancels;
replaces;
recovery requirements.

The core rule is simple to say: match orders by price-time priority.

The modeling question is harder:

Where does that priority live?

The first tempting answer was to keep orders as canonical records and treat the book as derived state.

canonical orders
      |
      v
derived order book
      |
      +--> price levels
      +--> queues
      +--> best bid / best ask

That sounds clean, but it hides a problem.

If two orders sit at the same price, their relative priority determines future matching behavior. If the engine restarts and rebuilds the book in a different order, the future behavior can change. That means the ordering is not merely an optimization and not merely a cache.

It is business state.
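A small Python illustration of the problem. The Order tuple and the first_to_fill helper are hypothetical, and prices are simplified to integers. Two bids rest at the same price; the only difference between the two rebuilds is the tie-breaker used within that price.

```python
from collections import namedtuple

Order = namedtuple("Order", "order_id price qty priority_seq")

# Two resting bids at the same price. "B" arrived first (lower sequence),
# even though its id sorts after "A".
orders = [
    Order("A", 100, 5, priority_seq=2),
    Order("B", 100, 5, priority_seq=1),
]


def first_to_fill(resting, sort_key):
    """Return the id of the bid an incoming sell would hit first."""
    book = sorted(resting, key=sort_key)
    return book[0].order_id


# Rebuild that preserves arrival priority: highest price, then sequence.
by_priority = first_to_fill(orders, lambda o: (-o.price, o.priority_seq))

# Rebuild that only knows the order id: highest price, then id.
by_order_id = first_to_fill(orders, lambda o: (-o.price, o.order_id))

print(by_priority, by_order_id)
```

The two rebuilds pick different orders to fill first, so the next trade depends on which tie-breaker survived the restart.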

The derived-state trap

The word "derived" can be misleading.

Some data is safely derived. A dashboard count can be recomputed. A read model can be rebuilt. A cached best bid can be refreshed from canonical records.

But some data only looks derived. If losing it changes future business behavior, then it belongs in the canonical model.

For an order book, price-time priority is one of those cases. The system must preserve enough information to recover the same ordering:

order
  |
  +--> order id
  +--> instrument
  +--> side
  +--> normalized price
  +--> priority sequence

Once that became clear, the design moved from a single SysId model to a three-part model: SysId, UK (unique key), and canonical index.

UK and index do different jobs

The current design does not use one overloaded key concept.

It separates three responsibilities:

record
  |
  +--> SysId       engine-owned identity
  +--> UK          business uniqueness / exact lookup
  +--> index       deterministic ordered access path

SysId is for engine identity. It gives the runtime a stable internal handle for storage, references, lifecycle, logging, and recovery.

UK is for business uniqueness. It answers questions like:

find this order by order id;
find this asset by asset id;
reject a duplicate client order id;
enforce that a business identifier belongs to only one committed record.

UK supports exact lookup by the full key only. Prefix listing and range scans do not belong on it; those are the job of an index.
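As a rough sketch of that contract in Python: the UniqueKey class, its method names, and the DuplicateKey exception are all assumptions for illustration, not the engine's actual API.

```python
class DuplicateKey(Exception):
    pass


class UniqueKey:
    """Hypothetical UK: maps a full business key tuple to one SysId."""

    def __init__(self, name, field_names):
        self.name = name
        self.field_names = tuple(field_names)
        self._by_key = {}

    def _key_of(self, fields):
        return tuple(fields[f] for f in self.field_names)

    def add(self, sys_id, fields):
        key = self._key_of(fields)
        if key in self._by_key:
            raise DuplicateKey(f"{self.name}: {key!r} already exists")
        self._by_key[key] = sys_id

    def get(self, *key):
        # Exact lookup only: the caller must supply every key field.
        if len(key) != len(self.field_names):
            raise ValueError("UK lookup needs the full key")
        return self._by_key.get(key)


uk = UniqueKey("by_client_order_id", ["account_id", "client_order_id"])
uk.add(1, {"account_id": "A1", "client_order_id": "c-1", "qty": 5})
print(uk.get("A1", "c-1"))
```

A hash map is enough here because a UK never has to answer an ordered question. Rejecting a partial key in get keeps prefix-style access from creeping into the UK.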

Canonical index is for deterministic ordered access. It answers questions like:

list orders for one account;
list open orders for one account;
scan one side of an order book by price and sequence;
find the first N records under a business prefix;
run a bounded prefix range scan.

The prefix part matters. A canonical index over:

(account_id, status, created_seq)

can support access patterns such as:

list(account_id)
list(account_id, status)
range(account_id, status, created_seq_from..created_seq_to)

That is different from a projection index for UI queries. A canonical index participates in transaction decisions, so it must be deterministic, bounded, overlay-visible during execution, and rebuildable from canonical records during snapshot load or replay.
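The prefix access patterns above can be sketched in Python with a sorted list and the standard bisect module. CanonicalIndex, list_prefix, and range_scan are hypothetical names, and inclusive range bounds are an assumption; a real engine would also need the overlay and rebuild behavior described above, which this sketch leaves out.

```python
import bisect


class CanonicalIndex:
    """Hypothetical ordered index over (account_id, status, created_seq)."""

    def __init__(self):
        self._entries = []  # kept sorted: (key_tuple, sys_id)

    def add(self, key, sys_id):
        bisect.insort(self._entries, (key, sys_id))

    def list_prefix(self, *prefix, limit=None):
        # Every entry whose key starts with `prefix`, in key order.
        start = bisect.bisect_left(self._entries, (prefix,))
        out = []
        for key, sys_id in self._entries[start:]:
            if key[:len(prefix)] != prefix:
                break
            out.append(sys_id)
            if limit is not None and len(out) >= limit:
                break
        return out

    def range_scan(self, account_id, status, seq_from, seq_to):
        # Inclusive bounds on the last key segment.
        lo_key = ((account_id, status, seq_from),)
        hi_key = ((account_id, status, seq_to), float("inf"))
        lo = bisect.bisect_left(self._entries, lo_key)
        hi = bisect.bisect_right(self._entries, hi_key)
        return [sys_id for _, sys_id in self._entries[lo:hi]]


idx = CanonicalIndex()
idx.add(("A1", "open", 3), 30)
idx.add(("A1", "open", 1), 10)
idx.add(("A1", "filled", 2), 20)
idx.add(("A2", "open", 4), 40)

print(idx.list_prefix("A1", "open"))
print(idx.range_scan("A1", "open", 2, 3))
```

Results come back in key order regardless of insertion order, which is what makes the access path deterministic. The limit argument is one simple way to keep a scan bounded.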

What the exchange case changed

For the exchange test case, an order record may need both UK and indexes.

For example:

UK:
  by_order_id(order_id)
  by_client_order_id(account_id, client_order_id)

index:
  by_book_side_price_seq(instrument, side, normalized_price, priority_seq)

The UKs protect identity and uniqueness. The index expresses traversal order.

The exact encoding is an implementation detail. For example, bid and ask sides may need different price ordering, so the engine can normalize price into an order-preserving key segment. What matters is that the business ordering is explicit in schema metadata instead of hidden in an ad hoc in-memory structure.

This made the model less minimal, but much more useful.
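One possible shape for the price normalization mentioned above, sketched in Python. The book_key function is hypothetical, prices are assumed to be integer ticks, and negating bid prices is just one simple order-preserving encoding; the engine's actual encoding may differ.

```python
def book_key(instrument, side, price_ticks, priority_seq):
    """Hypothetical index key for by_book_side_price_seq.

    Bids are negated so an ascending scan visits the highest bid first;
    asks are left as-is so the lowest ask comes first.
    """
    if side == "bid":
        normalized_price = -price_ticks
    elif side == "ask":
        normalized_price = price_ticks
    else:
        raise ValueError(f"unknown side: {side!r}")
    return (instrument, side, normalized_price, priority_seq)


keys = sorted([
    book_key("XYZ", "bid", 100, 7),
    book_key("XYZ", "bid", 101, 9),
    book_key("XYZ", "bid", 101, 8),
    book_key("XYZ", "ask", 103, 5),
    book_key("XYZ", "ask", 102, 6),
])

for key in keys:
    print(key)
```

With this encoding a single ascending scan over the (instrument, side) prefix yields price-time priority on either side of the book: best price first, then lowest priority_seq within a price.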

Closing thought

The exchange test case was valuable because it made vague design questions concrete.

It forced the model to answer:

what is engine identity?
what is business uniqueness?
what ordering must survive recovery?
which lookup patterns belong in schema metadata?
which structures are safe to derive, and which are actually canonical state?

The lesson was not that every system is an order book.

The lesson was that a hard trading use case can reveal whether the state model is expressive enough for real business engines.