Machine-learning high-frequency trading infrastructure

A trading platform with a closed research-to-live loop.

One engine for backtest and live, proven identical by a parity gate, not a promise.


(v0.11.0)

Stack

Built with a serious stack.

What the platform runs on, and the ML layer being built into it.

Languages

C++
Python

Comms

gRPC
ZeroMQ
Redpanda

Data

ClickHouse
PostgreSQL
NumPy
pandas

Compute

Ray
Spark
Dask

TensorFlow
scikit-learn
Hugging Face

MLOps

MLflow
DVC
ONNX

Infrastructure

Docker
Google Cloud
Caddy
Envoy

Interfaces & viz

React
Next.js
Svelte
Bun
Leaflet
Plotly

On-prem hardware (planned)

GPU model training

NVIDIA GPUs with CUDA and cuDNN for training and low-latency inference, the compute under the ML layer.

NVIDIA · CUDA · cuDNN

Distributed compute

Ray across a multi-node cluster, orchestrated on Kubernetes and wired with high-bandwidth NVLink / InfiniBand.

Ray

Kubernetes · NVLink

Data & model storage

NVMe-backed ClickHouse for time-series, plus an object store for datasets, features, and model artifacts.

NVMe

ClickHouse

MinIO

ML operations

Experiment tracking, data + model versioning, and a registry, reproducible from raw data to a served model.

MLflow

DVC

ONNX

17 Services
91 Strategies
21 Indicators
44 Integration Suites
5 Position Sizers
4 Cost Models

Research at scale

ML that fans out across a cluster.

A warehouse of market history, fetched over gRPC and bridged into Python, then fanned across a Ray cluster on GCP. Studies include hidden-Markov regime fits with Gaussian emissions, parameter sweeps, and perturbation studies: Monte-Carlo, ablation, and walk-forward. The same study runs on a laptop or on hundreds of cores.

Huge dataset, ELT'd

Collectors extract, load, and transform market, macro, and alt-data into one store.

Columnar warehouse

A local ClickHouse store, microsecond-stamped, multi-asset, pre-aggregated, built for billions of rows.

gRPC fetch

Services pull exactly the slices they need, straight from the warehouse over a binary wire.

ZenoBridge

Carries data and compute across the C++ ↔ Python boundary, capability-agnostic.

Ray · GCP

One study across the whole cluster, in parallel. Pin it to your laptop or a GCP Ray cluster with no code change. A scripted deploy stands the cluster up, and studies are listed and cancelled over the same gRPC API.

Fanned across the cluster: Gaussian HMM fits · Parameter sweeps · Walk-forward · Monte-Carlo · Ablation
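A parameter sweep that runs unchanged on one machine or many can be sketched by passing the executor in as an argument. This is an illustration only: a standard-library thread pool stands in for the Ray cluster, and `evaluate`, `sweep`, and the toy objective are hypothetical names, not the platform's API.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from itertools import product
from typing import Dict, List, Tuple

Params = Tuple[int, int]


def evaluate(params: Params) -> float:
    """Toy objective standing in for one backtest run (hypothetical)."""
    fast, slow = params
    return -abs(slow - 3 * fast)


def sweep(executor: Executor, grid: List[Params]) -> Dict[Params, float]:
    """Fan one evaluation per grid point across whatever executor is supplied."""
    return dict(zip(grid, executor.map(evaluate, grid)))


# Every (fast, slow) pair with fast < slow.
grid = [(f, s) for f, s in product([5, 10], [20, 30, 40]) if f < s]

# The study code does not change when the executor does.
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = sweep(pool, grid)

best = max(scores, key=scores.get)
```

Because `sweep` only depends on the `Executor` interface, swapping the pool for a cluster-backed executor would leave the study itself untouched, which is the property the card describes.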

Methodology

Determinism is tested, not asserted.

Seven guarantees the platform enforces, each a gate, contract, or test, not a slogan.

Backtest can't drift from live

A parity gate forces research to reproduce the live run bit-for-bit, or the build fails.

parity golden fixture
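A parity gate of this kind can be sketched as a digest comparison against a golden fixture. The sketch assumes each run emits an ordered list of fills; `Fill`, `run_digest`, and `parity_gate` are hypothetical names and the record layout is an assumption, not the platform's schema.

```python
import hashlib
from decimal import Decimal
from typing import Iterable, NamedTuple


class Fill(NamedTuple):
    """Hypothetical fill record; field names are illustrative only."""
    ts_us: int        # producer-owned timestamp, microseconds
    symbol: str
    qty: int
    price: Decimal    # fixed-precision, never float


def run_digest(fills: Iterable[Fill]) -> str:
    """Hash a run's fills in order, so any difference changes the digest."""
    h = hashlib.sha256()
    for f in fills:
        h.update(f"{f.ts_us}|{f.symbol}|{f.qty}|{f.price}\n".encode("utf-8"))
    return h.hexdigest()


def parity_gate(research: Iterable[Fill], golden_digest: str) -> None:
    """Raise (and so fail the build) unless research reproduces the golden run."""
    got = run_digest(research)
    if got != golden_digest:
        raise AssertionError(f"parity broken: {got} != {golden_digest}")


golden = [Fill(1, "ES", 2, Decimal("5000.25")), Fill(2, "ES", -2, Decimal("5001.00"))]
golden_digest = run_digest(golden)

parity_gate(list(golden), golden_digest)  # an identical run passes silently
```

Hashing the serialized fills rather than comparing summary statistics means a one-tick difference in a single price is enough to fail the gate.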

No job is lost on a crash

A hard kill, then restart, proves jobs re-drain and terminals persist atomically.

v0.9.0 gate · green 3/3
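One common way to make a terminal state persist atomically is to write a temporary file and rename it over the target. The sketch below assumes that approach and a JSON file per job; the platform's actual mechanism is not described here, and `persist_terminal` is a hypothetical name.

```python
import json
import os
import tempfile


def persist_terminal(path: str, state: dict) -> None:
    """Write state so a reader sees the old file or the new one, never a torn mix."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())   # push the bytes to disk before the rename
        os.replace(tmp, path)      # rename over the target (atomic on POSIX)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)         # leave no half-written temp file behind
        raise


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "job-42.json")
    persist_terminal(target, {"job": 42, "status": "done"})
    persist_terminal(target, {"job": 42, "status": "done", "attempt": 2})
    with open(target, encoding="utf-8") as f:
        restored = json.load(f)
```

A process killed before `os.replace` leaves the previous file intact, so a restart can re-drain the job instead of reading a partial record.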

No look-ahead bias

Walk-forward folds are tested, and reading past the current index is structurally forbidden.

rolling out-of-sample folds
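Rolling out-of-sample folds and a structural ban on reading ahead can be sketched in a few lines. The names `walk_forward` and `PastOnly` are hypothetical, and the fold sizes are arbitrary illustration values.

```python
from typing import Iterator, Sequence, Tuple


def walk_forward(n: int, train: int, test: int) -> Iterator[Tuple[range, range]]:
    """Yield rolling (train, test) index ranges; test always follows train."""
    start = 0
    while start + train + test <= n:
        yield range(start, start + train), range(start + train, start + train + test)
        start += test  # roll forward by one test window


class PastOnly:
    """Read-only view that refuses any index beyond the current bar."""

    def __init__(self, data: Sequence[float], now: int) -> None:
        self._data = data
        self._now = now

    def __getitem__(self, i: int) -> float:
        if not 0 <= i <= self._now:
            raise IndexError(f"index {i} is outside [0, {self._now}]")
        return self._data[i]


folds = list(walk_forward(n=10, train=4, test=2))
view = PastOnly([1.0, 2.0, 3.0], now=1)
```

Handing a strategy the `PastOnly` view instead of the raw series turns a look-ahead bug into an immediate `IndexError` rather than a flattering backtest.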

No false-green, tests or data

Suites probe their dependencies and skip honestly instead of faking a pass, and a backtest or stream whose query faults fails loudly rather than returning an empty success a client would read as “no data.”

44 suites · fail-loud streams
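The difference between "no rows matched" and "the query failed" can be made explicit in the stream itself. A minimal sketch, assuming a query callable that returns a list of rows; `QueryFault` and `stream_rows` are hypothetical names.

```python
from typing import Callable, Iterator, List


class QueryFault(RuntimeError):
    """Raised when the underlying query failed, as opposed to matching no rows."""


def stream_rows(run_query: Callable[[], List[dict]]) -> Iterator[dict]:
    """Yield rows; surface a query fault as an error, never as an empty stream."""
    try:
        rows = run_query()
    except Exception as exc:
        raise QueryFault(f"query failed: {exc}") from exc
    yield from rows


def empty_query() -> List[dict]:
    return []  # a genuine "no data" answer


def broken_query() -> List[dict]:
    raise ConnectionError("warehouse unreachable")


no_data = list(stream_rows(empty_query))  # legitimately empty
```

Consuming `stream_rows(broken_query)` raises `QueryFault`, so a client cannot mistake an outage for an empty result set.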

Claims stay falsifiable

Failure conditions are pre-committed, with a predicate on every causal-graph edge.

MRIE · 7-node / 10-edge graph

Bad numbers get withdrawn

A reading traced to era-contamination was pulled. Failures are kept as findings.

MRIE · withdrawn forecast

Disconnect and bad input can't take it down

Every long-lived stream reaps clients that drop, and malformed input is rejected before it can fault a worker thread. Two robustness audits swept both behaviours across the fleet.

v0.11.0 · two robustness audits

Infrastructure

One wire contract, research to live.

Seventeen services, one wire contract, the same from backtest to live.

Columnar Tick Store

A columnar tick store at five resolutions. The same surface drives backtest, paper, and live.

Service Mesh

17 services on one low-latency RPC mesh, each on a shared base class.

Execution Gateway

A single, clean broker-agnostic boundary. Paper trading by default.

Observable Wire

Real p50 / p95 / p99 per RPC, with correlation IDs that follow a call across services.
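Per-RPC tail latency can be computed from raw samples with a nearest-rank percentile. The sketch below is illustrative: the RPC name `GetBars`, the `req-N` correlation IDs, and the helper names are all assumptions.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

# RPC name -> list of (correlation_id, latency in microseconds)
samples: DefaultDict[str, List[Tuple[str, int]]] = defaultdict(list)


def record(rpc: str, correlation_id: str, micros: int) -> None:
    """Store one latency sample, keyed by RPC and tagged with its correlation ID."""
    samples[rpc].append((correlation_id, micros))


def percentile(values: List[int], p: int) -> int:
    """Nearest-rank percentile for an integer p between 1 and 100."""
    if not values:
        raise ValueError("no samples recorded")
    ordered = sorted(values)
    rank = max(1, -(-p * len(ordered) // 100))  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def summary(rpc: str) -> Dict[str, int]:
    values = [micros for _, micros in samples[rpc]]
    return {f"p{p}": percentile(values, p) for p in (50, 95, 99)}


for i in range(1, 101):
    record("GetBars", f"req-{i}", i)  # hypothetical RPC name and IDs

stats = summary("GetBars")
```

Keeping the correlation ID next to each sample lets an operator jump from a slow p99 bucket to the specific call that produced it.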

Typed Instrument Model

Eight security types, fixed-precision decimal money, closed enums at the boundary.

Broker Reconciliation

Positions, cash, and margin reconciled to the cent against the broker of record.

Architecture

Why it's built this way.

A handful of deliberate decisions, each a tradeoff we can point at.

Accepting an RPC never blocks I/O.

Each service runs separate accept and IO completion queues, so taking a new connection can't head-of-line-block work already in flight.

split completion queues

Every service speaks the same gRPC, from one place.

The request lifecycle, thread-pool dispatch, the self-respawning acceptor, and alarm-driven streaming (no thread per subscriber) live in one shared base that every service binary inherits. Fan-out lives in one shared, lock-disciplined registry instead of a copy per service. A concurrency fix is one edit, in one tested place, not seventeen.

one shared base class

Reliable where it matters, lossy where it should be.

A reliable transport carries mutations, control, and reliable streams; a lossy pub/sub transport carries high-rate market data where dropping the stalest frame is the correct behaviour. A documented rule decides which.

two transports, on purpose
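The "drop the stalest frame" behaviour of the lossy path can be sketched with a bounded queue. This is an in-process illustration, not the platform's transport; `LossyChannel` is a hypothetical name.

```python
from collections import deque
from typing import Deque, Optional


class LossyChannel:
    """Bounded queue that evicts the oldest frame when a slow reader falls behind."""

    def __init__(self, capacity: int) -> None:
        self._frames: Deque[bytes] = deque(maxlen=capacity)
        self.dropped = 0

    def publish(self, frame: bytes) -> None:
        if len(self._frames) == self._frames.maxlen:
            self.dropped += 1      # the deque is about to discard the stalest frame
        self._frames.append(frame)

    def poll(self) -> Optional[bytes]:
        """Return the oldest surviving frame, or None if nothing is queued."""
        return self._frames.popleft() if self._frames else None


channel = LossyChannel(capacity=2)
for frame in (b"tick-1", b"tick-2", b"tick-3"):
    channel.publish(frame)

oldest_surviving = channel.poll()
```

For market data the newest quote supersedes the older ones, so losing `tick-1` is the intended outcome. An order or a cancel would take the reliable path instead.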

The store never re-stamps.

Timestamps are owned by the producer and preserved end-to-end. Persistence records what happened when it happened, not when it was written.

producer-owned time

Never float.

Prices, P&L, and cash are fixed-precision decimal throughout, so rounding error can't quietly accumulate into the books.

decimal money
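The accumulation problem is easy to reproduce. The sketch below contrasts binary floats with Python's `decimal` module; the cent quantum and the half-even rounding mode are assumptions for illustration, not the platform's documented settings.

```python
from decimal import Decimal, ROUND_HALF_EVEN

CENT = Decimal("0.01")  # assumed quantum for this example


def notional(price: Decimal, qty: int) -> Decimal:
    """Price times quantity, rounded once, explicitly, to the cent."""
    return (price * qty).quantize(CENT, rounding=ROUND_HALF_EVEN)


# Ten dimes: the float sum drifts, the decimal sum does not.
float_total = sum([0.1] * 10)
decimal_total = sum([Decimal("0.10")] * 10, Decimal("0"))

position_value = notional(Decimal("101.255"), 3)
```

Here `float_total` is not equal to `1.0`, while `decimal_total` is exactly `Decimal("1.00")`. Rounding happens at one named point with a stated mode instead of silently on every operation.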

The right database for the shape of the data.

High-rate time-series goes through a single columnar gateway that buffers and writes in batches off the hot path, and drains on shutdown so nothing in flight is lost. Relational, constraint-bearing records go to a separate transactional store with a prepared-statement pool and forward-only migrations. A documented rule decides which.

two stores, on purpose
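The buffer-then-batch pattern with a drain on shutdown can be sketched as follows. It is deliberately single-threaded: a real gateway that writes off the hot path would flush from a background worker. `BatchWriter` and the list-backed sink are hypothetical.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class BatchWriter(Generic[T]):
    """Buffer rows and hand them to the sink in batches; drain on close."""

    def __init__(self, sink: Callable[[List[T]], None], batch_size: int) -> None:
        self._sink = sink
        self._batch_size = batch_size
        self._buffer: List[T] = []

    def add(self, row: T) -> None:
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._sink(batch)

    def close(self) -> None:
        self.flush()  # drain whatever is still in flight

    def __enter__(self) -> "BatchWriter[T]":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.close()


written: List[List[int]] = []
with BatchWriter(written.append, batch_size=2) as writer:
    for tick in range(5):
        writer.add(tick)
```

The fifth row never fills a batch on its own, and it is the `close` on exit that writes it, which is the "drains on shutdown" guarantee in miniature.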

An internal, in-process messaging benchmark measured ~17 µs p99 over 100k frames: a design datapoint, not a production SLO. No live latency numbers are committed yet.

Distributed by design

One mesh per exchange. One replicated core.

Local where speed matters, central where truth matters, and no single box anything depends on.

Designed to survive the loss of any node or region: quorum holds.

By design: Per-exchange meshes · No single point of failure · Sharded + replicated data · Delegated risk budget · Fail-static, fail-closed

The closed loop

Prototype in research. Run in production.

One engine crosses from notebook to live book, proven identical by a parity gate.

Closed Loop

Research reads the live store, fits, and writes back: one substrate, not two that drift.

Parity Gate

Research must reproduce the live golden run bit-for-bit, or the build fails.

Walk-Forward Validation

Rolling out-of-sample folds, first-class and tested. Lookahead-freedom is a typed contract.

Research Bridge

Regime fits, sweeps, and scenarios fanned across a cluster, each study durable and idempotent.
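Idempotent study submission can be sketched by deriving a key from the study's specification, so resubmitting the same study returns the stored result instead of running it again. The registry below is in-memory for brevity, so it shows idempotency only, not durability. `study_key`, `StudyRegistry`, and the spec fields are hypothetical.

```python
import hashlib
import json
from typing import Callable, Dict, List


def study_key(spec: dict) -> str:
    """Stable key: the same spec hashes the same regardless of field order."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class StudyRegistry:
    """Run each distinct study once; repeat submissions return the stored result."""

    def __init__(self) -> None:
        self._results: Dict[str, object] = {}

    def submit(self, spec: dict, run: Callable[[dict], object]) -> object:
        key = study_key(spec)
        if key not in self._results:
            self._results[key] = run(spec)
        return self._results[key]


calls: List[dict] = []


def fake_study(spec: dict) -> str:
    """Stand-in for a real regime fit; records that it was invoked."""
    calls.append(spec)
    return f"fit with {spec['states']} states"


registry = StudyRegistry()
first = registry.submit({"kind": "hmm", "states": 3}, fake_study)
second = registry.submit({"states": 3, "kind": "hmm"}, fake_study)
```

A retried or duplicated submission after a worker restart then costs a lookup rather than a second cluster run.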

One Engine, Research to Live

The strategy you backtested runs in paper and live unchanged. Promotion is config, not a rewrite.

Operator Control

Drain, shut down, tail logs, and watch tail latency: the whole mesh from one console.

Integrations

Plug into your trading stack.

At the center of the loop: ingest, route, observe, persist. One wire contract.

zenoα

Machine-learning high-frequency trading platform

Service mesh · Closed loop

Backtest · Walk-forward · Tear-sheets

Market data

Live feeds
Recorded replay
Tick to bar

Ingest

Brokers & venues

Broker-agnostic gateway
Paper & live
Order lifecycle

Route

Observe

Dashboards & ops

Operator console
Live telemetry
Tail latency

Study

Research & models

Regime studies
Parameter sweeps
Walk-forward

Persist

Persistence

Columnar store
Relational store
Pre-aggregated views

Operate & verify

Run it from one console.

The whole mesh is operated and observed from a single control plane: deploy across regions, stream telemetry and logs, drill into any component, and drain or shut it down live.

Every site on one map, whether live, healthy, or planned, across regions.

Capability map

What is built.

Seventeen services, traced as one pipeline: sources to output, control plane on top.

Control plane

01 Network Hub · Control plane

02 Session Engine · Orchestrator

12 Research Worker · Research bridge

orchestrates the fleet

Sources

03 Broker Gateway · Broker-agnostic

04 Historical Replay · Recorded to live

Stream processing

05 Tick Consolidation · Tick to bar

06 Time-Slice Sync · Cross-symbol batching

07 Indicators · Twenty-one, streaming

08 Securities Registry · Typed instruments

Decision

09 Execution Loop · Ninety-one strategies

10 Scheduling · Sessions & timers

11 Backtest Engine · Event-driven

Order lifecycle

13 Order Lifecycle · OCA + brackets

14 Positions & P&L · Margin-aware

15 Cash & FX Ledger · Multi-currency

Output

16 Performance Stream · Live tear-sheet

17 Persistence Store · Columnar + relational

Etymology

Reductio ad absurdum, applied to alpha.

We treat every strategy and every model as a proposition that has to survive. Parity gates, walk-forward validation, closed catalogs, and a documented adversarial-review trail stand between a hypothesis and live capital, and when a result doesn't hold up, we withdraw it rather than dress it up.

Engraved portrait of Zeno of Elea, the pre-Socratic Greek philosopher

A philosopher, and a benchmark.

Zeno honours Zeno of Elea (c. 490–430 BCE), the pre-Socratic philosopher whose paradoxes are an early model of proof by contradiction (reductio ad absurdum). He used them to attack the assumption of plurality, in defence of Parmenides' view that reality is one. That method, granting a claim and following it until it contradicts itself, sits at the core of how we reason about markets.

Alpha (α) is the Greek letter that finance adopted as the benchmark for outperformance: the excess return that cannot be explained by the market itself.

See whether it holds up.

Read the methodology, trace the topology, and judge the rigor yourself.
