One Shape Across the Eignex Stack
Three months into the Eignex rewrite, the libraries finally share one config shape that doubles as a YAML wire format. A checkpoint on what changed in each repo.
Three months on from the last status update, the Eignex rewrite has actually moved. The original plan was to split one tangled experimentation library into focused pieces, and that’s mostly what’s happened. Quick checkpoint on where things landed.
How the stack converged
The big shift since February is that kumulant and (soon) combo now share a single API shape for declaring computation graphs and config, both backed by skema. The same definition compiles as a typed Kotlin singleton object and serializes to a YAML or JSON document that a service can author by hand or POST over HTTP.
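To make "one definition, two views" concrete, here is a minimal sketch. It is not skema's actual API: Field, IntField, BoolField, TextField, ExperimentConfig, and toYaml are all names invented for illustration, and the YAML rendering is deliberately naive (flat keys, no escaping).

```kotlin
// Hypothetical sketch: none of these names come from skema itself.

// A typed description of one config field.
sealed interface Field<T> {
    val name: String
    val default: T
}

data class IntField(override val name: String, override val default: Int) : Field<Int>
data class BoolField(override val name: String, override val default: Boolean) : Field<Boolean>
data class TextField(override val name: String, override val default: String) : Field<String>

// The schema is a singleton object, so Kotlin callers get typed handles
// such as ExperimentConfig.variants.default (an Int, checked at compile time).
object ExperimentConfig {
    val variants = IntField("variants", 2)
    val sticky = BoolField("sticky", true)
    val metric = TextField("metric", "conversion")

    val fields: List<Field<*>> = listOf(variants, sticky, metric)
}

// The same field list drives the wire format, so there is no second
// definition to keep in sync with the typed one.
fun toYaml(fields: List<Field<*>>): String =
    fields.joinToString("\n") { f ->
        when (f) {
            is TextField -> "${f.name}: \"${f.default}\""
            is IntField -> "${f.name}: ${f.default}"
            is BoolField -> "${f.name}: ${f.default}"
        }
    }
```

Calling toYaml(ExperimentConfig.fields) renders one key per line. The point is only that the document a service would POST is derived from the same object the compiler type-checks.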
In practice, almost nobody actually calls these libraries in-process from Kotlin. The real callers are cloud deployments reading a YAML config, so designing the typed schema and the wire format as one thing rather than two saves a translation layer I would otherwise have had to maintain forever. The previous post, From Stringly to Strongly Typed, walked through how that design landed.
This happened repo by repo. kumulant moved first, since rewriting its closure-based transform graph into AST nodes was the work that originally surfaced the pattern. combo is still pending, though the shape it’ll land on is pretty obvious by now. klause is the deliberate holdout here, since its constraint AST already has a perfectly good wire format and folding it onto skema would just be busywork.
The eventual integration story is that a combo policy declares its parameter space as a skema schema, lays klause constraints over the variables, and feeds observations through a kumulant aggregator that scores live variants. None of the wires themselves are new. What changed is that they all now speak the same variable names and the same serialization format.
What changed in each repo
skema got cut into its own repo at 0.1.0. It’s really the meta-library underneath everything else, a tool for people writing other libraries whose users have to declare typed schemas that round-trip over the wire. Two additions from this cycle are worth mentioning: SchemaDef.diff() for drift detection between two versions of a schema, and the composition operators (+ and namespaced()) for plugin-style assembly of larger schemas from smaller ones. The design rationale itself lives over in From Stringly to Strongly Typed.
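As a rough illustration of what drift detection and composition mean here, the sketch below models a schema as nothing more than a map from field name to type name. Only the operation names (diff, + and namespaced) come from the post. ToySchema, SchemaDrift, the signatures, and the exact semantics (for example, rejecting duplicate names on +) are assumptions, and the real SchemaDef is certainly richer.

```kotlin
// Toy model: a schema is just field name -> type name.
// This is an assumption for illustration, not skema's SchemaDef.
data class SchemaDrift(
    val added: Set<String>,
    val removed: Set<String>,
    val retyped: Set<String>,
)

data class ToySchema(val fields: Map<String, String>) {

    // Merge two schemas, refusing to silently overwrite clashing names.
    operator fun plus(other: ToySchema): ToySchema {
        val clash = fields.keys intersect other.fields.keys
        require(clash.isEmpty()) { "duplicate fields: $clash" }
        return ToySchema(fields + other.fields)
    }

    // Prefix every field so independently written plugins cannot collide.
    fun namespaced(prefix: String): ToySchema =
        ToySchema(fields.mapKeys { (name, _) -> "$prefix.$name" })

    // Report what changed going from this schema to a newer one.
    fun diff(newer: ToySchema): SchemaDrift = SchemaDrift(
        added = newer.fields.keys - fields.keys,
        removed = fields.keys - newer.fields.keys,
        retyped = fields.keys
            .filter { it in newer.fields && newer.fields[it] != fields[it] }
            .toSet(),
    )
}

// Two versions of the same schema, and a plugin-style assembly.
val v1 = ToySchema(mapOf("rate" to "Double", "arms" to "Int"))
val v2 = ToySchema(mapOf("rate" to "Float", "seed" to "Long"))
val drift = v1.diff(v2)
val assembled = v1.namespaced("bandit") + ToySchema(mapOf("port" to "Int")).namespaced("http")
```

In this toy version, drift reports seed as added, arms as removed, and rate as retyped, while assembled ends up with the keys bandit.rate, bandit.arms, and http.port.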
klause was pulled out of combo at the start of the rewrite. It’s a constraint solver over mixed Boolean and integer variables, and inside combo its job is to carry the constraints the bandit has to respect when picking variants, so invalid combinations never reach the optimiser in the first place. The main differences from what combo originally shipped are that CSP-style integer domains now sit alongside Booleans in a single problem, a Z3 backend handles direct SMT, and a LogicNG adapter bit-blasts the integer side to CNF and hands off to a real SAT solver. The default LocalSearchSolver runs simulated annealing on the local-search-friendly subset, and a brute-force backend cross-checks all three on small instances so I can be sure the heuristic and exact solvers actually agree.
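The brute-force cross-check is simple enough to sketch. The version below is a toy model rather than klause's API: Variable, Assignment, Constraint, bruteForce, and agrees are invented names, Booleans are encoded as the integer domain {0, 1}, and constraints are plain predicates instead of a constraint AST.

```kotlin
// Hypothetical toy model, not klause's API: every variable has a finite
// integer domain, and a Boolean is just the domain {0, 1}.
data class Variable(val name: String, val domain: List<Int>)

typealias Assignment = Map<String, Int>
typealias Constraint = (Assignment) -> Boolean

// Enumerate the full Cartesian product of the domains and keep every
// assignment that satisfies all constraints. Exponential in the number
// of variables, so this is only usable on small instances.
fun bruteForce(vars: List<Variable>, constraints: List<Constraint>): List<Assignment> {
    var partial: List<Assignment> = listOf(emptyMap())
    for (v in vars) {
        partial = partial.flatMap { a -> v.domain.map { value -> a + (v.name to value) } }
    }
    return partial.filter { a -> constraints.all { it(a) } }
}

// An exact solver's answer agrees with the oracle if a returned model is
// one of the true solutions, or if it reports "no solution" (null) and
// there really is none.
fun agrees(answer: Assignment?, oracle: List<Assignment>): Boolean =
    if (answer == null) oracle.isEmpty() else answer in oracle

val vars = listOf(
    Variable("darkMode", listOf(0, 1)),
    Variable("columns", listOf(1, 2, 3)),
)
val constraints: List<Constraint> = listOf(
    // darkMode implies at most two columns
    { a -> a.getValue("darkMode") == 0 || a.getValue("columns") <= 2 },
)
val solutions = bruteForce(vars, constraints)
```

One caveat on this sketch: for a heuristic such as local search, a null answer only means "nothing found", so the null branch of agrees is a fair test for the exact backends only.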
kumulant is doing two jobs inside the rewrite. It backs combo’s probabilistic gradient boosting machine with online accumulators, and it also provides the monitoring layer for the cloud-deployed combo service. The accumulators themselves (means, sums, decaying windows, plus probabilistic sketches like TDigest, ReservoirHistogram, and SpaceSaving) compose into a typed schema and update one value at a time. The bigger shift this cycle was actually the operation graph that sits above them, which now lives as AST-typed VectorExpr, ScalarExpr, and BoolExpr nodes, and that’s what brought kumulant onto skema’s wire-friendly shape in the end. In practice this means transforms and reductions are now plain data, so you can author a config as YAML and ship it through a deployment pipeline without a recompile.
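Here is a small sketch of both halves of that paragraph: an accumulator that updates one value at a time, and an expression graph that is data rather than closures. Everything in it is hypothetical. RunningMean, Scalar, Bool, Const, Ref, Ratio, GreaterThan, and eval are stand-ins invented for this example, not kumulant's VectorExpr, ScalarExpr, or BoolExpr.

```kotlin
// Hypothetical sketch: the class and node names below are invented,
// not kumulant's.

// An accumulator that updates one value at a time in O(1) memory.
class RunningMean {
    var count = 0L
        private set
    var mean = 0.0
        private set

    fun update(x: Double) {
        count += 1
        mean += (x - mean) / count // incremental mean, no stored samples
    }
}

// Expressions are plain data classes, so a config can describe them
// and a service can evaluate them without compiling anything.
sealed interface Scalar
data class Const(val value: Double) : Scalar
data class Ref(val accumulator: String) : Scalar
data class Ratio(val numerator: Scalar, val denominator: Scalar) : Scalar

sealed interface Bool
data class GreaterThan(val left: Scalar, val right: Scalar) : Bool

// env maps accumulator names to their current values.
fun eval(e: Scalar, env: Map<String, Double>): Double = when (e) {
    is Const -> e.value
    is Ref -> env.getValue(e.accumulator)
    is Ratio -> eval(e.numerator, env) / eval(e.denominator, env)
}

fun eval(e: Bool, env: Map<String, Double>): Boolean = when (e) {
    is GreaterThan -> eval(e.left, env) > eval(e.right, env)
}

val control = RunningMean().apply { listOf(1.0, 2.0, 3.0).forEach { update(it) } }
val treatment = RunningMean().apply { listOf(2.0, 4.0, 6.0).forEach { update(it) } }
val env = mapOf("control" to control.mean, "treatment" to treatment.mean)

// "Alert when treatment's mean is more than 1.5x control's", as data.
val lift = Ratio(Ref("treatment"), Ref("control"))
val alert = GreaterThan(lift, Const(1.5))
```

Because lift and alert are ordinary data class instances, they could be written out to and read back from a YAML document. With closures, the same rule would exist only as compiled code.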
The site itself
I’ve also started crossposting, so each post now goes out as a trimmed and retitled copy on dev.to. The rewriting needed to keep search engines pointing at the canonical version here isn’t exactly free, but a personal blog with no incoming links has to be discoverable somehow. I’m also on Bluesky now, and that link sits in the footer next to GitHub.
Beyond syndication, the rest of the site has had a lot of polish along the way too. Images are zoomable on click, the icon set has grown a fair bit, and the front-page splash finally lays out cleanly across the horizontal/vertical and desktop/mobile combinations that had been giving me grief for months. I should probably stop caring this much about trivial details, but here we are.
What’s next
combo is the obvious next thing to tackle. The plan is to reattach the learning side (GLM, random forest, and probabilistic gradient boosting) on top of a skema-typed schema, behind an HTTP boundary that takes a YAML-serializable config. Since kumulant is already on that shape, combo becomes the second user of an existing design instead of having to invent its own.
After combo, a hosted version finally becomes plausible: a managed server with a UI fed by the same configs the libraries already speak. That’s still some distance off though. The more concrete near-term milestones are an end-to-end example pinning a klause schema, a kumulant aggregator, and a combo policy together against a small synthetic A/B problem, and a 0.1.0 release of skema published to Maven Central so the rest of the stack can stop depending on a mavenLocal install.
A personal note
I’m on part-time parental leave with twin babies, which is really the honest reason the cadence is what it is. Writing tends to happen in the gaps between bottles and naps. The slower pace has actually been good for the design work though, even if it does mean things ship later than I’d like.