schema
Typed, named, wire-portable schemas for declaring bags of stats. A StatSchema does three things:
Lets you read results back by typed StatKey instead of by string.
Materialises into a StatGroup that fans every update out to every registered stat.
Round-trips on the wire as a StatSchemaDef so a remote process can stand up the same bag, run it independently, and ship snapshots back to merge.
Declaring a schema
    val telemetry = object : StatSchema(concurrency = Concurrency.Strict) {
        val latencyMean by series(Mean)
        val errorRate by series(Rate)
    }
    val group = StatGroup(telemetry)
    group.update(42.0)
    val results = group.read()
    println(results[telemetry.latencyMean].mean)

The series, paired, vector, and discrete declarators register a StatSpec of the matching modality and return a StatKey carrying the result type. The delegate gives you a typed property; you read results by passing the key back to the group's GroupResult. The group declarator nests a sub-schema, whose entries materialise as their own StatGroup keyed by the outer name.
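The typed-key lookup described above can be illustrated without the library. This is a minimal sketch, not kumulant code: `Key`, `Snapshot`, `MeanOut`, and `RateOut` are hypothetical stand-ins for StatKey, GroupResult, and the result classes. It only shows how a phantom type parameter lets a string-keyed map hand back typed values.

```kotlin
// Hypothetical stand-ins for illustration only; not the kumulant API.
data class MeanOut(val mean: Double)
data class RateOut(val perSecond: Double)

// R is a phantom type: it is never stored, it only types the lookup.
class Key<R : Any>(val name: String)

class Snapshot(private val byName: Map<String, Any>) {
    // The cast is unchecked; it is only sound while every key is paired
    // with a value of its own R when the snapshot is built.
    @Suppress("UNCHECKED_CAST")
    operator fun <R : Any> get(key: Key<R>): R = byName.getValue(key.name) as R
}

val latencyMean = Key<MeanOut>("latencyMean")
val errorRate = Key<RateOut>("errorRate")

val snapshot = Snapshot(
    mapOf(
        latencyMean.name to MeanOut(42.0),
        errorRate.name to RateOut(0.5),
    )
)
```

With these definitions, `snapshot[latencyMean]` has static type `MeanOut`, so `.mean` resolves at compile time and no cast appears at the call site.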
Specs
Every concrete stat has a sibling StatSpec: a data class (or data object for parameter-less stats) carrying only configuration. Specs are @Serializable with @SerialName discriminators matching the Kotlin class names, so polymorphic serialization puts the same type strings on the wire regardless of format (JSON, CBOR, Protobuf).
Construction lives separately from declaration. Calling spec.materialize(concurrency) builds the live stat. The schema layer calls this for you when you build a StatGroup from a schema.
Specs carry no Concurrency. The concurrency mode is a deployment knob passed at materialize time, so the same wire payload can run at Concurrency.None in a single-threaded test and Concurrency.Strict in a contended hot loop.
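A small sketch of the split between configuration and live state may help here. All names (`Mode`, `LiveSum`, `SumRecipe`, and the two implementations) are hypothetical and the real materializer may be organised differently. The point is that the recipe holds no concurrency field and the caller chooses the mode when building the live object.

```kotlin
import java.util.concurrent.atomic.DoubleAdder

// Hypothetical names throughout; this is not the kumulant implementation.
enum class Mode { None, Strict }

interface LiveSum {
    fun update(value: Double)
    fun read(): Double
}

// Plain field: adequate for single-threaded use.
class PlainSum : LiveSum {
    private var total = 0.0
    override fun update(value: Double) { total += value }
    override fun read(): Double = total
}

// DoubleAdder: safe when many threads update concurrently.
class ContendedSum : LiveSum {
    private val total = DoubleAdder()
    override fun update(value: Double) { total.add(value) }
    override fun read(): Double = total.sum()
}

// The recipe is configuration only, so nothing in it mentions Mode.
object SumRecipe {
    fun materialize(mode: Mode): LiveSum = when (mode) {
        Mode.None -> PlainSum()
        Mode.Strict -> ContendedSum()
    }
}
```

The same `SumRecipe` value can therefore back a cheap single-threaded instance in a test and a contention-safe instance in production.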
Composing specs
Every operation in com.eignex.kumulant.operation has a spec form. The lambda-bound operations (filter, transform, transformPair, foldVector, foldPaired) take an AST on the spec side so the projection / predicate travels as data:
    val positiveMean = Mean.filter(X gt 0.0).withWeight(0.5).windowed(1.minutes)

The AST DSL covers comparison (gt, ge, lt, le, eq), boolean combinators (and, or), and arithmetic on X, Y, V(index), and Const(v). Sugar nodes such as Switch, In, Standardize, and MinMax keep per-feature projection trees readable. Anything that cannot be expressed in the AST stays live-only.
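To show why an AST travels as data where a lambda cannot, here is a miniature expression tree. It is a sketch under assumptions: the node names (`XIn`, `Lit`, `Gt`, `Lt`, `Both`) and the single-input `eval` are hypothetical and far smaller than the library's ScalarExpr / BoolExpr hierarchy.

```kotlin
// Hypothetical miniature AST; the types are not kumulant's.
sealed interface Scalar { fun eval(x: Double): Double }
object XIn : Scalar { override fun eval(x: Double) = x }
data class Lit(val v: Double) : Scalar { override fun eval(x: Double) = v }

sealed interface Pred { fun test(x: Double): Boolean }
data class Gt(val l: Scalar, val r: Scalar) : Pred {
    override fun test(x: Double) = l.eval(x) > r.eval(x)
}
data class Lt(val l: Scalar, val r: Scalar) : Pred {
    override fun test(x: Double) = l.eval(x) < r.eval(x)
}
data class Both(val l: Pred, val r: Pred) : Pred {
    override fun test(x: Double) = l.test(x) && r.test(x)
}

// Infix helpers so the tree reads like the predicate it encodes.
infix fun Scalar.gt(rhs: Double): Pred = Gt(this, Lit(rhs))
infix fun Scalar.lt(rhs: Double): Pred = Lt(this, Lit(rhs))
infix fun Pred.and(rhs: Pred): Pred = Both(this, rhs)

// A predicate built as plain data: 0 < x < 10.
val inBand: Pred = (XIn gt 0.0) and (XIn lt 10.0)
```

Because every node is a data class or object, two trees built from the same expression compare equal and could be handed to a serializer, which is not possible for an opaque lambda.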
Running a group
StatGroup is itself a com.eignex.kumulant.core.SeriesStat over GroupResult. It can be nested inside another stat, windowed, or merged with another group's GroupResult. PairedStatGroup, VectorStatGroup, and DiscreteStatGroup variants fan updates only to entries of the matching modality.
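The fan-out behaviour can be sketched in a few lines. The types below (`Tracker`, `SumTracker`, `MaxTracker`, `FanOut`) are hypothetical and ignore weights, timestamps, and typed results. They only illustrate one update reaching every named entry and one read returning all results keyed by name.

```kotlin
// Hypothetical sketch of fan-out; not the kumulant StatGroup.
interface Tracker {
    fun update(value: Double)
    fun read(): Double
}

class SumTracker : Tracker {
    private var total = 0.0
    override fun update(value: Double) { total += value }
    override fun read(): Double = total
}

class MaxTracker : Tracker {
    private var best = Double.NEGATIVE_INFINITY
    override fun update(value: Double) { if (value > best) best = value }
    override fun read(): Double = best
}

class FanOut(private val entries: Map<String, Tracker>) {
    // Every update is forwarded to every registered entry.
    fun update(value: Double) = entries.values.forEach { it.update(value) }

    // One snapshot, keyed by entry name.
    fun read(): Map<String, Double> = entries.mapValues { it.value.read() }
}
```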
Shipping over the wire
    val spec: StatSpec = Mean
    val json = SchemaJson.encodeToString(spec)
    val decoded = SchemaJson.decodeFromString<StatSpec>(json)
    val live = (decoded as SeriesStatSpec<WeightedMeanResult>).materialize(Concurrency.None)
    live.update(1.0)

When rehydrating a whole StatSchemaDef, the materializeSeries, materializePaired, materializeVector, and materializeDiscrete variants enforce that every entry matches the expected modality. The unfiltered materialize returns a list of bound stats and leaves the caller to split.
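The difference between a modality-enforcing materializer and the unfiltered one can be sketched as follows. `Recipe`, `SeriesRecipe`, `PairedRecipe`, and `seriesOnly` are hypothetical, and the exception type is an assumption, since the documentation only says the modality-specific variants throw.

```kotlin
// Hypothetical recipe hierarchy; not the kumulant StatSpec types.
sealed interface Recipe
data class SeriesRecipe(val kind: String) : Recipe
data class PairedRecipe(val kind: String) : Recipe

// Enforcing variant: fails fast if any entry has the wrong modality.
// IllegalArgumentException is an assumption made for this sketch.
fun seriesOnly(entries: Map<String, Recipe>): Map<String, SeriesRecipe> =
    entries.mapValues { (name, recipe) ->
        recipe as? SeriesRecipe
            ?: throw IllegalArgumentException("entry '$name' is not series-modality")
    }
```

A caller that wants the unfiltered behaviour would keep the mixed map and split it itself, for example with `entries.values.filterIsInstance<SeriesRecipe>()`.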
Merging across processes
Each Result is a serializable data class. A common pattern is to have many workers run the same StatGroup, periodically call read on the group, and ship the GroupResult (or its per-entry results) back to a coordinator. The coordinator runs its own StatGroup of the same shape and folds each worker's snapshot in with merge. Because merge takes a Result rather than a live com.eignex.kumulant.core.Stat, the boundary is serialisation-friendly and the worker is free to terminate after each report.
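The worker / coordinator pattern above can be sketched for a single weighted mean. `MeanSnapshot`, `snapshotOf`, and `coordinate` are hypothetical; the library's Result classes and merge signatures may differ. The sketch only shows that folding snapshots needs the results, not the live stats that produced them.

```kotlin
// Hypothetical result type; not a kumulant Result class.
data class MeanSnapshot(val mean: Double, val totalWeight: Double) {
    // Weighted combination of two running means.
    fun merge(other: MeanSnapshot): MeanSnapshot {
        val w = totalWeight + other.totalWeight
        if (w == 0.0) return MeanSnapshot(0.0, 0.0)
        val combined = (mean * totalWeight + other.mean * other.totalWeight) / w
        return MeanSnapshot(combined, w)
    }
}

// What a worker would report after seeing `values`, each with weight 1.
fun snapshotOf(values: List<Double>): MeanSnapshot =
    MeanSnapshot(if (values.isEmpty()) 0.0 else values.average(), values.size.toDouble())

// The coordinator folds every report into one aggregate.
fun coordinate(reports: List<MeanSnapshot>): MeanSnapshot =
    reports.fold(MeanSnapshot(0.0, 0.0)) { acc, report -> acc.merge(report) }
```

Merging the snapshots for `[1.0, 2.0, 3.0]` and `[5.0]` yields the same mean and total weight as a single worker that saw all four values.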
Optimizers
Linear-model stats (com.eignex.kumulant.stat.regression.glm.StochasticRegressionStat, com.eignex.kumulant.stat.regression.SoftmaxRegressionStat) take an OptimizerSpec that materialises into a com.eignex.kumulant.stat.regression.Optimizer. The wire variants are Sgd, Adagrad, Rmsprop, and Adam.
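Two of the per-coordinate update rules quoted in the type summaries (Sgd and Adagrad) can be written out directly. This is a sketch under assumptions: `sgdStep` and `adagradStep` are hypothetical helpers, the learning-rate schedule is replaced by a constant, and accumulating the squared gradient before the weight update is assumed to match the library. Rmsprop and Adam are omitted.

```kotlin
import kotlin.math.sqrt

// Plain SGD: w[i] -= lr * weight * grad[i].
// A constant lr stands in for the lr(step) schedule.
fun sgdStep(w: DoubleArray, grad: DoubleArray, lr: Double, weight: Double = 1.0) {
    for (i in w.indices) w[i] -= lr * weight * grad[i]
}

// Adagrad: w[i] -= lr * grad[i] / sqrt(sumG2[i] + epsilon),
// where sumG2 accumulates squared gradients per coordinate.
fun adagradStep(
    w: DoubleArray,
    grad: DoubleArray,
    sumG2: DoubleArray,
    lr: Double,
    epsilon: Double = 1e-8,
) {
    for (i in w.indices) {
        sumG2[i] += grad[i] * grad[i]
        w[i] -= lr * grad[i] / sqrt(sumG2[i] + epsilon)
    }
}
```

SGD keeps no per-coordinate state, whereas Adagrad needs one accumulator per weight, which is why its effective step size shrinks monotonically on frequently updated coordinates.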
Types
Internal base shared by StatGroup, PairedStatGroup, and VectorStatGroup. Holds the spec list and provides the modality-agnostic read / merge / reset implementations.
Spec for AccuracyStat: weighted classification accuracy over (predictedClass, trueClass).
Adagrad. Per-coordinate adaptive learning rate via accumulated squared gradients: w[i] -= lr * grad[i] / sqrt(sumG2[i] + epsilon).
Adam. Bias-corrected first and second moments per coordinate (Kingma & Ba 2015); the general-purpose default in modern online learning. Per-coordinate update:
Spec for AdwinStat: ADWIN2 adaptive-windowing change detector.
Per-observation decay: each new sample carries weight alpha against the running estimate.
Spec for BayesianRegressionStat: closed-form Gaussian linear regression with isotropic prior.
Spec for BernoulliSumStat: weighted count of nonzero inputs.
Spec for BloomFilterStat: probabilistic set membership.
Wire-serialisable AST for boolean predicates over the same input environment as ScalarExpr. Consumed by every spec-side filter operator and by IfExpr's condition slot.
Spec for BrierScoreStat: mean squared error against y in {0, 1} for probabilistic predictions.
Reads the center field of the feedback primary's snapshot. Requires the primary's result to implement HasCenterScale; raises IllegalStateException when evaluated without a primary, or when the primary's result does not expose a center.
Spec for ConfusionMatrixStat: K-by-K weighted confusion matrix over (predictedClass, trueClass).
Wire spec for a constant scalar.
Spec for CountStat: unweighted observation count.
Spec for CounterRateStat: rate inferred from a monotonically increasing counter.
Spec for CountMinSketchStat: approximate frequency table for unbounded-cardinality streams.
Spec for CovarianceStat: weighted covariance and Pearson correlation between two streams.
Spec for CrossingStat: counts upward and downward crossings of a configured level.
Spec for CusumStat: two-sided cumulative-sum change-point detector.
Spec for DDSketchStat. probabilities is a List on the wire because most formats serialize lists more cleanly than primitive arrays; converted to a DoubleArray at materialize time.
Spec for DecayingMeanStat: time-decayed running mean with HalfLife weighting.
Spec for DecayingRateStat: events-per-second with exponential time decay.
Spec for DecayingSumStat: time-decayed running sum with HalfLife weighting.
Spec for DecayingVarianceStat: time-decayed running variance with HalfLife weighting.
Wire-friendly counterpart of DecayWeighting. The two strategies are split by field type rather than by a discriminated union so each decay-stat spec can statically constrain itself to the right strategy (e.g. DecayingSum only accepts HalfLife).
Spec for DecisionTreeClassifierStat: online VFDT classification tree.
Spec for DecisionTreeRegressionStat: online VFDT regression tree.
Spec for DiagonalRegressionStat: factorised-Gaussian posterior with per-coordinate precision.
StatGroup variant over discrete (Long) inputs.
StatSpec that materializes into a com.eignex.kumulant.core.DiscreteStat with result type R.
Spec for EwmaMeanStat: exponentially-weighted moving average with per-observation Alpha.
Spec for EwmaVarianceStat: exponentially-weighted moving variance with per-observation Alpha.
Spec for ExcursionStat: running peak with the largest peak-to-trough excursion observed.
Spec for FrugalQuantileStat: O(1)-memory single-quantile estimator.
Spec for GaussianNaiveBayesStat: online Gaussian naive Bayes classifier.
Spec for GaussianScorerStat: running mean / variance with |x - mean| / stdDev z-score.
Marker interface for stats whose result is a GroupResult. Implemented by StatGroup and its modality variants. Used in the group declarator below to enforce that nested-group entries actually produce GroupResult rather than some other Result shape.
Aggregated snapshot of a StatGroup: a map from StatKey.name to the per-slot Result. Backs the result side of the schema layer; use the get(StatKey) operators below for type-safe lookup rather than going through results by string key.
Serializable spec for a nested series-modality StatGroup. Holds a recursive map of StatSpec entries keyed by name, and every entry must itself be a SeriesStatSpec. Materialization happens in StatFactory.kt.
Wall-clock half-life decay: weight halves every durationMillis.
Spec for HalfSpaceTreesStat: online ensemble of random half-space trees for multivariate anomaly scoring.
Spec for HdrHistogramStat: high-dynamic-range histogram.
Reads the max field of the feedback primary's snapshot. Requires the primary's result to implement HasMinMax; raises IllegalStateException when evaluated without such a primary.
Spec for HoltStat: double exponential smoothing with optional trend damping.
Spec for HyperLogLogStat: cardinality sketch with controllable precision.
Wire spec for a ternary if (cond) then else otherwise.
Spec for IsotonicCalibratorStat: binned isotonic calibrator over [0, 1].
Spec for LinearCountingStat: cardinality estimator backed by a bitset of bits cells.
Spec for LinearHistogramStat: fixed-width bins over [lowerBound, upperBound).
Spec for LogLossStat: mean negative log-likelihood for binary y with predicted probability x.
Reads the min field of the feedback primary's snapshot. Requires the primary's result to implement HasMinMax; raises IllegalStateException when evaluated without such a primary.
Spec for MadStat: streaming median and median absolute deviation via two t-digests.
Spec for MaeLossStat: mean absolute error |y - yhat|.
Spec for MaxStat: running maximum.
Spec for MeanStat: weighted running mean.
Spec for MinStat: running minimum.
Spec for MinHashStat: Jaccard-similarity signature over numHashes independent hash functions.
Min-max projection mapping X from [primary.min, primary.max] to [targetLow, targetHigh], emitting targetLow while the running range is still degenerate. Reusable AST sugar for the min-max-scaler pattern; requires a HasMinMax primary.
Spec for MomentsStat: mean / variance / skewness / kurtosis (Welford).
Spec for MseLossStat: mean squared error (y - yhat)^2.
Wire-portable optimizer strategy. Sealed root of Sgd / Adagrad / Rmsprop / Adam; consumed by the online linear-model stats (com.eignex.kumulant.stat.regression.glm.StochasticRegressionStat, com.eignex.kumulant.stat.regression.SoftmaxRegressionStat) to pick the per-coordinate update rule.
Spec for PageHinkleyStat: Page-Hinkley change-point detector.
StatGroup variant over paired (x, y) inputs.
StatSpec that materializes into a com.eignex.kumulant.core.PairedStat with result type R.
Spec for PairedSumStat: tracks per-axis sums of (x, y) updates.
Spec for PinballLossStat: quantile (pinball) loss at quantile tau.
Spec for pitHistogram(numBins): PIT-style equiprobable histogram for calibration checks.
Spec for PlattCalibratorStat: one-feature logistic regression fitting sigmoid(a*x + b).
Spec for QuantileFilterStat: DDSketch-backed quantile-threshold anomaly detector.
Spec for RandomForestClassifierStat: ensembled VFDT classification forest.
Spec for RandomForestRegressionStat: ensembled VFDT regression forest.
Spec for RangeStat: running min and max as a pair.
Spec for RateStat: events per second over the observed wall-clock span.
Spec for RecencyStat: time elapsed since the most recent observation.
Spec for RecursiveVarianceStat: sigma^2_t = omega + alpha * value^2 + beta * sigma^2_{t-1}.
StatSpec that materializes into a com.eignex.kumulant.core.RegressionStat with result type R.
Spec for ReliabilityStat: per-bin calibration table (mean predicted vs observed frequency).
Per-bucket reduction used by the ResampleByTimeSeries spec when aligning an input series onto fixed wall-clock buckets. Configured on a spec; the materializer threads it through to the runtime bucket aggregator.
Configuration for ReservoirHistogramStat. Seed has no default - the live constructor's Random.Default.nextLong() is non-deterministic, which would silently produce different goldens on each instantiation if mirrored here.
RMSProp. Per-coordinate adaptive learning rate via an exponential moving average of squared gradients: the same shape as Adagrad but with a sliding window instead of a monotone accumulator.
Spec for RunLengthStat: current and longest consecutive truthy-run lengths.
Wire-serialisable AST for scalar expressions over the per-update input environment. The library uses these wherever a stat needs to apply a caller-supplied projection / weight / threshold expression that has to round-trip on the wire: weightBy, transform, the per-bin scaler projections, the WithFeedback op, and the loss / pinball / quantile configurations.
Reads the scale field of the feedback primary's snapshot. Requires the primary's result to implement HasCenterScale; raises IllegalStateException when evaluated without a primary, or when the primary's result does not expose a scale.
Spec for SeasonalSmoothingStat: triple exponential smoothing (Holt-Winters).
StatSpec that materializes into a com.eignex.kumulant.core.SeriesStat with result type R.
Plain stochastic gradient descent. The default and the cheapest entry; stateless apart from the global step counter feeding the learning-rate schedule. Per-coordinate update: w[i] -= lr(step) * weight * grad[i].
Spec for SoftmaxRegressionStat: multinomial (K-way) logistic regression.
Spec for SojournStat: per-state time, transition counts, and current dwell over a declared alphabet.
Spec for SpaceSavingStat: top-capacity heavy-hitters tracker.
Z-score projection: (X - Center) / Scale, emitting 0 when Scale is still zero. Reusable AST sugar for the standard-scaler pattern; requires a HasCenterScale primary.
Fans each update out to a heterogeneous list of SeriesStats and reports their results keyed by name.
Typed handle to one entry in a StatSchema / GroupResult. Carries the result type R as a phantom type so reading back from a GroupResult produces a typed value rather than an Any:
Declarative, typed schema for a group of stats, layered on top of com.eignex.skema.Schema<StatSpec> so the entries map is wire-serializable.
Pure-data, serializable form of a StatSchema. The wire field is stats (kumulant convention, customised over skema's default entries). Decode an incoming wire payload into this type and rehydrate a live group via materializeSeries (or one of the modality-specific variants).
Pure-data recipe for a com.eignex.kumulant.core.Stat. Each variant is a data class (or data object for parameter-less stats) whose fields are exactly the stat's configuration surface - no live cells, no locks, no com.eignex.kumulant.core.Concurrency.
Spec for StochasticRegressionStat: online GLM with a configurable optimizer.
Spec for SumStat: weighted running sum.
Spec for SummaryStat: mean / variance / min / max in one result, useful as a primary for mixed-scaler feedback projections.
Multi-way branch on a scalar key. Replaces nested IfExpr cascades. The first case whose SwitchCase.value equals on.eval(...) exactly wins; if none match, otherwise is returned.
Spec for TDigestStat: streaming t-digest quantile sketch.
Spec for ThresholdBucketStat: weighted counts per user-defined value bucket.
Spec for TotalWeightsStat: cumulative observation weight.
Spec for UnivariateRegressionStat: scalar OLS / Lasso / Ridge depending on penalty.
Reads v[index]; an out-of-bounds index throws at eval time.
Spec for VarianceStat: weighted running variance (Welford).
Wire-serialisable AST for vector-valued expressions over the same input environment as ScalarExpr / BoolExpr. Used by the spec-side transformVector(VectorExpr) operator when the output is a fresh vector rather than a per-element transform.
StatGroup variant over vector inputs.
StatSpec that materializes into a com.eignex.kumulant.core.VectorStat with result type R.
In element-wise feedback contexts (vector / regression / paired), returns the coordinate index of the currently evaluating element as a Double. Outside such contexts (primary is not an IndexedResult), raises IllegalStateException.
Refers to the primary scalar input x.
Refers to the secondary scalar input y (paired stats only).
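The Standardize and MinMax sugar nodes described in the list above reduce to two small formulas. The functions below are a hypothetical sketch, not the library's AST nodes: center, scale, min, and max are passed in directly instead of being read from a feedback primary, and treating any non-positive span as "degenerate" is an assumption.

```kotlin
// Z-score projection: (x - center) / scale, emitting 0 while scale is zero.
fun standardize(x: Double, center: Double, scale: Double): Double =
    if (scale == 0.0) 0.0 else (x - center) / scale

// Min-max projection from [min, max] onto [targetLow, targetHigh].
// Assumption: the range counts as degenerate when max - min is not
// strictly positive (this also covers an empty range and NaN).
fun minMax(
    x: Double,
    min: Double,
    max: Double,
    targetLow: Double = 0.0,
    targetHigh: Double = 1.0,
): Double {
    val span = max - min
    if (!(span > 0.0)) return targetLow
    return targetLow + (x - min) / span * (targetHigh - targetLow)
}
```

Both guards exist for the warm-up phase: before the primary has seen enough data, the projection returns a fixed value instead of dividing by zero.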
Functions
Adapt a series spec into a discrete spec - the discrete sees value.toLong() per update.
Adapt a discrete spec into a series spec - the series sees value.toDouble() per update.
Adapt a series spec into a vector spec by consuming the index-th coordinate of each vector.
Adapt a series spec into a paired spec by consuming the x component of each pair.
Adapt a series spec into a paired spec by consuming the y component of each pair.
Wrap this series spec to expose a [lower, upper] band of width k * scale around center.
Wrap this series spec to forward the per-second time derivative of the value stream.
Wrap this series spec to forward the k-th difference, value[t] - value[t - k].
Build Div of two expressions.
Divide this expression by a literal rhs.
Divide a literal receiver by an expression.
Exact equality (no tolerance).
Exact equality against a literal (no tolerance).
Wrap this discrete spec so updates are forwarded only when pred evaluates true.
Wrap this paired spec so updates are forwarded only when pred evaluates true on (x, y).
Wrap this regression spec so updates are forwarded only when pred evaluates true.
Wrap this series spec so updates are forwarded only when pred evaluates true.
Wrap this vector spec so updates are forwarded only when pred evaluates true on the full vector.
Lift this series spec to a paired spec, reducing every (x, y) to a scalar via expr.
Lift this series spec into the regression modality. project reduces each (x = V, y = Y) update to a scalar that the inner series stat absorbs. Use Y for the marginal-y view.
Lift this series spec to a vector spec, reducing every vector to a scalar via expr.
Greater-or-equal comparison.
Greater-or-equal against a literal.
Build a nested-group BoundStat. build is invoked with keys (the sub-schema's key handle) and must return a GroupedStat, typically a StatGroup constructed against that sub-schema. The resulting BoundStat uses a GroupStatKey so dotted lookup result[outerKey][innerKey] compiles.
Strictly-greater-than comparison.
Strictly-greater-than against a literal.
Wrap this series spec to debounce its input into a 0.0/1.0 stream via two-threshold hysteresis.
Wrap this series spec to forward the value seen k updates ago.
Less-or-equal comparison.
Less-or-equal against a literal.
Strictly-less-than comparison.
Strictly-less-than against a literal.
Construct a live DiscreteStat from a DiscreteStatSpec. See SeriesStatSpec.materialize.
Construct a live PairedStat from a PairedStatSpec. See SeriesStatSpec.materialize.
Construct a live RegressionStat from a RegressionStatSpec. See SeriesStatSpec.materialize.
Construct a live SeriesStat from a SeriesStatSpec. The implementation uses one when expression per modality and one cast at the boundary; sealed-hierarchy exhaustiveness keeps the cast safe.
Materialize every entry, regardless of modality. Caller filters by stat type.
Construct a live stat from any StatSpec, dispatching on its modality. Useful for code paths (like StatSchemaDef.materialize) that iterate over an erased Map<String, StatSpec> and don't statically know the modality.
Construct a live VectorStat from a VectorStatSpec. See SeriesStatSpec.materialize.
Materialize discrete-modality entries only; throws if any entry isn't discrete.
Materialize paired-modality entries only; throws if any entry isn't paired.
Materialize series-modality entries only; throws if any entry isn't series.
Materialize vector-modality entries only; throws if any entry isn't vector.
Element-wise min-max scale a regression spec's feature vector.
Element-wise min-max scale a vector spec against a hidden per-coordinate Range primary.
Min-max scale both axes of a paired spec against per-axis Range primaries.
Build Sub of two expressions.
Subtract a literal rhs from this expression.
Subtract an expression from a literal receiver.
Build Add of two expressions.
Add a literal rhs to this expression.
Add an expression to a literal receiver.
Wrap this series spec to forward one per-bucket summary using aggregator.
Element-wise standardise a regression spec's feature vector.
Element-wise standardise a vector spec against a hidden per-coordinate Variance primary.
Build a BoundStat from an existing StatKey and a live stat. Used when the key was created elsewhere (e.g. via a StatSchema declarator) and you want to pair it with a specific live instance; common in tests and in materializer code paths.
Build a BoundStat from a string name and a live stat. Convenience for ad-hoc groups: BoundStat(StatKey(name), value) with type inference.
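Wrap this discrete spec so it only sees one update in each group of every updates, where every is the interval argument.
Wrap this paired spec so it only sees one update in each group of every updates, where every is the interval argument.
Wrap this regression spec so it only sees one update in each group of every updates, where every is the interval argument.
Wrap this series spec so it only sees one update in each group of every updates, where every is the interval argument.
Wrap this vector spec so it only sees one update in each group of every updates, where every is the interval argument.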
Build Mul of two expressions.
Multiply this expression by a literal rhs.
Multiply a literal receiver by an expression.
Wrap this vector spec to apply expr to every element of each incoming vector before update.
Wrap this vector spec so each incoming vector is remapped through expr before update.
Map only the x coordinate; y stays as-is.
Wrap this regression spec so x is remapped by expr before the inner stat sees it.
Map only the y coordinate; x stays as-is.
Wrap this regression spec so y is remapped by expr before the inner stat sees it.
Unary minus: wraps in Neg.
Lift a series spec to a vector spec by replicating it across every coordinate of an N-dim input.
Wrap this discrete spec so every update's weight is multiplied by expr.eval(value.toDouble()).
Wrap this paired spec so every update's weight is multiplied by expr.eval(x, y).
Wrap this regression spec so every update's weight is multiplied by expr.eval(0, y, v).
Wrap this series spec so every update's weight is multiplied by expr.eval(value).
Wrap this vector spec so every update's weight is multiplied by expr.eval(0, 0, vec).
Wrap this discrete spec in a sliding time window of durationMillis split into slices buckets.
Wrap this paired spec in a sliding time window of durationMillis split into slices buckets.
Wrap this series spec in a sliding time window of durationMillis split into slices buckets.
Wrap this vector spec in a sliding time window of durationMillis split into slices buckets.
Wrap this inner series spec with a feedback primary; the projection AST sees the primary snapshot.
Adapt a paired spec into a series spec by pinning x to fixedX.
Adapt a paired spec into a series spec by pinning y to fixedY.
Lift a paired spec into a series spec by self-pairing each input with the value seen k updates ago.
Adapt a paired spec into a series spec by using the update timestamp as x.
Adapt a paired spec into a series spec by using the update timestamp as y.
Wrap this discrete spec so every update applies the per-observation weight multiplier.
Wrap this paired spec so every update applies the per-observation weight multiplier.
Wrap this regression spec so every update uses weight regardless of caller input.
Wrap this series spec so every update applies the per-observation weight multiplier.
Wrap this vector spec so every update applies the per-observation weight multiplier.
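The two-threshold hysteresis wrapper mentioned in the function list can be sketched as a tiny state machine. `Hysteresis` is a hypothetical class, and the inclusive comparisons at both thresholds are assumptions; the library's wrapper may treat the boundaries differently.

```kotlin
// Hypothetical debouncer: emits 1.0 once the input reaches `high`, and
// keeps emitting 1.0 until the input falls back to `low` or below.
class Hysteresis(private val low: Double, private val high: Double) {
    private var on = false

    fun step(value: Double): Double {
        if (on && value <= low) {
            on = false
        } else if (!on && value >= high) {
            on = true
        }
        return if (on) 1.0 else 0.0
    }
}
```

Values between the two thresholds never change the output, which is what suppresses chatter around a single cut-off.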