kumulant

stat.summary

Exact running aggregates over a stream of scalars. Memory is constant in the stream length; updates are O(1); every entry merges cleanly across parallel workers via Chan-style parallel formulas or commuting cell arithmetic.

This is the largest stat family and the one most other packages compose against: change detectors run over a MeanStat / VarianceStat internally, calibration runs over outcome means via MeanStat, regression carries covariance through WeightedVarianceResult-shaped cells, and so on.

The trivial entries

SumStat, MinStat, and MaxStat are single-cell stats: one Double of state, O(1) update, exact under every com.eignex.kumulant.core.Concurrency level via independent commuting cell arithmetic (SumStat) or a CAS loop on a single cell (MinStat / MaxStat). RangeStat tracks a MinStat and a MaxStat together, so it holds two such cells with the same guarantees.
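As an illustration of the CAS-loop idea, here is a minimal sketch of a lock-free running maximum on one cell. The class name CasMax is hypothetical and this is not kumulant's MaxStat; it only assumes a JVM target, where a Double can be stored as raw bits in an AtomicLong.

```kotlin
import java.util.concurrent.atomic.AtomicLong

// Hypothetical single-cell running maximum; an illustration, not kumulant's MaxStat.
class CasMax {
    // The Double is stored as its raw bit pattern so one AtomicLong can hold it.
    private val bits = AtomicLong(Double.NEGATIVE_INFINITY.toRawBits())

    fun update(value: Double) {
        while (true) {
            val seen = bits.get()
            val current = Double.fromBits(seen)
            // Nothing to do unless the new value strictly exceeds the stored maximum
            // (this also ignores NaN, since NaN > x is always false).
            if (!(value > current)) return
            // Publish the new maximum only if no other writer changed the cell first;
            // otherwise loop and re-read.
            if (bits.compareAndSet(seen, value.toRawBits())) return
        }
    }

    fun max(): Double = Double.fromBits(bits.get())
}
```

Because max is commutative and idempotent, retrying after a lost CAS cannot lose or double-count an update, which is why a single cell stays exact without a lock.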

CountStat, TotalWeightsStat, and BernoulliSumStat look similar but answer different questions. CountStat counts updates, ignoring weight. TotalWeightsStat sums weight, ignoring the value. BernoulliSumStat sums weight times value, so for 0/1 values it adds the weight only when the value is 1; it is the natural counter for binary outcomes (click-or-not, pass-or-fail). All three are naively additive and gain a striped-adder JVM path under com.eignex.kumulant.core.Concurrency.HighWrite.
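To make the distinction concrete, the sketch below keeps all three tallies side by side. The class name ThreeCounters and the update(value, weight) shape are assumptions for illustration, not kumulant's API; the success tally follows the Sum w_i*x_i form given in the BernoulliSumStat description, which for 0/1 values adds the weight only on a success.

```kotlin
// Hypothetical stand-in for the three counters; not part of kumulant.
class ThreeCounters {
    var count = 0L            // what a CountStat-style counter would see
        private set
    var totalWeights = 0.0    // what a TotalWeightsStat-style counter would see
        private set
    var successes = 0.0       // what a BernoulliSumStat-style counter would see
        private set

    fun update(value: Double, weight: Double = 1.0) {
        count += 1                     // ignores both value and weight
        totalWeights += weight         // ignores the value
        successes += weight * value    // weight counts only when value is 1 (for 0/1 inputs)
    }
}
```

Feeding (1, w=2), (0, w=3), (1, w=1) leaves count at 3, totalWeights at 6.0, and successes at 3.0.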

The Welford family

MeanStat, VarianceStat, and MomentsStat use Welford's recurrence to keep mean / variance / higher moments numerically stable for streams that span many orders of magnitude. A naive sum-divided-by-count loses precision as the running sum grows; Welford updates the running mean in place and accumulates only mean-centred deviations.
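The recurrence can be sketched as follows. This is a generic weighted Welford update written for illustration, not kumulant's VarianceStat; the class name WelfordSketch and the update(value, weight) signature are assumptions.

```kotlin
// Hypothetical weighted Welford accumulator; an illustration, not kumulant's implementation.
class WelfordSketch {
    var totalWeights = 0.0
        private set
    var mean = 0.0
        private set
    // Weighted sum of squared deviations from the running mean.
    private var m2 = 0.0

    fun update(value: Double, weight: Double = 1.0) {
        if (weight <= 0.0) return
        totalWeights += weight
        val delta = value - mean                  // deviation from the old mean
        mean += delta * (weight / totalWeights)   // move the mean in place
        m2 += weight * delta * (value - mean)     // old deviation times new deviation
    }

    // Population (weight-normalised) variance; 0.0 before any update.
    fun variance(): Double = if (totalWeights > 0.0) m2 / totalWeights else 0.0
}
```

Only deviations from the running mean are ever squared, so shifting every input by a large constant such as 1e9 leaves the variance essentially unchanged, whereas a sum-of-squares formula would lose most of its significant digits.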

The cells are coupled: an updated count needs its matching updated mean to stay consistent. Under com.eignex.kumulant.core.Concurrency.Strict or com.eignex.kumulant.core.Concurrency.HighWrite the update body is locked. Under com.eignex.kumulant.core.Concurrency.Relaxed the cells race without a lock: under contention, readers and writers may see results that drift at the ULP level, but the stat never throws.

| Stat | Result | Tracks |
| --- | --- | --- |
| MeanStat | WeightedMeanResult | running mean, total weight |
| VarianceStat | WeightedVarianceResult | mean + variance |
| MomentsStat | MomentsResult | mean + variance + skewness + kurtosis (third and fourth central moments) |
| SummaryStat | SummaryResult | mean + variance + min + max in one accumulator; useful as a primary for mixed-scaler feedback projections |

WeightedVarianceResult implements both com.eignex.kumulant.core.HasSampleVariance and com.eignex.kumulant.core.HasCenterScale. SummaryResult implements both of those plus com.eignex.kumulant.core.HasMinMax, which is what makes it valuable as a feedback primary: one accumulator covers both the standardize and min-max projections downstream.

Robust dispersion

MadStat tracks the running median and median absolute deviation via two t-digests: one over raw values and one over absolute deviations from the running median. Reach for it as the robust analog of standard deviation for heavy-tailed inputs where the mean and variance overstate central tendency and spread. The result implements com.eignex.kumulant.core.HasCenterScale (with center = median, scale = mad), so the standardize / band projections work on it directly.
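For reference, the quantity MadStat estimates can be computed exactly on a small in-memory sample. The sketch below is a batch computation with hypothetical helper names (median, mad); it does not use t-digests and is not how MadStat works internally.

```kotlin
import kotlin.math.abs

// Exact median of a non-empty list (average of the two middle values for even sizes).
fun median(xs: List<Double>): Double {
    require(xs.isNotEmpty()) { "median of an empty list is undefined" }
    val sorted = xs.sorted()
    val mid = sorted.size / 2
    return if (sorted.size % 2 == 1) sorted[mid] else (sorted[mid - 1] + sorted[mid]) / 2.0
}

// Median absolute deviation: the median of |x - median(xs)|.
fun mad(xs: List<Double>): Double {
    val center = median(xs)
    return median(xs.map { abs(it - center) })
}
```

For the sample 1, 1, 2, 2, 4, 6, 9 the median is 2 and the MAD is 1; a streaming estimate should approach these values as the digests fill.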

Paired

PairedSumStat is the analogue of SumStat for paired streams: it tracks per-axis sums of (x, y) updates. It is not a regression; covariance and correlation live in com.eignex.kumulant.stat.regression.CovarianceStat.

Compose patterns

A few recurring patterns built from this family rather than as dedicated stats:

  • Fraction meeting a threshold: for SLO compliance and error budgets. Compose Mean.transform(IfExpr(X gt threshold, 1.0, 0.0)).windowed(window). Mean over the Bernoulli predicate is exactly the matched fraction.

  • Lag-k autocorrelation: Covariance.withSelfLag(k) self-pairs each input with the value seen k updates ago; the Pearson correlation falls out of the running covariance.

  • Standardised input: feed any series stat through the StandardScalerSeries spec or its modality siblings; the scaler reads center and scale off a VarianceStat / MomentsStat / MadStat / SummaryStat primary on every update.
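The first pattern rests on the fact that the mean of a 0/1 indicator is the fraction of inputs that satisfy the predicate. A standalone sketch of that idea, without the library's transform or windowing machinery, might look like this; the class name FractionAbove is hypothetical.

```kotlin
// Hypothetical helper: running fraction of values above a threshold,
// kept as the running mean of a 0/1 indicator. No windowing is applied.
class FractionAbove(private val threshold: Double) {
    private var n = 0L
    private var mean = 0.0

    fun update(value: Double) {
        val hit = if (value > threshold) 1.0 else 0.0   // the Bernoulli predicate
        n += 1
        mean += (hit - mean) / n                        // incremental mean update
    }

    fun fraction(): Double = mean
}
```

With a threshold of 100 and inputs 50, 150, 200, 80, two of the four values exceed the threshold, so the fraction comes out at 0.5 up to rounding.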

Merge

Every entry merges across parallel workers. The additive stats (SumStat, CountStat, TotalWeightsStat, BernoulliSumStat) sum cell-wise; MinStat, MaxStat, and RangeStat take cell-wise min/max; the Welford family (MeanStat, VarianceStat, MomentsStat, SummaryStat) uses the Chan-style parallel recurrence; PairedSumStat sums each axis. All of these are exact. MadStat is the one approximation: the t-digests carry no round-trippable state, so merge re-pushes the (median, MAD) pair as a single update.
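The Chan-style merge for mean and variance can be sketched as below. The Partial type and both function names are hypothetical, chosen for illustration; the state is assumed to be (total weight, mean, sum of squared deviations), which may differ from how kumulant stores it.

```kotlin
// Hypothetical partial state: total weight, mean, and sum of squared deviations (m2).
data class Partial(val totalWeights: Double, val mean: Double, val m2: Double)

// Builds a partial from an in-memory chunk, standing in for one worker's stream.
fun partialOf(values: List<Double>): Partial {
    if (values.isEmpty()) return Partial(0.0, 0.0, 0.0)
    val mean = values.sum() / values.size
    val m2 = values.sumOf { (it - mean) * (it - mean) }
    return Partial(values.size.toDouble(), mean, m2)
}

// Chan-style pairwise merge: combines two partials without revisiting the data.
fun merge(a: Partial, b: Partial): Partial {
    val w = a.totalWeights + b.totalWeights
    if (w == 0.0) return Partial(0.0, 0.0, 0.0)
    val delta = b.mean - a.mean
    val mean = a.mean + delta * (b.totalWeights / w)
    // The cross term accounts for the gap between the two partial means.
    val m2 = a.m2 + b.m2 + delta * delta * a.totalWeights * b.totalWeights / w
    return Partial(w, mean, m2)
}
```

Merging the partials for 2, 4, 4, 4 and 5, 5, 7, 9 gives total weight 8, mean 5, and m2 of 32, the same as a single pass over all eight values.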

Concurrency

| Stat | Memory | Update | Concurrency |
| --- | --- | --- | --- |
| SumStat, CountStat, TotalWeightsStat, BernoulliSumStat | O(1) | O(1) | striped-additive under HighWrite, atomic otherwise |
| MinStat, MaxStat, RangeStat | O(1) | O(1) | CAS loop on a single cell |
| MeanStat | O(1) | O(1) | Welford-coupled; locked under Strict / HighWrite, racing under Relaxed |
| VarianceStat, SummaryStat | O(1) | O(1) | Welford-coupled |
| MomentsStat | O(1) | O(1) | Welford-coupled (four cells) |
| MadStat | O(t-digest compression) | O(log compression) | t-digest self-serialises |
| PairedSumStat | O(1) | O(1) | two independent additive cells |

Types

@Serializable
@SerialName(value = "BernoulliSumResult")
data class BernoulliSumResult(val successes: Double, val trials: Double) : Result

Sufficient statistics for Beta-Binomial inference: weighted successes and weighted trials. The Beta-Binomial conjugate posterior takes alpha = successes + alpha_0, beta = trials - successes + beta_0.
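The posterior update is plain arithmetic on the two fields. The sketch below uses a hypothetical BetaPosterior type and betaPosterior function that take the successes and trials values directly; the uniform default prior (alpha_0 = beta_0 = 1) is an assumption for the example, not a library default.

```kotlin
// Hypothetical holder for Beta(alpha, beta) parameters; not part of kumulant.
data class BetaPosterior(val alpha: Double, val beta: Double) {
    // Posterior mean of the success probability.
    val mean: Double get() = alpha / (alpha + beta)
}

// alpha0 and beta0 are the prior pseudo-counts; 1.0 / 1.0 is a uniform prior.
fun betaPosterior(
    successes: Double,
    trials: Double,
    alpha0: Double = 1.0,
    beta0: Double = 1.0,
): BetaPosterior = BetaPosterior(
    alpha = successes + alpha0,
    beta = trials - successes + beta0,
)
```

With 3 weighted successes out of 10 weighted trials and a uniform prior, this yields Beta(4, 8), whose mean is 1/3.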

class BernoulliSumStat(val concurrency: Concurrency = Concurrency.None) : SeriesStat<BernoulliSumResult>

Accumulates (Sum w_i*x_i, Sum w_i) where each update's value is interpreted as a Bernoulli success indicator (typically 0 or 1; soft probabilities work too).

@Serializable
@SerialName(value = "CountResult")
data class CountResult(val count: Long) : Result

Unweighted event count.

class CountStat(concurrency: Concurrency = Concurrency.None) : SeriesStat<SumResult>

Observation count: each update contributes 1 regardless of supplied value and weight.

@Serializable
@SerialName(value = "MadResult")
data class MadResult(val median: Double, val mad: Double) : Result, HasCenterScale

Streaming median and median absolute deviation.

class MadStat(val compression: Double = 100.0, val concurrency: Concurrency = Concurrency.None) : SeriesStat<MadResult>

Streaming median absolute deviation, the robust analog of standard deviation. Backed by two TDigestStats: one over raw values (for the running median estimate), one over |value - median| (for the MAD itself). The deviation digest is fed against the running median estimate at each update. Early observations are therefore measured against a biased median, so the MAD takes roughly tens to hundreds of updates to stabilise.

@Serializable
@SerialName(value = "MaxResult")
data class MaxResult(val max: Double) : Result

Running maximum of a stream.

class MaxStat(val concurrency: Concurrency = Concurrency.None) : SeriesStat<MaxResult>

Tracks the maximum value seen across a stream.

@Serializable
@SerialName(value = "MeanResult")
data class MeanResult(val mean: Double) : Result

Arithmetic mean.

class MeanStat(val concurrency: Concurrency = Concurrency.None) : SeriesStat<WeightedMeanResult>

Weighted arithmetic mean via Welford-style online update.

@Serializable
@SerialName(value = "MinResult")
data class MinResult(val min: Double) : Result

Running minimum of a stream.

class MinStat(val concurrency: Concurrency = Concurrency.None) : SeriesStat<MinResult>

Tracks the minimum value seen across a stream.

@Serializable
@SerialName(value = "MomentsResult")
data class MomentsResult(val totalWeights: Double, val mean: Double, val m2: Double, val m3: Double, val m4: Double) : Result, HasSampleVariance, HasShapeMoments, HasCenterScale

Mean and total weight plus the second through fourth central moments (m2..m4).

class MomentsStat(val concurrency: Concurrency = Concurrency.None) : SeriesStat<MomentsResult>

Weighted mean plus the second through fourth central moments (m2, m3, m4), from which skewness and kurtosis follow.
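How skewness and kurtosis follow from the moments depends on how m2..m4 are normalised, which this page does not state. The sketch below assumes they are unnormalised weighted sums of (x - mean)^k; if they are already divided by the total weight, pass totalWeights = 1.0. Both function names are hypothetical.

```kotlin
import kotlin.math.pow

// Assumption: m2, m3, m4 are unnormalised weighted sums of (x - mean)^k.
// Population skewness: third standardised moment.
fun skewness(totalWeights: Double, m2: Double, m3: Double): Double =
    (m3 / totalWeights) / (m2 / totalWeights).pow(1.5)

// Population kurtosis (not excess): fourth standardised moment.
fun kurtosis(totalWeights: Double, m2: Double, m4: Double): Double =
    (m4 / totalWeights) / (m2 / totalWeights).pow(2)
```

For the symmetric sample -1, 0, 1 the sums are m2 = 2, m3 = 0, m4 = 2, giving a skewness of 0 and a kurtosis of 1.5.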

@Serializable
@SerialName(value = "PairedSumResult")
data class PairedSumResult(val totalWeights: Double, val sumX: Double, val sumY: Double) : Result

Paired weighted-sum snapshot: Sum w_i*x_i, Sum w_i*y_i, and Sum w_i.

class PairedSumStat(val concurrency: Concurrency = Concurrency.None) : PairedStat<PairedSumResult>

Weighted paired sum (Sum w_i*x_i, Sum w_i*y_i) with accumulated weight.

@Serializable
@SerialName(value = "RangeResult")
data class RangeResult(val min: Double, val max: Double) : Result, HasMinMax

Running min/max pair of a stream.

class RangeStat(val concurrency: Concurrency = Concurrency.None) : SeriesStat<RangeResult>

Tracks the minimum and maximum value seen across a stream.

@Serializable
@SerialName(value = "SummaryResult")
data class SummaryResult(val totalWeights: Double, val mean: Double, val variance: Double, val min: Double, val max: Double) : Result, HasSampleVariance, HasCenterScale, HasMinMax

Snapshot exposing running mean, variance, min, and max simultaneously.

class SummaryStat(val concurrency: Concurrency = Concurrency.None) : SeriesStat<SummaryResult>

Comprehensive summary stat: Welford mean/variance plus monotonic min/max in one accumulator. The result implements both HasCenterScale (center=mean, scale=stdDev) and HasMinMax (min, max), so feedback projections can address Center/Scale and Low/High on the same primary. Use it when a per-coordinate primary fan-out needs to support both standardisation and min-max scaling.

@Serializable
@SerialName(value = "SumResult")
data class SumResult(val sum: Double) : Result

Weighted sum snapshot.

class SumStat(val concurrency: Concurrency = Concurrency.None) : SeriesStat<SumResult>

Weighted sum (Sum w_i*x_i) over the stream.

class TotalWeightsStat(concurrency: Concurrency = Concurrency.None) : SeriesStat<SumResult>

Sum of per-update weights, i.e. the effective sample size.

@Serializable
@SerialName(value = "VarianceResult")
data class VarianceResult(val mean: Double, val variance: Double) : Result

Mean and population variance.

class VarianceStat(val concurrency: Concurrency = Concurrency.None) : SeriesStat<WeightedVarianceResult>

Weighted mean and variance via Welford with Chan-style parallel merge.

@Serializable
@SerialName(value = "WeightedMeanResult")
data class WeightedMeanResult(val totalWeights: Double, val mean: Double) : Result

Weighted mean and accumulated weight.

@Serializable
@SerialName(value = "WeightedVarianceResult")
data class WeightedVarianceResult(val totalWeights: Double, val mean: Double, val variance: Double) : Result, HasSampleVariance, HasCenterScale

Weighted mean and variance with totalWeights for merge arithmetic.