stat.quantile

Bounded-memory quantile estimators and histograms. Each entry makes a different precision-versus-cost trade: a relative-error guarantee, fixed precision over a known range, reservoir sampling that hands raw values back, or constant memory at the cost of accuracy.

Picking a quantile estimator

Stat	Memory	Precision	Reach for it when
DDSketchStat	O(1 / relativeError)	Relative error guarantee	Latencies, payload sizes, any value spanning orders of magnitude. Merge across replicas is exact. The default percentile sketch.
TDigestStat	O(compression)	Tighter tail-quantile error than DDSketch at the same memory budget	You specifically care about the 99th / 99.9th percentile (tails) and not the body.
HdrHistogramStat	O(precision · log(range))	Strictest precision in a bounded range	The value range is known up front (e.g. latencies between 1 µs and 1 hr) and you want guaranteed precision in that range.
LinearHistogramStat	O(binCount)	Equal-width bin precision	Meaningful breakpoints are known up front; you want bins that match them directly with no rebucketing on read.
ReservoirHistogramStat	O(capacity)	Raw values back	Downstream needs the actual observations (to feed another stat or compute quantities the sketches don't expose).
FrugalQuantileStat	O(1); two variables	Coarse, single-quantile	You can fit only a few bytes per stat and only care about one percentile.
ThresholdBucketStat	O(thresholds)	Caller-supplied edges	You know the meaningful value buckets ahead of time and want per-bucket counts, not a quantile estimate.

Result shapes

Result	Shape
SketchResult	DDSketch snapshot: log-spaced bin map + precomputed quantiles at the configured probabilities
QuantileResult	FrugalQuantileStat single-quantile scalar
TDigestResult	t-digest centroids + precomputed quantiles
SparseHistogramResult	Parallel `[lowerBounds, upperBounds)` arrays with weights; produced by HdrHistogramStat and LinearHistogramStat, and by the toSparseHistogram projections of SketchResult, TDigestResult, and ReservoirResult
ReservoirResult	Bounded reservoir sample of raw values + the sampling weight
ThresholdBucketResult	Per-bucket weighted counts over caller-supplied edges

SketchResult and TDigestResult carry precomputed quantiles at the configured probabilities, and ReservoirResult answers quantile queries through its quantile(probability) extension. Each has a different shape, so the result type a downstream consumer sees depends on which estimator was picked. For a uniform downstream interface, project to SparseHistogramResult (the shared histogram shape) with toSparseHistogram.

PIT-style equiprobable histogram

The pitHistogram(numBins) factory in com.eignex.kumulant.stat.score is built from this family. PIT values are uniform on [0, 1] when the distributional forecasts are correct, so feeding them into an equal-width LinearHistogramStat over [0, 1] makes every bin equiprobable under a correct forecast. The bin counts then expose the deviation from uniformity that the corresponding PIT test consumes.

Merge

DDSketch, HDR, and t-digest merge exactly across replicas: DDSketch and HDR by cell-wise bin addition, t-digest by centroid combination.
LinearHistogram, ThresholdBucket merge exactly via cell-wise bin addition (same bin layout required).
ReservoirHistogram merges sample-weighted via reservoir union: the result is statistically equivalent to one large reservoir.
FrugalQuantile does not have a clean merge: it averages the two point estimates. Use it for single-stream tracking, not distributed aggregation.
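
As an illustration of cell-wise bin addition, the sketch below merges two sparse bin maps of the kind SketchResult exposes as positiveBins. The function name mergeBins is hypothetical and not part of the library; it assumes both maps were built with the same bin layout.

```kotlin
// Hypothetical helper, not the library's API: cell-wise addition of two
// sparse bin maps (bin index -> accumulated weight) with identical layout.
fun mergeBins(a: Map<Int, Double>, b: Map<Int, Double>): Map<Int, Double> {
    val out = a.toMutableMap()
    for ((index, weight) in b) {
        // A bin missing from one side contributes zero weight.
        out[index] = (out[index] ?: 0.0) + weight
    }
    return out
}
```

Because each bin is an independent sum, the merged map does not depend on the order in which replicas are combined.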

Concurrency

Histogram-shaped stats (DDSketchStat, HdrHistogramStat, LinearHistogramStat, ThresholdBucketStat) reduce each update to a single striped atomic increment on the destination bin, so they are exact under every com.eignex.kumulant.core.Concurrency level. ReservoirHistogramStat and FrugalQuantileStat keep coupled state and self-serialise under concurrent access. TDigestStat self-serialises through its own lock.

Types

DDSketchStat

class DDSketchStat(val relativeError: Double = 0.01, val probabilities: DoubleArray = doubleArrayOf(0.5, 0.75, 0.9, 0.95, 0.99, 0.999), val concurrency: Concurrency = Concurrency.None) : SeriesStat<SketchResult>

DDSketchStat: relative-error quantile sketch with logarithmic bins.
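
To show where the relative-error guarantee comes from, here is a minimal sketch of the standard DDSketch bin mapping for positive values. The function names are hypothetical, and it is an assumption that DDSketchStat uses exactly this mapping internally.

```kotlin
import kotlin.math.ceil
import kotlin.math.ln
import kotlin.math.pow

// Hypothetical helpers illustrating the textbook DDSketch mapping.

// Bin growth factor for a target relative error a: gamma = (1 + a) / (1 - a).
fun ddGamma(relativeError: Double): Double =
    (1.0 + relativeError) / (1.0 - relativeError)

// Index of the logarithmic bin holding a positive value: ceil(log_gamma(value)).
// Bin i covers the interval (gamma^(i-1), gamma^i].
fun ddBinIndex(value: Double, gamma: Double): Int {
    require(value > 0.0) { "only positive values map to a log bin" }
    return ceil(ln(value) / ln(gamma)).toInt()
}

// Representative value reported for bin i: 2 * gamma^i / (gamma + 1).
// Any value in the bin is within the target relative error of this estimate.
fun ddBinValue(index: Int, gamma: Double): Double =
    2.0 * gamma.pow(index) / (gamma + 1.0)
```

With relativeError = 0.01, gamma is about 1.0202, so each bin is roughly 2% wider than the previous one and the bin count grows with the logarithm of the value range.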

FrugalQuantileStat

class FrugalQuantileStat(val q: Double, val stepSize: Double = 0.01, val initialEstimate: Double = 0.0, val concurrency: Concurrency = Concurrency.None) : SeriesStat<QuantileResult>

Frugal-streaming single-quantile estimator.
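
The idea behind frugal streaming can be sketched as a stochastic nudge of a single estimate. The class below is a hypothetical, simplified fixed-step variant (in the spirit of Frugal-1U); FrugalQuantileStat may differ, for example by adapting the step, and the names here are not the library's.

```kotlin
import kotlin.random.Random

// Hypothetical fixed-step frugal estimator; not the library's implementation.
class FrugalSketch(
    private val q: Double,
    private val stepSize: Double = 0.01,
    initialEstimate: Double = 0.0,
    private val rng: Random = Random(42),
) {
    var estimate: Double = initialEstimate
        private set

    fun update(value: Double) {
        val r = rng.nextDouble()
        if (value > estimate && r < q) {
            // Values above the estimate push it up with probability q.
            estimate += stepSize
        } else if (value < estimate && r < 1.0 - q) {
            // Values below the estimate push it down with probability 1 - q.
            estimate -= stepSize
        }
    }
}
```

The up and down pushes balance when a fraction q of the stream lies below the estimate, which is why the estimate drifts toward the q-quantile. A smaller stepSize reduces jitter around it but slows convergence.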

HdrHistogramStat

class HdrHistogramStat(val lowestDiscernibleValue: Double = 0.001, val initialHighestTrackableValue: Double = 100.0, val significantDigits: Int = 3, val concurrency: Concurrency = Concurrency.None) : SeriesStat<SparseHistogramResult>

Auto-resizing High Dynamic Range (HDR) Histogram with native Double support.

LinearHistogramStat

class LinearHistogramStat(val lowerBound: Double, val upperBound: Double, val binCount: Int, val concurrency: Concurrency = Concurrency.None) : SeriesStat<SparseHistogramResult>

Fixed-width binned histogram over [lowerBound, upperBound) split into binCount buckets.
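
A minimal sketch of equal-width bin lookup over the half-open range is shown below. The function name is hypothetical, and returning null for out-of-range values is an assumption; the library may clamp or count them separately.

```kotlin
// Hypothetical helper: index of the equal-width bin containing value,
// or null when value falls outside [lowerBound, upperBound).
fun linearBinIndex(value: Double, lowerBound: Double, upperBound: Double, binCount: Int): Int? {
    require(upperBound > lowerBound && binCount > 0)
    if (value.isNaN() || value < lowerBound || value >= upperBound) return null
    val width = (upperBound - lowerBound) / binCount
    val index = ((value - lowerBound) / width).toInt()
    // Guard against floating-point rounding just below the upper edge.
    return minOf(index, binCount - 1)
}
```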

QuantileResult

@Serializable

@SerialName(value = "QuantileResult")

data class QuantileResult(val probability: Double, val quantile: Double) : Result

Single estimated quantile with the probability it targets.

ReservoirHistogramStat

class ReservoirHistogramStat(val capacity: Int = 1024, val seed: Long = Random.Default.nextLong(), val concurrency: Concurrency = Concurrency.None) : SeriesStat<ReservoirResult>

Weighted reservoir sample of size capacity via Algorithm A-Res (Efraimidis & Spirakis): each item with weight w draws u uniformly from (0, 1) and gets the key u^(1/w), and the capacity items with the largest keys are retained, giving an unbiased weight-proportional sample.
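
The key-and-retain step of A-Res can be sketched with a min-heap on the keys. The class below is a hypothetical JVM-only illustration (it uses java.util.PriorityQueue), not the library's implementation.

```kotlin
import java.util.PriorityQueue
import kotlin.math.pow
import kotlin.random.Random

// Hypothetical A-Res reservoir; names are illustrative only.
class WeightedReservoir(private val capacity: Int, seed: Long) {
    private val rng = Random(seed)

    // Min-heap of (key, value): the smallest retained key sits at the head
    // and is the first candidate for eviction.
    private val heap = PriorityQueue<Pair<Double, Double>>(compareBy<Pair<Double, Double>> { it.first })

    fun offer(value: Double, weight: Double = 1.0) {
        require(weight > 0.0) { "weights must be positive" }
        // Key u^(1/w): heavier items get keys closer to 1.
        val key = rng.nextDouble().pow(1.0 / weight)
        if (heap.size < capacity) {
            heap.add(key to value)
        } else if (key > heap.peek().first) {
            // The new key beats the smallest retained key: swap it in.
            heap.poll()
            heap.add(key to value)
        }
    }

    fun values(): List<Double> = heap.map { it.second }
}
```

Each offer costs O(log capacity), and memory stays at capacity entries regardless of stream length.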

ReservoirResult

@Serializable

@SerialName(value = "ReservoirResult")

data class ReservoirResult(val values: DoubleArray, val keys: DoubleArray, val capacity: Int, val totalSeen: Long, val totalWeight: Double) : Result

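Reservoir sampling snapshot: the retained values with their parallel A-Res keys, the configured capacity, and the running totals of observations (totalSeen) and weight (totalWeight).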

SketchResult

@Serializable

@SerialName(value = "SketchResult")

data class SketchResult(val probabilities: DoubleArray, val quantiles: DoubleArray, val gamma: Double, val totalWeights: Double, val zeroCount: Double, val positiveBins: Map<Int, Double>, val negativeBins: Map<Int, Double>) : Result

DDSketch snapshot: logarithmic bins plus precomputed quantiles for probabilities.

SparseHistogramResult

@Serializable

@SerialName(value = "SparseHistogramResult")

data class SparseHistogramResult(val lowerBounds: DoubleArray, val upperBounds: DoubleArray, val weights: DoubleArray) : Result

Histogram as parallel [lowerBounds, upperBounds) bucket arrays with weights.

TDigestResult

@Serializable

@SerialName(value = "TDigestResult")

data class TDigestResult(val probabilities: DoubleArray, val quantiles: DoubleArray, val means: DoubleArray, val weights: DoubleArray, val totalWeight: Double, val compression: Double) : Result

T-digest snapshot: means/weights are the centroid arrays sorted by mean, with quantiles precomputed for probabilities via CDF inversion.

TDigestStat

class TDigestStat(val compression: Double = 100.0, val probabilities: DoubleArray = doubleArrayOf(0.5, 0.75, 0.9, 0.95, 0.99, 0.999), val concurrency: Concurrency = Concurrency.None) : SeriesStat<TDigestResult>

Buffered merging t-digest (Dunning) with the k1 scaling function for high-fidelity extreme-quantile estimates and a bounded centroid count. compression (delta) caps the centroid count at roughly 6*delta.

ThresholdBucketResult

@Serializable

@SerialName(value = "ThresholdBucketResult")

data class ThresholdBucketResult(val thresholds: List<Double>, val counts: List<Double>) : Result

Per-bucket counts for a user-defined threshold list. For K strictly increasing thresholds t[0], t[1], ..., t[K-1] the result holds K + 1 counts: bucket 0 holds value <= t[0], bucket i (for 1 <= i <= K-1) holds t[i-1] < value <= t[i], and bucket K holds value > t[K-1].
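
Under this rule, the bucket for a value is the number of thresholds strictly below it. A minimal sketch using binary search follows; the function name is hypothetical, and the library's lookup may be implemented differently.

```kotlin
// Hypothetical helper: returns the bucket index in 0..thresholds.size,
// i.e. the first index i with thresholds[i] >= value (or size if none).
fun bucketIndex(thresholds: DoubleArray, value: Double): Int {
    var lo = 0
    var hi = thresholds.size
    while (lo < hi) {
        val mid = (lo + hi) ushr 1
        if (thresholds[mid] < value) lo = mid + 1 else hi = mid
    }
    return lo
}
```

For thresholds 1, 5, 10 this places 1.0 in bucket 0, 5.0 in bucket 1, 10.0 in bucket 2, and anything larger in bucket 3, matching the upper-inclusive edges.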

ThresholdBucketStat

class ThresholdBucketStat(val thresholds: DoubleArray, val concurrency: Concurrency = Concurrency.None) : SeriesStat<ThresholdBucketResult>

Weighted counter over user-defined value buckets.

Functions

quantile

fun ReservoirResult.quantile(probability: Double): Double

Linear-interpolated quantile at probability from a reservoir sample (treats sample as unweighted).
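
One common way to compute a linear-interpolated quantile from an unweighted sample is sketched below. The interpolation convention (position probability * (n - 1) in the sorted sample) is an assumption, as is the function name; the library's extension may use a different convention.

```kotlin
// Hypothetical helper: linear-interpolated quantile of an unweighted sample.
fun interpolatedQuantile(sample: DoubleArray, probability: Double): Double {
    require(sample.isNotEmpty()) { "sample must not be empty" }
    require(probability in 0.0..1.0) { "probability must lie in [0, 1]" }
    val sorted = sample.sortedArray()
    // Fractional rank in the sorted sample.
    val position = probability * (sorted.size - 1)
    val lo = position.toInt()
    val hi = minOf(lo + 1, sorted.size - 1)
    val fraction = position - lo
    // Interpolate between the two neighbouring order statistics.
    return sorted[lo] + fraction * (sorted[hi] - sorted[lo])
}
```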

toSparseHistogram

fun SketchResult.toSparseHistogram(): SparseHistogramResult

Project a SketchResult into a SparseHistogramResult by expanding its bin indices to bucket boundaries.

fun TDigestResult.toSparseHistogram(): SparseHistogramResult

Convert centroids to a sparse histogram with bins centered on each centroid.

fun ReservoirResult.toSparseHistogram(binCount: Int): SparseHistogramResult

Bucket the retained sample into binCount equal-width bins between min and max.
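
Equal-width bucketing between the sample minimum and maximum can be sketched as follows. The Bins class and function name are hypothetical stand-ins for SparseHistogramResult and the extension above. Placing the maximum in the last bin and giving every value unit weight are assumptions.

```kotlin
// Hypothetical stand-in for the parallel-array histogram shape.
class Bins(val lowerBounds: DoubleArray, val upperBounds: DoubleArray, val weights: DoubleArray)

// Hypothetical helper: equal-width bins spanning [min, max] of the sample.
fun equalWidthBins(sample: DoubleArray, binCount: Int): Bins {
    require(sample.isNotEmpty() && binCount > 0)
    val min = sample.minOf { it }
    val max = sample.maxOf { it }
    val width = (max - min) / binCount
    val lower = DoubleArray(binCount) { min + it * width }
    val upper = DoubleArray(binCount) { min + (it + 1) * width }
    val weights = DoubleArray(binCount)
    for (v in sample) {
        // A degenerate sample (min == max) goes entirely into bin 0;
        // the maximum itself is assigned to the last bin.
        val i = if (width == 0.0) 0 else minOf(((v - min) / width).toInt(), binCount - 1)
        weights[i] += 1.0
    }
    return Bins(lower, upper, weights)
}
```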