kumulant

GaussianNaiveBayesStat

class GaussianNaiveBayesStat(val featureSize: Int, val numClasses: Int, val varianceFloor: Double = 1.0E-9, val concurrency: Concurrency = Concurrency.None) : RegressionStat<GaussianNaiveBayesResult>

Online Gaussian Naive Bayes classifier. Tracks per-class, per-feature running mean and variance via weighted Welford, plus per-class accumulated weight as the prior. Predict-time log-likelihoods assume features are conditionally independent within each class.

Use cases: cheap multiclass classifier with calibrated probabilities, useful as a baseline against SoftmaxRegressionStat or as a fallback for sparse / high-cardinality feature spaces where SGD is slow to converge.

Memory: O(numClasses * featureSize); three flat cells per (class, feature) pair (mean, M2, totalWeights), plus a per-class weight.

Update: O(featureSize) per observation for both dense and sparse inputs; an implicit zero is still an observation of value 0, so the variance update visits every feature.
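The update described above can be sketched as a weighted Welford step over flat arrays. This is a minimal standalone illustration, not kumulant's internals: WelfordCells and its members are hypothetical names, and it keeps only mean and M2 per (class, feature) pair, assuming every feature of a class is observed with the same weight so the per-class weight can stand in for the per-pair totalWeights cell.

```kotlin
// Hypothetical standalone sketch; names are illustrative, not kumulant's internals.
class WelfordCells(val featureSize: Int, val numClasses: Int) {
    // Flat per-(class, feature) cells, indexed as cls * featureSize + j.
    val mean = DoubleArray(numClasses * featureSize)
    val m2 = DoubleArray(numClasses * featureSize)
    // Accumulated weight per class; doubles as the unnormalised prior.
    val classWeight = DoubleArray(numClasses)

    fun update(x: DoubleArray, cls: Int, weight: Double = 1.0) {
        require(x.size == featureSize) { "expected $featureSize features, got ${x.size}" }
        require(cls in 0 until numClasses) { "class $cls out of range" }
        if (weight <= 0.0) return // assumption: non-positive weights are ignored
        val newWeight = classWeight[cls] + weight
        val base = cls * featureSize
        for (j in 0 until featureSize) {
            val delta = x[j] - mean[base + j]
            // Weighted Welford: shift the mean by the weighted share of delta...
            mean[base + j] += (weight / newWeight) * delta
            // ...then grow M2 using the deviation before and after the shift.
            m2[base + j] += weight * delta * (x[j] - mean[base + j])
        }
        classWeight[cls] = newWeight
    }

    // Population variance of feature j in class cls (M2 divided by total weight).
    fun variance(cls: Int, j: Int): Double =
        if (classWeight[cls] > 0.0) m2[cls * featureSize + j] / classWeight[cls] else 0.0
}
```

Each call touches exactly featureSize cells of one class, which is where the O(featureSize) update cost comes from.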

Concurrency: the per-class Welford state is coupled across cells, so Concurrency.Strict and Concurrency.HighWrite guard it with a lock. Concurrency.None skips synchronisation entirely; Concurrency.Relaxed runs without the lock and may drift across cells.

Constructors

constructor(featureSize: Int, numClasses: Int, varianceFloor: Double = 1.0E-9, concurrency: Concurrency = Concurrency.None)

Properties

open override val concurrency: Concurrency

The thread-safety contract this stat was constructed with. Each stat picks the cell-encoding and lock strategy that honours this contract for its mathematical structure.

open override val featureSize: Int

Number of features expected in x on each update. Mismatched lengths throw.

val numClasses: Int

Number of classes; the input y must round to an integer class index in [0, numClasses).

val varianceFloor: Double

Lower bound applied to per-class variances when computing log-likelihoods. Prevents log(0) blow-ups on early or constant-feature data.

Functions

open override fun create(concurrency: Concurrency? = null): GaussianNaiveBayesStat

Spawn a fresh accumulator with the same configuration. Optionally override the Concurrency; useful for materialising a wire spec at a different concurrency level than the source.

open override fun merge(values: GaussianNaiveBayesResult)

Weight-pooled merge: combines per-class running means and M2 using Chan's parallel-Welford formula. Exact under weighted updates.

open override fun read(timestampNanos: Long = currentTimeNanos()): GaussianNaiveBayesResult

Materialise the current state as an immutable Result. Reads never mutate, so the caller can read as often as it likes without affecting the stream.

open override fun reset()

Reset the stat to its prior-seeded baseline. Equivalent to constructing a fresh stat with the same configuration, but in place; keeps the same Concurrency and any per-stat tunables.

open fun update(x: VectorView, y: Double, weight: Double = 1.0)

Record an (x, y) observation with the given weight at the current time.

open fun update(x: DoubleArray, y: Double, weight: Double = 1.0)

Convenience overload that wraps x as a DenseVector.

open fun update(x: DoubleArray, y: Double, timestampNanos: Long, weight: Double = 1.0)

Timestamped convenience overload that wraps x as a DenseVector.

open override fun update(x: VectorView, y: Double, timestampNanos: Long, weight: Double = 1.0)

Record an (x, y) observation at timestampNanos with the given weight.

GaussianNaiveBayesStat

constructor(featureSize: Int, numClasses: Int, varianceFloor: Double = 1.0E-9, concurrency: Concurrency = Concurrency.None)

concurrency

open override val concurrency: Concurrency

The thread-safety contract this stat was constructed with. Each stat picks the cell-encoding and lock strategy that honours this contract for its mathematical structure.

Picked at construction; immutable after.

create

open override fun create(concurrency: Concurrency? = null): GaussianNaiveBayesStat

Spawn a fresh accumulator with the same configuration. Optionally override the Concurrency; useful for materialising a wire spec at a different concurrency level than the source.

The returned stat is independent: its state starts at the configured baseline, not at the source's current state. Each modality subtype narrows the return type so chaining doesn't lose the modality.

featureSize

open override val featureSize: Int

Number of features expected in x on each update. Mismatched lengths throw.

merge

open override fun merge(values: GaussianNaiveBayesResult)

Weight-pooled merge: combines per-class running means and M2 using Chan's parallel-Welford formula. Exact under weighted updates.
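For reference, the pooling step for a single (class, feature) cell can be sketched as below. This is a minimal standalone illustration of Chan's parallel formula with weights in place of counts; Moments and mergeMoments are hypothetical names, not types from this library.

```kotlin
// Hypothetical standalone sketch; Moments is an illustrative type, not kumulant's.
data class Moments(val weight: Double, val mean: Double, val m2: Double)

// Pool two weighted (weight, mean, M2) summaries of the same class and feature.
fun mergeMoments(a: Moments, b: Moments): Moments {
    // An empty side contributes nothing; this also avoids dividing by zero.
    if (a.weight == 0.0) return b
    if (b.weight == 0.0) return a
    val w = a.weight + b.weight
    val delta = b.mean - a.mean
    val mean = a.mean + delta * (b.weight / w)
    // The cross term accounts for the gap between the two partial means.
    val m2 = a.m2 + b.m2 + delta * delta * (a.weight * b.weight / w)
    return Moments(w, mean, m2)
}
```

Merging the summary of {1, 3} with the summary of {5} gives weight 3, mean 3, and M2 8, the same as a single pass over all three values.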

numClasses

val numClasses: Int

Number of classes; the input y must round to an integer class index in [0, numClasses).
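One way to read the rounding rule is sketched below. The helper name classIndex, the rounding mode (ties toward positive infinity, as roundToInt does), and the choice to throw on out-of-range labels are all assumptions; the page does not say how the library handles any of them.

```kotlin
import kotlin.math.roundToInt

// Hypothetical helper; the name, rounding mode, and exception type are assumptions.
fun classIndex(y: Double, numClasses: Int): Int {
    require(y.isFinite()) { "label must be finite, got $y" }
    val cls = y.roundToInt()
    require(cls in 0 until numClasses) { "label $y rounds to $cls, outside [0, $numClasses)" }
    return cls
}
```

With numClasses = 3, a label of 1.2 maps to class 1 and 1.6 maps to class 2, while 2.6 rounds to 3 and is rejected.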

read

open override fun read(timestampNanos: Long = currentTimeNanos()): GaussianNaiveBayesResult

Materialise the current state as an immutable Result. Reads never mutate, so the caller can read as often as it likes without affecting the stream.

Snapshot consistency depends on the configured Concurrency. Under Concurrency.Strict / Concurrency.HighWrite a read locks against writers so coupled cells stay consistent. Under Concurrency.Relaxed the cells race, and under heavy contention the snapshot may drift by a few ULPs; the drift is bounded and the read never throws.
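The locked case can be pictured with a single coupled cell group. This is a rough sketch of the general idea only, assuming one ReentrantLock shared by writers and readers; LockedMoments is a hypothetical class and says nothing about how Strict or HighWrite are actually implemented here.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Hypothetical sketch of a locked coupled-cell accumulator; not kumulant's implementation.
class LockedMoments {
    private val lock = ReentrantLock()
    private var weight = 0.0
    private var mean = 0.0
    private var m2 = 0.0

    fun update(x: Double, w: Double = 1.0) {
        require(w > 0.0) { "weight must be positive" }
        lock.withLock {
            // All three cells change together, so a reader never sees a half-applied update.
            weight += w
            val delta = x - mean
            mean += (w / weight) * delta
            m2 += w * delta * (x - mean)
        }
    }

    // Under the lock, weight, mean, and m2 all reflect the same set of updates.
    fun read(): Triple<Double, Double, Double> = lock.withLock { Triple(weight, mean, m2) }
}
```

Without the lock, a reader could pick up the new weight together with the old mean, which is the kind of cross-cell drift the relaxed mode tolerates.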

timestampNanos is the read timestamp. Stats that don't care about time silently drop it; stats that do (rates, decay families, recency, windowed wrappers) use it as the ordering signal.

reset

open override fun reset()

Reset the stat to its prior-seeded baseline. Equivalent to constructing a fresh stat with the same configuration, but in place; keeps the same Concurrency and any per-stat tunables.

update

open override fun update(x: VectorView, y: Double, timestampNanos: Long, weight: Double = 1.0)

Record an (x, y) observation at timestampNanos with the given weight.

varianceFloor

val varianceFloor: Double

Lower bound applied to per-class variances when computing log-likelihoods. Prevents log(0) blow-ups on early or constant-feature data.
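To show where the floor enters, here is a minimal sketch of per-class log scores (log prior plus summed per-feature Gaussian log-densities). The function logScores, its parameters, and the flat array layout are hypothetical; the page does not show the prediction API, so treat this as an illustration of the formula rather than of the library.

```kotlin
import kotlin.math.PI
import kotlin.math.ln
import kotlin.math.max

// Hypothetical standalone sketch; parameter names are illustrative, not kumulant's.
// mean and variance are flat arrays indexed as cls * featureSize + j.
fun logScores(
    x: DoubleArray,
    mean: DoubleArray,
    variance: DoubleArray,
    classWeight: DoubleArray,
    varianceFloor: Double = 1.0E-9
): DoubleArray {
    val featureSize = x.size
    val totalWeight = classWeight.sum()
    return DoubleArray(classWeight.size) { cls ->
        if (classWeight[cls] <= 0.0) {
            // Assumption: a class with no accumulated weight gets zero prior probability.
            Double.NEGATIVE_INFINITY
        } else {
            var score = ln(classWeight[cls] / totalWeight) // log prior
            for (j in 0 until featureSize) {
                // The floor keeps ln(v) and d * d / v finite for constant features.
                val v = max(variance[cls * featureSize + j], varianceFloor)
                val d = x[j] - mean[cls * featureSize + j]
                // Independence assumption: per-feature log-densities simply add up.
                score += -0.5 * (ln(2.0 * PI * v) + d * d / v)
            }
            score
        }
    }
}
```

A feature whose observed variance is exactly zero then yields a very negative but finite score for points away from the mean, instead of NaN or infinity.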