kumulant

StochasticRegressionStat

class StochasticRegressionStat(val featureSize: Int, val optimizer: OptimizerSpec = Sgd(), val biasOptimizer: OptimizerSpec = optimizer, val penalty: Penalty = Penalty.None, val link: Link = Link.Identity, val concurrency: Concurrency = Concurrency.None) : RegressionStat<StochasticRegressionResult>

Online generalised linear regression by stochastic gradient descent on the canonical Link's negative log-likelihood plus an optional Penalty. The cheapest of the multivariate regressors: it produces point estimates only, with no posterior, in exchange for fast updates.
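As a rough illustration of what one SGD step on a canonical-link negative log-likelihood looks like, here is a minimal dense sketch. It does not use the library: SketchLink and SketchSgdGlm are hypothetical names, the learning rate is a fixed constant, and the real class delegates the step size to its OptimizerSpec. The one fact it relies on is that, for a canonical link, the derivative of the negative log-likelihood with respect to the linear predictor is mean(eta) - y.

```kotlin
import kotlin.math.exp

// Hypothetical stand-in for the library's Link: only the inverse link (mean function) is modelled.
enum class SketchLink(val mean: (Double) -> Double) {
    IDENTITY({ eta -> eta }),
    LOGIT({ eta -> 1.0 / (1.0 + exp(-eta)) }),
    LOG({ eta -> exp(eta) })
}

// Hypothetical minimal regressor; not the library's implementation.
class SketchSgdGlm(
    featureSize: Int,
    private val link: SketchLink,
    private val learningRate: Double
) {
    val weights = DoubleArray(featureSize)
    var bias = 0.0
        private set

    // Predicted mean: inverse link applied to the linear predictor.
    fun predict(x: DoubleArray): Double {
        var eta = bias
        for (i in x.indices) eta += weights[i] * x[i]
        return link.mean(eta)
    }

    fun update(x: DoubleArray, y: Double, weight: Double = 1.0) {
        require(x.size == weights.size) { "expected ${weights.size} features, got ${x.size}" }
        // Canonical link: d(NLL)/d(eta) = mean(eta) - y.
        val residual = predict(x) - y
        val scale = learningRate * weight * residual
        for (i in x.indices) weights[i] -= scale * x[i]
        bias -= scale
    }
}
```

With SketchLink.IDENTITY this reduces to least-squares SGD; swapping in LOGIT or LOG changes only the mean function, not the shape of the update.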

The per-coordinate update rule is owned by optimizer (Sgd / com.eignex.kumulant.schema.Adagrad / com.eignex.kumulant.schema.Rmsprop / com.eignex.kumulant.schema.Adam). The bias has its own biasOptimizer schedule because the intercept usually wants a different cadence than the coefficients.
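To make the idea of a pluggable per-coordinate update rule concrete, the sketch below contrasts a plain SGD step with an Adagrad step. The SketchOptimizer interface and both classes are hypothetical; the real OptimizerSpec types may be shaped quite differently. The point is only that the rule maps a coordinate's gradient to a step, and that adaptive rules carry auxiliary per-coordinate state.

```kotlin
import kotlin.math.sqrt

// Hypothetical interface; the library's OptimizerSpec is not shown in this page.
interface SketchOptimizer {
    /** Returns the amount to subtract from coordinate [index] given its gradient. */
    fun step(index: Int, gradient: Double): Double
}

// Plain SGD: a fixed learning rate and no auxiliary state.
class SketchSgd(private val learningRate: Double) : SketchOptimizer {
    override fun step(index: Int, gradient: Double): Double = learningRate * gradient
}

// Adagrad: the step shrinks as squared gradients accumulate on a coordinate.
class SketchAdagrad(
    size: Int,
    private val learningRate: Double,
    private val epsilon: Double = 1e-8
) : SketchOptimizer {
    private val sumSquares = DoubleArray(size) // auxiliary per-coordinate state

    override fun step(index: Int, gradient: Double): Double {
        sumSquares[index] += gradient * gradient
        return learningRate * gradient / (sqrt(sumSquares[index]) + epsilon)
    }
}
```

Under this sketch, a separate bias schedule would simply be a second SketchOptimizer instance of size 1, which is one way to read the optimizer / biasOptimizer split.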

Penalty.L1 and Penalty.L2 require optimizer (and biasOptimizer) to be Sgd; the lazy-update tricks they rely on (Bottou-style multiplicative scaling for L2; cumulative truncated gradient for L1) are SGD-specific. With a non-Sgd optimizer the penalty must be Penalty.None; folding L1/L2 into Adam-class updates is left for a future refactor.
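The multiplicative-scaling trick for L2 can be sketched as follows, assuming squared loss, a constant learning rate, and sparse input given as parallel index/value arrays. SketchLazyL2 is a hypothetical class, not the library's code. The weights are stored as w = scale * v, so the L2 shrinkage of every coordinate becomes a single multiplication of scale, and only the non-zero coordinates of x are touched.

```kotlin
// Hypothetical sketch of lazy L2 shrinkage via a shared scale factor.
class SketchLazyL2(
    featureSize: Int,
    private val learningRate: Double,
    private val lambda: Double
) {
    private val v = DoubleArray(featureSize)
    private var scale = 1.0

    init {
        require(learningRate * lambda < 1.0) { "shrink factor must stay positive" }
    }

    // Effective weight: the stored value times the shared scale.
    fun weight(i: Int): Double = scale * v[i]

    fun update(indices: IntArray, values: DoubleArray, y: Double) {
        var dot = 0.0
        for (k in indices.indices) dot += v[indices[k]] * values[k]
        val residual = scale * dot - y        // squared-loss gradient w.r.t. the prediction
        scale *= 1.0 - learningRate * lambda  // shrinks every weight in O(1)
        if (scale < 1e-9) {                   // rare O(featureSize) fold to avoid underflow
            for (i in v.indices) v[i] *= scale
            scale = 1.0
        }
        val s = learningRate * residual / scale
        for (k in indices.indices) v[indices[k]] -= s * values[k]
    }
}
```

Because an adaptive optimizer rescales each coordinate's step differently, a single shared scale no longer represents the shrinkage, which is consistent with the restriction to Sgd described above.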

Use cases: high-throughput online regression where point estimates suffice and the per-update cost must stay small. Reach for DiagonalRegressionStat when uncertainty is needed, and for BayesianRegressionStat when the full posterior is needed.

Memory: O(featureSize) for the weight vector, the bias, and the optimizer's auxiliary state.

Update: O(nnz(x)) per observation under Penalty.None; the L1/L2 paths add lazy-update bookkeeping with the same asymptotic cost.

Concurrency: Welford-coupled per-slot atomic under Concurrency.Relaxed (HOGWILD-style asynchronous SGD), serialised under Concurrency.Strict / Concurrency.HighWrite.
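The following sketch shows one common way to get lock-free per-slot updates on the JVM: store each weight's bit pattern in an AtomicLongArray and retry a compare-and-set until it succeeds. SketchAtomicWeights is hypothetical, and the library's actual cell encoding is not described on this page, so treat this as an illustration of the HOGWILD-style idea only.

```kotlin
import java.util.concurrent.atomic.AtomicLongArray

// Hypothetical per-slot atomic weight vector; not the library's cell encoding.
class SketchAtomicWeights(size: Int) {
    private val bits = AtomicLongArray(size) // all-zero bits encode 0.0

    operator fun get(i: Int): Double = Double.fromBits(bits.get(i))

    // Adds delta to slot i without locking; retries if another thread won the race.
    fun add(i: Int, delta: Double) {
        while (true) {
            val old = bits.get(i)
            val updated = (Double.fromBits(old) + delta).toRawBits()
            if (bits.compareAndSet(i, old, updated)) return
        }
    }
}
```

Each slot is individually consistent, but a reader scanning the whole vector may see some slots before and some after a concurrent update; that is the trade-off asynchronous SGD accepts.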

Constructors

constructor(featureSize: Int, optimizer: OptimizerSpec = Sgd(), biasOptimizer: OptimizerSpec = optimizer, penalty: Penalty = Penalty.None, link: Link = Link.Identity, concurrency: Concurrency = Concurrency.None)

Properties

bias

Live view of the running intercept.

val biasOptimizer: OptimizerSpec

Update rule for the bias scalar. Defaults to optimizer.

open override val concurrency: Concurrency

The thread-safety contract this stat was constructed with. Each stat picks the cell-encoding and lock strategy that honours this contract for its mathematical structure.

open override val featureSize: Int

Number of features expected in x on each update. Mismatched lengths throw.

val link: Link

Canonical GLM link function; Link.Identity gives ordinary least-squares SGD.

val optimizer: OptimizerSpec

Per-coordinate update rule for the weight vector.

val penalty: Penalty

Regularisation applied during the gradient step; requires Sgd optimizers.

val sse: Double

Live view of the accumulated per-link loss.

val step: Long

Live view of the per-observation step counter.

totalWeights

Live view of the cumulative observation weight folded in.

Functions

open override fun create(concurrency: Concurrency? = null): StochasticRegressionStat

Spawn a fresh accumulator with the same configuration. Optionally override the Concurrency; useful for materialising a wire spec at a different concurrency level than the source.

open override fun merge(values: StochasticRegressionResult)

Sample-weighted blend of weights and bias. SGD has no second-moment information, so this is an approximation; for principled merges use BayesianRegressionStat.

open override fun read(timestampNanos: Long = currentTimeNanos()): StochasticRegressionResult

Materialise the current state as an immutable Result. Reads never mutate, so the caller can read as often as it likes without affecting the stream.

open override fun reset()

Reset the stat to its prior-seeded baseline. Equivalent to constructing a fresh stat with the same configuration, but in place; keeps the same Concurrency and any per-stat tunables.

open fun update(x: VectorView, y: Double, weight: Double = 1.0)

Record an (x, y) observation with the given weight at the current time.

open fun update(x: DoubleArray, y: Double, weight: Double = 1.0)

Convenience overload that wraps x as a DenseVector.

open fun update(x: DoubleArray, y: Double, timestampNanos: Long, weight: Double = 1.0)

Timestamped convenience overload that wraps x as a DenseVector.

open override fun update(x: VectorView, y: Double, timestampNanos: Long, weight: Double = 1.0)

Record an (x, y) observation at timestampNanos with the given weight.

StochasticRegressionStat

constructor(featureSize: Int, optimizer: OptimizerSpec = Sgd(), biasOptimizer: OptimizerSpec = optimizer, penalty: Penalty = Penalty.None, link: Link = Link.Identity, concurrency: Concurrency = Concurrency.None)

biasOptimizer

val biasOptimizer: OptimizerSpec

Update rule for the bias scalar. Defaults to optimizer.

bias

Live view of the running intercept.

concurrency

open override val concurrency: Concurrency

The thread-safety contract this stat was constructed with. Each stat picks the cell-encoding and lock strategy that honours this contract for its mathematical structure.

Picked at construction; immutable after.

create

open override fun create(concurrency: Concurrency? = null): StochasticRegressionStat

Spawn a fresh accumulator with the same configuration. Optionally override the Concurrency; useful for materialising a wire spec at a different concurrency level than the source.

The returned stat is independent: its state starts at the configured baseline, not at the source's current state. Each modality subtype narrows the return type so chaining doesn't lose the modality.

featureSize

open override val featureSize: Int

Number of features expected in x on each update. Mismatched lengths throw.

merge

open override fun merge(values: StochasticRegressionResult)

Sample-weighted blend of weights and bias. SGD has no second-moment information, so this is an approximation; for principled merges use BayesianRegressionStat.
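A sample-weighted blend can be sketched as a convex combination whose coefficients are each side's share of the total observation weight. SketchResult and blend are hypothetical names, and the fields are assumptions about what a result would need to carry; the actual StochasticRegressionResult layout is not shown on this page.

```kotlin
// Hypothetical result holder; field names are assumptions for illustration.
class SketchResult(val weights: DoubleArray, val bias: Double, val totalWeights: Double)

// Blends two point estimates in proportion to the observation weight behind each.
fun blend(a: SketchResult, b: SketchResult): SketchResult {
    require(a.weights.size == b.weights.size) { "feature sizes differ" }
    val total = a.totalWeights + b.totalWeights
    if (total == 0.0) return a // nothing observed on either side
    val shareA = a.totalWeights / total
    val shareB = b.totalWeights / total
    val merged = DoubleArray(a.weights.size) { i -> shareA * a.weights[i] + shareB * b.weights[i] }
    return SketchResult(merged, shareA * a.bias + shareB * b.bias, total)
}
```

Averaging parameters this way ignores how well each coordinate was determined by the data on each side, which is why the blend is only an approximation.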

optimizer

val optimizer: OptimizerSpec

Per-coordinate update rule for the weight vector.

penalty

val penalty: Penalty

Regularisation applied during the gradient step; requires Sgd optimizers.
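For the L1 case, one well-known lazy scheme is cumulative-penalty clipping: track the total penalty any weight could have received so far, track how much each weight has actually received, and let a weight catch up only when its feature is next seen. The sketch below assumes squared loss and a constant learning rate; SketchLazyL1 is hypothetical, and whether the library uses exactly this variant is not stated here.

```kotlin
import kotlin.math.max
import kotlin.math.min

// Hypothetical sketch of lazy L1 via cumulative-penalty clipping.
class SketchLazyL1(
    featureSize: Int,
    private val learningRate: Double,
    private val lambda: Double
) {
    val weights = DoubleArray(featureSize)
    private val applied = DoubleArray(featureSize) // signed penalty already applied per weight
    private var budget = 0.0                        // penalty every weight could have received

    fun update(indices: IntArray, values: DoubleArray, y: Double) {
        var prediction = 0.0
        for (k in indices.indices) prediction += weights[indices[k]] * values[k]
        val residual = prediction - y
        budget += learningRate * lambda
        for (k in indices.indices) {
            val i = indices[k]
            weights[i] -= learningRate * residual * values[k] // plain gradient step
            val before = weights[i]
            // Apply the outstanding penalty, clipping at zero so the sign never flips.
            if (before > 0.0) weights[i] = max(0.0, before - (budget + applied[i]))
            else if (before < 0.0) weights[i] = min(0.0, before + (budget - applied[i]))
            applied[i] += weights[i] - before
        }
    }
}
```

A feature first seen on the second update receives two steps' worth of penalty at once, which is what keeps the per-observation cost proportional to the number of non-zero inputs.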

read

open override fun read(timestampNanos: Long = currentTimeNanos()): StochasticRegressionResult

Materialise the current state as an immutable Result. Reads never mutate, so the caller can read as often as it likes without affecting the stream.

Snapshot consistency depends on the configured Concurrency. Under Concurrency.Strict / Concurrency.HighWrite a read locks against writers so coupled cells stay consistent. Under Concurrency.Relaxed the cells race and the snapshot may drift by ULPs of the workload under heavy contention; the drift is bounded and the read never throws.

timestampNanos is the read timestamp. Stats that don't care about time silently drop it; stats that do (rates, decay families, recency, windowed wrappers) use it as the ordering signal.

reset

open override fun reset()

Reset the stat to its prior-seeded baseline. Equivalent to constructing a fresh stat with the same configuration, but in place; keeps the same Concurrency and any per-stat tunables.

sse

val sse: Double

Live view of the accumulated per-link loss.

step

val step: Long

Live view of the per-observation step counter.

totalWeights

Live view of the cumulative observation weight folded in.

update

open override fun update(x: VectorView, y: Double, timestampNanos: Long, weight: Double = 1.0)

Record an (x, y) observation at timestampNanos with the given weight.