SoftmaxRegressionStat

class SoftmaxRegressionStat(val featureSize: Int, val numClasses: Int, val optimizer: OptimizerSpec = Sgd(), val biasOptimizer: OptimizerSpec = optimizer, val concurrency: Concurrency = Concurrency.None) : RegressionStat<SoftmaxRegressionResult>

Online multinomial logistic regression by stochastic gradient descent on the softmax cross-entropy loss. Generalises com.eignex.kumulant.stat.regression.glm.StochasticRegressionStat with Link.Logit from binary to K-way classification.

Update step per observation (true class c = y.toInt()):

z[k]       = W[k] . x + b[k]                // per-class logit
p[k]       = softmax(z)[k]                  // softmax over all numClasses logits
grad[k][i] = (p[k] - 1[k == c]) * x[i]      // per-coordinate gradient
W[k][i]   += optimizer.computeDelta(k, i, grad[k][i], weight)
b[k]      += biasOptimizer.computeDelta(k, p[k] - 1[k == c], weight)

One OptimizerSpec is materialised per class for the weight matrix; bias is a single optimizer over numClasses slots.
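
The update step above can be sketched in self-contained Kotlin. This is a minimal illustration, not the library's implementation: SoftmaxSketch, its learningRate parameter, and the fixed-rate delta (-learningRate * weight * gradient) are assumptions standing in for whatever OptimizerSpec.computeDelta actually returns.

```kotlin
import kotlin.math.exp

// Hypothetical stand-in for the stat's internal state: dense weights plus one bias per class.
class SoftmaxSketch(val featureSize: Int, val numClasses: Int, val learningRate: Double = 0.1) {
    val weights = Array(numClasses) { DoubleArray(featureSize) }
    val bias = DoubleArray(numClasses)

    // Numerically stable softmax over the per-class logits W[k] . x + b[k].
    fun predict(x: DoubleArray): DoubleArray {
        require(x.size == featureSize) { "expected $featureSize features, got ${x.size}" }
        val p = DoubleArray(numClasses) { k ->
            var z = bias[k]
            for (i in x.indices) z += weights[k][i] * x[i]
            z
        }
        // Subtract the largest logit before exponentiating to avoid overflow.
        var max = Double.NEGATIVE_INFINITY
        for (z in p) if (z > max) max = z
        var sum = 0.0
        for (k in p.indices) {
            p[k] = exp(p[k] - max)
            sum += p[k]
        }
        for (k in p.indices) p[k] /= sum
        return p
    }

    // One SGD step on the softmax cross-entropy loss for a single weighted observation.
    fun update(x: DoubleArray, y: Double, weight: Double = 1.0) {
        val c = y.toInt()
        require(c in 0 until numClasses) { "class $c outside [0, $numClasses)" }
        val p = predict(x)
        for (k in 0 until numClasses) {
            val err = p[k] - (if (k == c) 1.0 else 0.0)
            // Assumed plain-SGD delta; the real optimizer may add momentum, decay, etc.
            for (i in x.indices) weights[k][i] -= learningRate * weight * err * x[i]
            bias[k] -= learningRate * weight * err
        }
    }
}
```

With all-zero initial parameters, predict returns the uniform distribution; repeated calls to update shift probability mass towards the observed class for similar inputs.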

Memory: O(numClasses * featureSize) for weights + per-optimizer aux state.
Update: O(numClasses * nnz(x)) per observation.
Concurrency: Welford-locked; the optimizer aux state honours the same Concurrency passed in.

Constructors

SoftmaxRegressionStat

constructor(featureSize: Int, numClasses: Int, optimizer: OptimizerSpec = Sgd(), biasOptimizer: OptimizerSpec = optimizer, concurrency: Concurrency = Concurrency.None)

Properties

biasOptimizer

val biasOptimizer: OptimizerSpec

Bias optimizer, materialised once over numClasses slots. Defaults to optimizer.

concurrency

open override val concurrency: Concurrency

The thread-safety contract this stat was constructed with. Each stat picks the cell-encoding and lock strategy that honours this contract for its mathematical structure:

Concurrency.None: single-threaded; no synchronisation. Cheapest path.
Concurrency.Relaxed: lock-free best-effort. Multi-cell stats (Welford-style MeanStat, VarianceStat, MomentsStat) may drift under contention but never throw.
Concurrency.Strict: serialised when needed for full correctness across coupled cells. Sketches always self-serialise; Welford stats lock per update.
Concurrency.HighWrite: optimised for many concurrent writers; JVM uses striped adders for naively additive stats.

Picked at construction; immutable after.
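
As an illustration of the HighWrite idea for a naively additive stat, the sketch below uses the JDK's java.util.concurrent.atomic.DoubleAdder, which spreads contended writes over several internal cells and sums them on read. Whether kumulant uses DoubleAdder specifically is an assumption; StripedSum and sumFromManyWriters are hypothetical names.

```kotlin
import java.util.concurrent.atomic.DoubleAdder
import kotlin.concurrent.thread

// Hypothetical additive stat: a running sum that many threads write to concurrently.
class StripedSum {
    private val adder = DoubleAdder()

    // Writers add into striped cells, so contended updates rarely block each other.
    fun update(value: Double) = adder.add(value)

    // A read folds the cells into a single total.
    fun read(): Double = adder.sum()
}

// Starts `writers` threads that each add 1.0 `updatesPerWriter` times, then reads the total.
fun sumFromManyWriters(writers: Int, updatesPerWriter: Int): Double {
    val stat = StripedSum()
    val threads = List(writers) {
        thread { repeat(updatesPerWriter) { stat.update(1.0) } }
    }
    threads.forEach { it.join() }
    return stat.read()
}
```

This only works because the sum is order-independent; stats with coupled cells, such as the Welford family, need the locking described under Concurrency.Strict.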

crossEntropy

val crossEntropy: Double

Live view of the accumulated weighted cross-entropy.
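
A rough sketch of how a weighted cross-entropy can be accumulated per observation, assuming the value is a running sum of weight * -ln(p[c]) for the true class c. Whether the stat reports a sum or a weight-normalised mean is not stated here; CrossEntropyAccumulator and its clamping constant are hypothetical.

```kotlin
import kotlin.math.ln

// Hypothetical accumulator; the real stat's internals are not shown on this page.
class CrossEntropyAccumulator {
    var crossEntropy = 0.0 // running sum of weight * -ln(p[trueClass])
        private set
    var totalWeights = 0.0 // cumulative observation weight
        private set

    fun add(probabilities: DoubleArray, trueClass: Int, weight: Double = 1.0) {
        // Clamp away from zero so a confident wrong prediction does not yield infinity.
        val p = probabilities[trueClass].coerceAtLeast(1e-15)
        crossEntropy += -weight * ln(p)
        totalWeights += weight
    }

    // Weight-normalised loss, convenient for comparing streams of different length.
    fun mean(): Double = if (totalWeights > 0.0) crossEntropy / totalWeights else 0.0
}
```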

featureSize

open override val featureSize: Int

Number of features expected in x on each update. Mismatched lengths throw.

numClasses

val numClasses: Int

Number of classes. The input y is converted to a class index c = y.toInt(), which must lie in [0, numClasses).

optimizer

val optimizer: OptimizerSpec

Per-class weight-matrix optimizer; one instance is materialised per class.

step

val step: Long

Live view of the per-observation step counter.

totalWeights

val totalWeights: Double

Live view of the cumulative observation weight folded in.

Functions

create

open override fun create(concurrency: Concurrency? = null): SoftmaxRegressionStat

Spawn a fresh accumulator with the same configuration. Optionally override the Concurrency; useful for materialising a wire spec at a different concurrency level than the source.

The returned stat is independent: its state starts at the configured baseline, not at the source's current state. Each modality subtype narrows the return type so chaining doesn't lose the modality.

merge

open override fun merge(values: SoftmaxRegressionResult)

Fold another accumulator's snapshot into this one. The unit of merge is the immutable Result, not a live Stat, which is what lets the merge cross a process boundary. Many workers track slices of the same stream, call read periodically, ship snapshots to a coordinator, and the coordinator merges them in.

Most stat families implement merge exactly (Chan-style parallel formulas for Welford, cell-wise additions for histograms, cell-wise max for HLL). SGD-based regressors merge approximately, because they have no second-moment information for a principled combine. Each stat's KDoc documents its merge semantics.
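
One common way to merge SGD models approximately is to average their parameters, weighted by how much observation weight each side has seen. The sketch below shows that technique only as an illustration; it is not confirmed to be what SoftmaxRegressionStat.merge does, and ModelSnapshot is a hypothetical shape, since the fields of SoftmaxRegressionResult are not listed on this page.

```kotlin
// Hypothetical snapshot shape: one weight row per class, one bias per class.
class ModelSnapshot(
    val weights: List<DoubleArray>,
    val bias: DoubleArray,
    val totalWeights: Double,
)

// Approximate merge: parameter averaging weighted by each side's cumulative observation weight.
fun mergeByWeightedAverage(a: ModelSnapshot, b: ModelSnapshot): ModelSnapshot {
    require(a.weights.size == b.weights.size && a.bias.size == b.bias.size) { "shape mismatch" }
    val total = a.totalWeights + b.totalWeights
    if (total == 0.0) return a // neither side has seen data; nothing to combine
    val wa = a.totalWeights / total
    val wb = b.totalWeights / total
    val weights = a.weights.indices.map { k ->
        DoubleArray(a.weights[k].size) { i -> wa * a.weights[k][i] + wb * b.weights[k][i] }
    }
    val bias = DoubleArray(a.bias.size) { k -> wa * a.bias[k] + wb * b.bias[k] }
    return ModelSnapshot(weights, bias, total)
}
```

Averaging is exact for neither model's loss surface, which is why such a merge is approximate.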

read

open override fun read(timestampNanos: Long = currentTimeNanos()): SoftmaxRegressionResult

Materialise the current state as an immutable Result. Reads never mutate, so the caller can read as often as it likes without affecting the stream.

Snapshot consistency depends on the configured Concurrency. Under Concurrency.Strict / Concurrency.HighWrite a read locks against writers so coupled cells stay consistent. Under Concurrency.Relaxed the cells race, so under heavy contention the snapshot may drift slightly (on the order of ULPs); the drift is bounded and the read never throws.

timestampNanos is the read timestamp. Stats that don't care about time silently drop it; stats that do (rates, decay families, recency, windowed wrappers) use it as the ordering signal.
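
The locked-read behaviour described for Concurrency.Strict can be pictured with a small sketch in which writers and readers share one lock, so a snapshot never sees one coupled cell updated without the other. LockedCounter and CounterSnapshot are hypothetical; the library's actual lock strategy is not shown on this page.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Immutable snapshot of two coupled cells.
class CounterSnapshot(val step: Long, val totalWeights: Double)

// Hypothetical stat with two cells that must always be read as a consistent pair.
class LockedCounter {
    private val lock = ReentrantLock()
    private var step = 0L
    private var totalWeights = 0.0

    // Both cells change under the lock, so no reader observes a half-applied update.
    fun update(weight: Double = 1.0) {
        lock.withLock {
            step += 1
            totalWeights += weight
        }
    }

    // Reading copies the cells under the same lock and never mutates them.
    fun read(): CounterSnapshot = lock.withLock { CounterSnapshot(step, totalWeights) }
}
```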

reset

open override fun reset()

Reset the stat to its prior-seeded baseline. Equivalent to constructing a fresh stat with the same configuration, but in place: the stat keeps the same Concurrency and any per-stat tunables.

update

open fun update(x: VectorView, y: Double, weight: Double = 1.0)

Record an (x, y) observation with the given weight at the current time.

open fun update(x: DoubleArray, y: Double, weight: Double = 1.0)

Convenience overload that wraps x as a DenseVector.

open fun update(x: DoubleArray, y: Double, timestampNanos: Long, weight: Double = 1.0)

Timestamped convenience overload that wraps x as a DenseVector.

open override fun update(x: VectorView, y: Double, timestampNanos: Long, weight: Double = 1.0)

Record an (x, y) observation at timestampNanos with the given weight.
