kumulant

stat.regression.glm

Generalised linear models: the linear-predictor-plus-link family. Every model in this package shares the shape eta = bias + x . weights followed by mu = link.invMean(eta); the models differ in how the weights, and any posterior over them, are maintained.
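The shared shape can be sketched in a few lines. This is a minimal illustration, not the library's code: SketchLink and predict are hypothetical names, and only the inverse mean of each link is modelled.

```kotlin
import kotlin.math.exp

// Hypothetical stand-in for the package's Link; only the inverse mean is modelled.
enum class SketchLink(val invMean: (Double) -> Double) {
    Identity({ eta -> eta }),
    Logit({ eta -> 1.0 / (1.0 + exp(-eta)) }),
    Log({ eta -> exp(eta) })
}

// eta = bias + x . weights, followed by mu = link.invMean(eta).
fun predict(link: SketchLink, bias: Double, weights: DoubleArray, x: DoubleArray): Double {
    require(weights.size == x.size) { "feature size mismatch" }
    var eta = bias
    for (i in x.indices) eta += x[i] * weights[i]
    return link.invMean(eta)
}
```

Swapping the link changes only the last line of predict; the linear predictor is identical across the family.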

The link family

Link is the canonical GLM link. Three variants ship: Link.Identity, Link.Logit, and Link.Log.

All three are canonical links, so the gradient of the negative log-likelihood with respect to the weights simplifies to (mu - y) * x. Non-canonical links are not exposed because that shortcut no longer holds and the per-update cost grows. If you need one, add the link yourself; the curvature method is the only part tightly coupled to the Bayesian variants.
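To make the (mu - y) * x shortcut concrete, here is a sketch of one plain SGD step for a logit-link model. The function name sgdStep and the rate parameter are illustrative assumptions, not part of the package API.

```kotlin
import kotlin.math.exp

// One plain SGD step for a logit-link model using the canonical-link gradient (mu - y) * x.
// Mutates weights in place and returns the updated bias.
fun sgdStep(weights: DoubleArray, bias: Double, x: DoubleArray, y: Double, rate: Double): Double {
    var eta = bias
    for (i in x.indices) eta += x[i] * weights[i]
    val mu = 1.0 / (1.0 + exp(-eta))
    val residual = mu - y                // common factor of every partial derivative
    for (i in x.indices) weights[i] -= rate * residual * x[i]
    return bias - rate * residual        // the bias sees a constant feature of 1
}
```

The same residual drives every coordinate, which is why the per-update cost stays linear in the number of active features.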

Picking a model

Stat: When to reach for it

UnivariateRegressionStat: One scalar feature, one scalar response. OLS / ridge / lasso via Penalty. The cheapest entry; reach for it for "fit a line to a stream of (x, y) points."

StochasticRegressionStat: Online SGD with any OptimizerSpec (Sgd / Adagrad / RMSProp / Adam). Best when you want point estimates only and the per-update cost must stay small.

DiagonalRegressionStat: Factorised Gaussian posterior; per-coefficient precision without the full covariance matrix. The natural choice for high-dimensional features where the quadratic memory of full Bayesian regression is prohibitive.

BayesianRegressionStat: Full Gaussian posterior with covariance matrix and Cholesky factor. Closed-form under Link.Identity; online Laplace approximation under Link.Logit / Link.Log. Reach for it when downstream code needs uncertainty quantification, for example Thompson sampling or LinUCB.

HierarchicalBayesianRegression: Pooled estimation across many parallel regressors. Use it when you have one regressor per arm in a bandit (or per group in any stratified problem) and want them to share strength.

Penalties

Penalty.L1 and Penalty.L2 are the standard regularisers. On StochasticRegressionStat they use the existing lazy-update tricks for sparse efficiency (Bottou multiplicative scaling for L2, cumulative truncated gradient for L1). The constructor checks that a non-None Penalty is paired with a com.eignex.kumulant.schema.Sgd optimizer; pairing penalties with Adam / Adagrad / RMSProp is not supported (folding regularisation into adaptive-method weight decay is delicate and would surface as a different config).
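The multiplicative-scaling idea for L2 can be sketched as follows. This is a generic illustration of the technique under the assumption that weights are stored as scale * v; ScaledWeights and its methods are hypothetical and do not mirror the package's internals.

```kotlin
// Weights stored as scale * v, so an L2 shrink of the whole vector costs O(1).
class ScaledWeights(size: Int) {
    private val v = DoubleArray(size)
    private var scale = 1.0

    operator fun get(i: Int): Double = scale * v[i]

    // L2 step w <- (1 - rate * lambda) * w, applied to the scalar only.
    fun shrink(rate: Double, lambda: Double) {
        require(rate * lambda in 0.0..1.0) { "rate * lambda must lie in [0, 1]" }
        scale *= 1.0 - rate * lambda
        if (scale < 1e-9) renormalise()  // fold the scale back in before it underflows
    }

    // Gradient step on the active coordinates only: w_i <- w_i - rate * g_i.
    fun addSparse(indices: IntArray, gradient: DoubleArray, rate: Double) {
        for (k in indices.indices) v[indices[k]] -= rate * gradient[k] / scale
    }

    private fun renormalise() {
        for (i in v.indices) v[i] *= scale
        scale = 1.0
    }
}
```

Inactive coordinates are never touched during an update, yet they still decay, because the decay lives in the shared scalar.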

UnivariateRegressionStat supports Penalty.L1 / Penalty.L2 directly; the closed-form OLS / lasso / ridge math falls out of the running covariance.
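As a rough sketch of how the three fits can be read off the same running moments, assume the objective is 1/2 * sum of squared residuals plus lambda/2 * slope^2 (ridge) or lambda * |slope| (lasso), with an unpenalised intercept. The library's exact scaling of lambda may differ, and all function names here are hypothetical.

```kotlin
import kotlin.math.abs
import kotlin.math.max
import kotlin.math.sign

// sxx: weighted sum of squared x deviations; sxy: weighted cross-deviation of x and y.
fun olsSlope(sxx: Double, sxy: Double): Double = sxy / sxx

// Ridge adds lambda to the denominator.
fun ridgeSlope(sxx: Double, sxy: Double, lambda: Double): Double = sxy / (sxx + lambda)

// Lasso soft-thresholds the cross-deviation before dividing.
fun lassoSlope(sxx: Double, sxy: Double, lambda: Double): Double =
    sign(sxy) * max(abs(sxy) - lambda, 0.0) / sxx

// The intercept follows from the means once the slope is fixed.
fun interceptFor(meanX: Double, meanY: Double, slope: Double): Double = meanY - slope * meanX
```

Because only the read-time formula changes, the accumulated state is the same for every penalty.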

Posteriors

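LinearPosterior adapters turn a LinearRegressionResult into a scalar score: FactorisedGaussian (for DiagonalRegressionResult), MultivariateGaussian and LinUcb (for CovarianceRegressionResult), and PointPosterior (for StochasticRegressionResult).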

The posteriors live here (not in bandit/) because they're properties of the model, not the bandit. The bandit consumes them.

Learning-rate schedules

ConstantRate, StepDecay, ExponentialDecay, and friends in LearningRates.kt are wire-portable com.eignex.kumulant.schema.ScalarExpr expressions that produce a learning rate from the current step counter. Wrap them in com.eignex.kumulant.schema.Sgd (or any other com.eignex.kumulant.schema.OptimizerSpec) to pass through the wire.
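For intuition, the three schedules documented under Functions can be written as plain functions of the step counter. This sketch uses lower-case hypothetical names and ordinary lambdas; the real factories return wire-portable ScalarExpr values, which this does not model.

```kotlin
import kotlin.math.exp

// A schedule maps the current step counter to a learning rate.
typealias Schedule = (Long) -> Double

// Constant rate eta.
fun constantRate(eta: Double = 0.001): Schedule = { _ -> eta }

// eta / (1 + k * step)
fun stepDecay(eta: Double = 0.01, k: Double = 0.001): Schedule =
    { step -> eta / (1.0 + k * step) }

// eta * exp(-k * step)
fun exponentialDecay(eta: Double = 0.01, k: Double = 1.0e-5): Schedule =
    { step -> eta * exp(-k * step) }
```

The defaults are copied from the documented signatures; everything else is illustrative.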

Merge

UnivariateRegressionStat merges exactly via Chan-style parallel Welford on its running covariance. DiagonalRegressionStat and BayesianRegressionStat merge exactly by combining Gaussian posteriors: per-coordinate precision-weighted for the diagonal, full precision add-and-downdate (with Cholesky recomputation) for the full covariance. StochasticRegressionStat merges approximately; SGD keeps no second-moment information, so the combine is a sample-weighted average of the weight vectors. HierarchicalBayesianRegression does not merge; the manager refits the population prior from its tracked instances instead.
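The Chan-style combine for a running covariance can be sketched as below. CoMoments and merge are hypothetical names for illustration; the fields are the total weight, the two means, and the weighted sums of squared and cross deviations.

```kotlin
// Running co-moments of a stream of (x, y) pairs.
data class CoMoments(
    val n: Double,       // total weight
    val meanX: Double,
    val meanY: Double,
    val sxx: Double,     // weighted sum of squared x deviations
    val sxy: Double      // weighted sum of x/y cross deviations
)

// Exact parallel combine: the result equals what one accumulator would hold
// after seeing both streams.
fun merge(a: CoMoments, b: CoMoments): CoMoments {
    val n = a.n + b.n
    if (n == 0.0) return a
    val dx = b.meanX - a.meanX
    val dy = b.meanY - a.meanY
    val cross = a.n * b.n / n            // weight of the between-shard correction
    return CoMoments(
        n,
        a.meanX + dx * b.n / n,
        a.meanY + dy * b.n / n,
        a.sxx + b.sxx + dx * dx * cross,
        a.sxy + b.sxy + dx * dy * cross
    )
}
```

The correction term accounts for the gap between the two shard means, which is what makes the merge exact rather than a weighted average.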

Concurrency

UnivariateRegressionStat is Welford-coupled: it is locked under Concurrency.Strict / Concurrency.HighWrite and races with bounded drift under Concurrency.Relaxed (all levels are members of com.eignex.kumulant.core.Concurrency). StochasticRegressionStat runs lock-free asynchronous SGD (Hogwild) under Concurrency.Relaxed and serialises the update body under Concurrency.Strict / Concurrency.HighWrite. DiagonalRegressionStat and BayesianRegressionStat serialise the whole update under any concurrent level (the coupled posterior update has no lock-free form); throughput is contention-bound, so shard and merge for higher write rates. HierarchicalBayesianRegression is a manager, not a hot-path stat; each tracked BayesianRegressionStat honours its own level.

Types

class BayesianRegressionStat(val featureSize: Int, val priorVariance: Double = 1.0, val link: Link = Link.Identity, val concurrency: Concurrency = Concurrency.None, priorMean: VectorView? = null, priorCovariance: MatrixView? = null) : RegressionStat<CovarianceRegressionResult>

Bayesian generalised linear regression with a Gaussian prior on the weights and a canonical Link for the response. Produces a full posterior covariance S = H^-1 alongside the point estimates. Suitable for Thompson-sampling-style bandits drawing a fresh weight vector from N(weights, exploration * S) per round.

@Serializable
@SerialName(value = "CovarianceRegressionResult")
data class CovarianceRegressionResult(val weights: DenseVector, val bias: Double, val biasPrecision: Double, val totalWeights: Double, val step: Long, val covariance: DenseMatrix, val covarianceL: DenseMatrix, val link: Link = Link.Identity, val sse: Double = 0.0) : LinearRegressionResult

Full multivariate-Gaussian posterior. Carries the joint covariance and its lower-triangular Cholesky factor L so samplers can draw w ~ N(mean, cov) as mean + L u, u ~ N(0, I) without redoing the decomposition.

@Serializable
@SerialName(value = "DiagonalRegressionResult")
data class DiagonalRegressionResult(val weights: DenseVector, val bias: Double, val biasPrecision: Double, val totalWeights: Double, val step: Long, val precision: DenseVector, val link: Link = Link.Identity, val sse: Double = 0.0) : LinearRegressionResult

Factorised posterior: each coefficient has its own precision (= 1/variance) but coefficients are assumed independent. Cheap to maintain and sample from; ignores correlations between features.

class DiagonalRegressionStat(val featureSize: Int, val priorPrecision: Double = 1.0, val learningRate: ScalarExpr = ConstantRate(1.0), val penalty: Penalty = Penalty.None, val link: Link = Link.Identity, val concurrency: Concurrency = Concurrency.None) : RegressionStat<DiagonalRegressionResult>

Generalised linear regression with a factorised Gaussian posterior - each coefficient gets its own running precision, but cross-coefficient correlations are dropped.

@Serializable
@SerialName(value = "FactorisedGaussian")
data object FactorisedGaussian : LinearPosterior<DiagonalRegressionResult>

Per-coordinate Gaussian: each w_i ~ N(weights[i], exploration / precision[i]). Cheap O(n) draws; ignores cross-feature correlations.
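A per-coordinate draw of this form can be sketched as follows. The function sampleFactorised is a hypothetical name, and java.util.Random is used for nextGaussian, so the sketch assumes a JVM target.

```kotlin
import java.util.Random
import kotlin.math.sqrt

// Draws each w_i independently from N(weights[i], exploration / precision[i]).
fun sampleFactorised(
    weights: DoubleArray,
    precision: DoubleArray,
    exploration: Double,
    rng: Random
): DoubleArray {
    require(weights.size == precision.size) { "size mismatch" }
    return DoubleArray(weights.size) { i ->
        // Standard deviation is the square root of the scaled variance.
        weights[i] + sqrt(exploration / precision[i]) * rng.nextGaussian()
    }
}
```

With exploration set to 0 the draw collapses to the point estimate, which is a convenient way to switch exploration off.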

class HierarchicalBayesianRegression(val featureSize: Int, val link: Link = Link.Identity, val biasPriorVariance: Double = 1.0, val concurrency: Concurrency = Concurrency.None, initialPriorVariance: Double = 1.0, initialPriorMean: DenseVector? = null, initialPriorCovariance: DenseMatrix? = null)

Manager for a population of BayesianRegressionStat instances that share an empirical-Bayes prior. New instances inherit the current populationPrior; periodic refit re-fits the prior from the current per-instance posteriors. Cross-instance transfer happens through that prior; older instances keep accumulating their own state, but freshly created instances borrow from the population's collective experience.

@Serializable
sealed interface LinearPosterior<R : LinearRegressionResult> : RegressionPosterior<R>

Stateless multivariate sampler over a LinearRegressionResult snapshot. Reads the weights (and whichever uncertainty fields the concrete snapshot carries) and returns a fresh weight-vector draw, scaled by exploration so callers can dial the posterior variance per round.

@Serializable
sealed interface LinearRegressionResult : Result, HasLinearModel, HasRegression

Common shape across multivariate-x linear regression snapshots.

@Serializable
sealed interface Link

Canonical GLM link function. Encodes everything that varies between Gaussian linear regression and its GLM siblings, including the inverse mean (invMean) and the curvature used by the Bayesian variants.

@Serializable
@SerialName(value = "LinUcb")
data object LinUcb : LinearPosterior<CovarianceRegressionResult>

LinUCB-style confidence-bound scoring: predict(x) + exploration * sqrt(x^T * Sigma * x). The score is deterministic given the snapshot, with no random draw at evaluate time, so the exploration parameter here plays the role of LinUCB's alpha (confidence-bound width), not the variance scale used by Thompson-style posteriors. sample returns the snapshot's mean weights since UCB has no per-arm randomisation; callers that want sampled weights should pair with MultivariateGaussian instead.
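The score can be sketched directly from the formula. This illustration assumes an identity link, so predict(x) is just bias + x . weights, and represents the covariance as a plain array of rows; linUcbScore is a hypothetical name.

```kotlin
import kotlin.math.max
import kotlin.math.sqrt

// Mean prediction plus exploration times the posterior standard deviation along x.
fun linUcbScore(
    weights: DoubleArray,
    bias: Double,
    covariance: Array<DoubleArray>,
    x: DoubleArray,
    exploration: Double
): Double {
    var mean = bias
    var quad = 0.0                       // accumulates x^T * Sigma * x
    for (i in x.indices) {
        mean += weights[i] * x[i]
        var row = 0.0
        for (j in x.indices) row += covariance[i][j] * x[j]
        quad += x[i] * row
    }
    // Guard against a tiny negative value caused by rounding.
    return mean + exploration * sqrt(max(quad, 0.0))
}
```

Directions in feature space that the model has seen little of carry a larger quadratic form, so arms that point that way receive a larger bonus.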

@Serializable
@SerialName(value = "MultivariateGaussian")
data object MultivariateGaussian : LinearPosterior<CovarianceRegressionResult>

Full multivariate-Gaussian draw w ~ N(weights, exploration * Sigma) via the pre-computed Cholesky factor L carried in the snapshot: w = weights + sqrt(exploration) * L u, with u ~ N(0, I).
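A draw of this kind can be sketched with a lower-triangular factor and a vector of standard normals. The function sampleMvn is a hypothetical name, the factor is passed as a plain array of rows, and java.util.Random is assumed for nextGaussian (JVM only).

```kotlin
import java.util.Random
import kotlin.math.sqrt

// Returns mean + sqrt(exploration) * L u with u ~ N(0, I), where L L^T is the covariance.
fun sampleMvn(
    mean: DoubleArray,
    choleskyL: Array<DoubleArray>,
    exploration: Double,
    rng: Random
): DoubleArray {
    val n = mean.size
    val u = DoubleArray(n) { rng.nextGaussian() }
    val scale = sqrt(exploration)        // scaling the covariance by c scales L by sqrt(c)
    return DoubleArray(n) { i ->
        var acc = 0.0
        for (j in 0..i) acc += choleskyL[i][j] * u[j]   // lower triangle only
        mean[i] + scale * acc
    }
}
```

Reusing the stored factor keeps each draw at one triangular matrix-vector product instead of a fresh decomposition.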

@Serializable
sealed interface Penalty

Regularisation knob shared across regression stats. The mathematical effect is always the same; the mechanics differ per host: a closed-form read()-time projection in UnivariateRegressionStat, lazy multiplicative scaling / truncated-gradient in StochasticRegressionStat, per-coordinate proximal step (L1) or gradient term (L2) in DiagonalRegressionStat. See each stat's KDoc.

@Serializable
@SerialName(value = "PointPosterior")
data object PointPosterior : LinearPosterior<StochasticRegressionResult>

Point estimate plus optional isotropic Gaussian noise. SGD models have no posterior variance to draw from; exploration adds a constant std-dev shake on top of the point estimate.

@Serializable
data class PopulationPrior(val mean: DenseVector, val covariance: DenseMatrix, val instanceCount: Int)

Empirical-Bayes prior fitted across a population of related regression posteriors. Hand mean and covariance straight to BayesianRegressionStat's priorMean / priorCovariance constructor parameters to seed a new instance.

@Serializable
@SerialName(value = "StochasticRegressionResult")
data class StochasticRegressionResult(val weights: DenseVector, val bias: Double, val totalWeights: Double, val step: Long, val link: Link = Link.Identity, val sse: Double = 0.0, val updaterState: List<VectorView> = emptyList()) : LinearRegressionResult

SGD weight estimates with no posterior. Cheap, no uncertainty quantification. sse carries the accumulated per-link loss; for Link.Identity this is the classical SSE, for Link.Logit / Link.Log it is the GLM deviance (negative log-likelihood). HasRegression.mse / HasRegression.rmse / HasRegression.rSquared are only natural under Identity.

class StochasticRegressionStat(val featureSize: Int, val optimizer: OptimizerSpec = Sgd(), val biasOptimizer: OptimizerSpec = optimizer, val penalty: Penalty = Penalty.None, val link: Link = Link.Identity, val concurrency: Concurrency = Concurrency.None) : RegressionStat<StochasticRegressionResult>

Online generalised linear regression by stochastic gradient descent on the canonical Link's negative log-likelihood plus optional Penalty. The cheapest of the multivariate regressors; point estimates only, no posterior, fast updates.

@Serializable
@SerialName(value = "UnivariateRegressionResult")
data class UnivariateRegressionResult(val penalty: Penalty, val totalWeights: Double, val slope: Double, val intercept: Double, val sse: Double, val sxy: Double, val x: VarianceResult, val y: VarianceResult) : Result, HasSlope, HasRegression

Fitted univariate least-squares regression y = slope * x + intercept. Carries the marginal x / y variances and the raw weighted cross-deviation sxy so the result round-trips losslessly under merge regardless of penalty.

class UnivariateRegressionStat(val penalty: Penalty = Penalty.None, val concurrency: Concurrency = Concurrency.None) : PairedStat<UnivariateRegressionResult>

Online univariate linear regression backed by Chan's parallel Welford accumulator on (x, y). A single hot path drives every Penalty: accumulation is identical; the penalty's closed-form projection is applied only at read.
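The "accumulate once, project at read" pattern might look like the sketch below. OnlineLine is a hypothetical class showing a weighted Welford update on (x, y) with a ridge-style read; it is not the library's implementation, and the ridge scaling is an assumption.

```kotlin
// Single hot path: update() never looks at the penalty; only the read does.
class OnlineLine {
    var n = 0.0
        private set
    private var meanX = 0.0
    private var meanY = 0.0
    private var sxx = 0.0
    private var sxy = 0.0

    fun update(x: Double, y: Double, weight: Double = 1.0) {
        n += weight
        val dx = x - meanX               // deviation from the old mean
        meanX += dx * weight / n
        meanY += (y - meanY) * weight / n
        sxx += weight * dx * (x - meanX) // old deviation times new deviation
        sxy += weight * dx * (y - meanY)
    }

    // ridge = 0.0 gives ordinary least squares.
    fun slope(ridge: Double = 0.0): Double = sxy / (sxx + ridge)

    fun intercept(ridge: Double = 0.0): Double = meanY - slope(ridge) * meanX
}
```

Feeding the points (0, 1), (1, 3), (2, 5) and reading with the default arguments recovers the line y = 2x + 1.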

Functions

fun ConstantRate(eta: Double = 0.001): ScalarExpr

Constant rate eta.

fun ExponentialDecay(eta: Double = 0.01, k: Double = 1.0E-5): ScalarExpr

eta * exp(-k * step).

fun StepDecay(eta: Double = 0.01, k: Double = 0.001): ScalarExpr

eta / (1 + k * step).