kumulant

Adagrad

@Serializable
@SerialName(value = "Adagrad")
data class Adagrad(val learningRate: ScalarExpr = ConstantRate(0.01), val epsilon: Double = 1.0E-10) : OptimizerSpec

Adagrad. Per-coordinate adaptive learning rate via accumulated squared gradients: w[i] -= lr * grad[i] / sqrt(sumG2[i] + epsilon).
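As a rough illustration of the update rule above, here is a minimal stand-alone sketch. `AdagradSketch`, `weights`, and `step` are hypothetical names and not part of kumulant. The sketch also assumes the squared gradient is accumulated before the division, which is the usual Adagrad convention but is not stated on this page.

```kotlin
import kotlin.math.sqrt

// Hypothetical stand-alone sketch of the documented formula; not kumulant's implementation.
class AdagradSketch(
    size: Int,
    private val lr: Double = 0.01,
    private val epsilon: Double = 1.0e-10,
) {
    val weights = DoubleArray(size)
    val sumG2 = DoubleArray(size) // running sum of squared gradients, one slot per coordinate

    fun step(grad: DoubleArray) {
        require(grad.size == weights.size) { "gradient size must match weight size" }
        for (i in weights.indices) {
            // Assumption: accumulate first, then divide by the updated accumulator.
            sumG2[i] += grad[i] * grad[i]
            weights[i] -= lr * grad[i] / sqrt(sumG2[i] + epsilon)
        }
    }
}
```

Under these assumptions, a coordinate whose gradient is zero keeps both its weight and its accumulator unchanged, so only features that actually occur in an example are slowed down.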

Reach for Adagrad when feature occurrence is sparse and uneven: power-law-distributed categorical features, rarely seen tokens, or any setting where rare features should take big steps and common features should settle into small ones. Because the accumulated denominator only grows, Adagrad's effective learning rate is monotonically non-increasing, which is the limitation Rmsprop addresses.
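The rare-versus-common behaviour can be made concrete with a small sketch of the per-coordinate factor lr / sqrt(sumG2 + epsilon). `effectiveRate` and `effectiveRateTrace` are hypothetical helpers written for this illustration, and the unit-magnitude gradients in `effectiveRate` are an assumption chosen to keep the arithmetic simple.

```kotlin
import kotlin.math.sqrt

// Effective rate after `updates` gradients of magnitude 1 (illustrative assumption).
fun effectiveRate(updates: Int, lr: Double = 0.01, epsilon: Double = 1.0e-10): Double =
    lr / sqrt(updates.toDouble() + epsilon)

// Effective rate recorded after each gradient in a sequence for a single coordinate.
fun effectiveRateTrace(
    gradients: List<Double>,
    lr: Double = 0.01,
    epsilon: Double = 1.0e-10,
): List<Double> {
    var sumG2 = 0.0
    return gradients.map { g ->
        sumG2 += g * g // the accumulator never shrinks
        lr / sqrt(sumG2 + epsilon)
    }
}
```

With the default base rate of 0.01, a feature seen 4 times gets an effective rate of roughly 0.005, while one seen 10,000 times gets roughly 0.0001. Every trace returned by `effectiveRateTrace` is non-increasing, since each step can only add to `sumG2`.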

Constructors

constructor(learningRate: ScalarExpr = ConstantRate(0.01), epsilon: Double = 1.0E-10)

Properties

val epsilon: Double

Numerical-stability epsilon added under the square root to keep the divisor non-zero.

val learningRate: ScalarExpr

Base learning rate, multiplied by the per-coordinate factor 1 / sqrt(sumG2 + epsilon).

Functions

open override fun materialize(featureSize: Int, concurrency: Concurrency = Concurrency.None): Optimizer

Build a live optimizer instance over featureSize coordinates at the requested Concurrency. Each call returns a fresh optimizer with empty auxiliary state; stats call this once for each weight vector they track (one per output class for com.eignex.kumulant.stat.regression.SoftmaxRegressionStat).
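The split between an immutable spec and the live instances it materializes can be sketched without the library. `SketchSpec`, `SketchOptimizer`, and `perClassOptimizers` below are hypothetical stand-ins, not kumulant types. They ignore Concurrency and mirror only the "fresh state per call" contract described above.

```kotlin
// Hypothetical stand-ins, NOT kumulant types: they mirror only the spec -> live-instance split.
class SketchOptimizer(featureSize: Int) {
    val sumG2 = DoubleArray(featureSize) // auxiliary state starts empty (all zeros)
}

data class SketchSpec(val lr: Double = 0.01, val epsilon: Double = 1.0e-10) {
    // Every call allocates new state; nothing is shared between returned instances.
    fun materialize(featureSize: Int): SketchOptimizer = SketchOptimizer(featureSize)
}

// One optimizer per weight vector, as a softmax-style stat with `classes` outputs might request.
fun perClassOptimizers(spec: SketchSpec, classes: Int, featureSize: Int): List<SketchOptimizer> =
    List(classes) { spec.materialize(featureSize) }
```

In this sketch, mutating the accumulator of one returned optimizer leaves the others untouched, which is the property that lets a single spec drive several independently tracked weight vectors.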
