Optimizer
Online optimizer strategy. Given a per-coordinate raw gradient, returns the delta to add to that weight cell. Owns any per-coordinate auxiliary state (Adam's first/second moments, Adagrad's running squared gradient, etc.).
Lifecycle, called by the host stat once per update:
1. advance: bump per-update counters (Adam's step t).
2. For each touched coordinate, computeDelta: return the weight delta.
3. The host stat applies the delta to its weight cell.
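The lifecycle above can be sketched roughly as follows. This is an illustrative Python model, not the library's implementation: the class bodies, the snake_case names (`compute_delta`, `feature_size`), the hyperparameter defaults, the hypothetical `apply_update` host function, and the sign convention (the delta is the negative scaled gradient, so adding it performs descent) are all assumptions made for the example.

```python
import math


class Sgd:
    """Stateless optimizer sketch: advance is a no-op."""

    def __init__(self, lr):
        self.lr = lr

    def advance(self):
        pass  # no per-update counters to bump

    def compute_delta(self, index, grad):
        # Assumed convention: the returned delta is ADDED to the weight.
        return -self.lr * grad


class Adam:
    """Stateful optimizer sketch: owns per-coordinate moments and a step t."""

    def __init__(self, feature_size, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.t = 0
        self.m = [0.0] * feature_size  # first moments
        self.v = [0.0] * feature_size  # second moments

    def advance(self):
        self.t += 1  # bumped once per update, not once per coordinate

    def compute_delta(self, index, grad):
        self.m[index] = self.beta1 * self.m[index] + (1 - self.beta1) * grad
        self.v[index] = self.beta2 * self.v[index] + (1 - self.beta2) * grad * grad
        m_hat = self.m[index] / (1 - self.beta1 ** self.t)  # bias correction
        v_hat = self.v[index] / (1 - self.beta2 ** self.t)
        return -self.lr * m_hat / (math.sqrt(v_hat) + self.eps)


def apply_update(optimizer, weights, sparse_grads):
    """Hypothetical host-side update: advance once, then one delta per touched cell."""
    optimizer.advance()
    for index, grad in sparse_grads.items():
        weights[index] += optimizer.compute_delta(index, grad)
```

Calling `advance` once before the per-coordinate loop keeps Adam's bias correction consistent across every coordinate touched in the same update, while untouched coordinates cost nothing.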
Stateless optimizers (Sgd) ignore advance.
Concurrency: per-coordinate aux state honours the Concurrency passed at materialization; multi-cell coupled state (Adam) uses Welford locking semantics.
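One way to picture the coupled-state case is sketched below. It assumes that "Welford locking semantics" means the coupled cells are read and written together under a single lock, so no thread ever pairs a first moment from one update with a second moment from another. That reading, the `CoupledMoments` class, and its `update` method are hypothetical and only illustrate the idea in Python.

```python
import threading


class CoupledMoments:
    """Hypothetical holder for two cells that must be updated as a unit."""

    def __init__(self, beta1=0.9, beta2=0.999):
        self.beta1, self.beta2 = beta1, beta2
        self.m = 0.0  # first moment
        self.v = 0.0  # second moment
        self._lock = threading.Lock()

    def update(self, grad):
        # Both cells change inside one critical section (assumed semantics).
        with self._lock:
            self.m = self.beta1 * self.m + (1 - self.beta1) * grad
            self.v = self.beta2 * self.v + (1 - self.beta2) * grad * grad
            return self.m, self.v


def hammer(state, grad, times):
    """Apply the same gradient repeatedly; used to exercise the lock."""
    for _ in range(times):
        state.update(grad)
```

Independent single-cell state (for example, Adagrad's running squared gradient) would not need this pairing guarantee, which is presumably why the text treats it separately.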
Inheritors
advance
computeDelta
featureSize
Number of weight coordinates this optimizer manages.