NormalArm
Gaussian arm with a Normal-Gamma prior (unknown mean and variance). It tracks both the running mean and the sum of squared deviations. That is enough for NormalGammaPosterior to draw (mean, variance) jointly, and it gives the variance-aware policies (Greedy, EpsilonGreedy) a reasonable variance estimate.
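As an illustration of what tracking "the running mean and the sum of squared deviations" can look like, here is a minimal sketch using Welford's online update. RunningStat and its members are hypothetical names chosen for this example; the accumulator the library actually uses is not shown on this page and may differ.

```kotlin
// Hypothetical accumulator for illustration only; not the library's stat type.
class RunningStat {
    var count: Long = 0
        private set
    var mean: Double = 0.0
        private set

    // Sum of squared deviations from the current mean (often called M2).
    var sumSquaredDeviations: Double = 0.0
        private set

    // Welford's update: one pass, no stored samples, numerically stable.
    fun observe(x: Double) {
        count += 1
        val delta = x - mean
        mean += delta / count
        sumSquaredDeviations += delta * (x - mean)
    }

    // Unbiased sample variance; NaN until there are at least two observations.
    fun variance(): Double =
        if (count < 2) Double.NaN else sumSquaredDeviations / (count - 1)
}
```

Feeding the observations 2, 4, 4, 4, 5, 5, 7, 9 through observe works out to a mean of 5.0 and a sum of squared deviations of 32.0, which are the two quantities a joint (mean, variance) draw needs alongside the count.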
The prior is parameterised as a Normal-Gamma:
priorMean is the prior mean reward.
priorWeight is the pseudo-weight of the prior: how many "phantom observations" the prior counts as. Higher = stronger prior, slower to move.
priorSquaredDeviations is the prior sum of squared deviations; tightens the prior on the variance. Higher = stronger belief that variance is large.
The default prior is only mildly informative: with priorWeight = 0.02 it counts as a fiftieth of one observation, so it washes out after a few observations.
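To make the "phantom observations" reading of priorWeight concrete, the sketch below uses the textbook conjugate update for the mean, in which the posterior mean is a weighted average of the prior mean and the sample mean. This is an assumption about how the weight enters, not a statement of the library's internals, and posteriorMean and priorShare are hypothetical helper names.

```kotlin
// Textbook conjugate update for the mean (assumed; not taken from the library source).
// priorWeight acts as a count of phantom observations located at priorMean.
fun posteriorMean(priorMean: Double, priorWeight: Double, sampleMean: Double, n: Int): Double =
    (priorWeight * priorMean + n * sampleMean) / (priorWeight + n)

// Fraction of the posterior mean that still comes from the prior after n observations.
fun priorShare(priorWeight: Double, n: Int): Double =
    priorWeight / (priorWeight + n)
```

Under this assumption, priorShare(0.02, 1) is about 2% and priorShare(0.02, 5) is under 0.4%, which is the sense in which a weight of 0.02 washes out quickly. A weight of 10.0 would still contribute two thirds of the estimate after five observations.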
For multiplicative rewards (revenue, latency, anything where noise is log-normal rather than additive Gaussian), use LogNormalArm instead.
Functions
Map a raw observation onto the scale the stat accumulates. Identity by default; LogNormalArm overrides with ln so the underlying stat tracks the log-reward and the Normal-Gamma posterior fits the log-normal generative model.
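The identity-versus-ln mapping described above might be pictured as follows. This is a sketch only: ArmSketch, LogArmSketch and the method name transform are hypothetical stand-ins, since this page does not show the real function name or signature.

```kotlin
import kotlin.math.ln

// Hypothetical stand-in for an arm whose stat accumulates rewards on their raw scale.
open class ArmSketch {
    // Identity: the observation is accumulated exactly as observed.
    open fun transform(reward: Double): Double = reward
}

// Hypothetical stand-in for the log-normal variant.
class LogArmSketch : ArmSketch() {
    // The stat accumulates ln(reward), so a Normal-Gamma posterior on the
    // accumulated values corresponds to a log-normal model of the raw reward.
    override fun transform(reward: Double): Double {
        require(reward > 0.0) { "log-normal rewards must be positive, got $reward" }
        return ln(reward)
    }
}
```

The positivity check is a choice made for this sketch; how the library treats zero or negative rewards on the log scale is not stated here.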
NormalArm
createStat
Allocate a fresh per-arm accumulator already seeded with this arm's prior pseudo-counts.
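A rough sketch of what "seeded with this arm's prior pseudo-counts" could mean in practice is given below. SeededStat and SeededArmSketch are hypothetical types; only the property names priorMean, priorWeight and priorSquaredDeviations and the function name createStat come from this page, and the real return type is not documented here.

```kotlin
// Hypothetical accumulator; the library's actual stat type is not shown on this page.
data class SeededStat(
    var weight: Double,
    var mean: Double,
    var sumSquaredDeviations: Double
)

// Hypothetical arm holding the three documented prior parameters.
class SeededArmSketch(
    val priorMean: Double,
    val priorWeight: Double,
    val priorSquaredDeviations: Double
) {
    // Returns a new object on every call, so accumulators never share mutable state.
    // Before any real observation, the stat already reports the prior's values.
    fun createStat(): SeededStat =
        SeededStat(priorWeight, priorMean, priorSquaredDeviations)
}
```

Starting the accumulator at the prior rather than at zero means later observations can be folded in with the same update rule, with the prior behaving like earlier data of weight priorWeight.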
priorMean
priorSquaredDeviations
Prior sum of squared deviations; tightens the prior on the variance.
priorWeight
Pseudo-weight of the prior seed.