UCB1Normal
UCB1-Normal (Auer et al. 2002). A variance-aware UCB policy for Gaussian rewards; it uses the sample variance derived from the MomentsResult snapshot to scale the confidence bound. Reach for it when rewards are roughly Gaussian and unbounded, so that UCB1's [0, 1] reward assumption does not hold.
Forces exploration until each arm has at least 8 * ln(K) pulls (where K is the arm count), then switches to the variance-aware score: mean + alpha * sqrt(16 * variance * ln(K - 1) / armSamples).
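The two-phase rule above can be sketched as a standalone function. This is only an illustration under stated assumptions, not the library's implementation: `ucb1NormalScore` and its parameters are hypothetical names, returning positive infinity is just one way to force exploration of an under-sampled arm, and the `armCount >= 2` guard is a choice made for the sketch so that ln(K - 1) stays finite.

```kotlin
import kotlin.math.ln
import kotlin.math.sqrt

// Hypothetical helper mirroring the documented formula; not part of the kumulant API.
fun ucb1NormalScore(
    mean: Double,
    variance: Double,
    armSamples: Long,
    armCount: Int,
    alpha: Double = 1.0,
): Double {
    // Sketch-only guard: ln(K - 1) is not finite for K < 2.
    require(armCount >= 2) { "this sketch needs at least two arms" }

    // Forced-exploration phase: the arm has fewer than 8 * ln(K) pulls.
    // Assumption: an infinite score makes the bandit prefer this arm.
    val minPulls = 8.0 * ln(armCount.toDouble())
    if (armSamples < minPulls) return Double.POSITIVE_INFINITY

    // Variance-aware phase: mean + alpha * sqrt(16 * variance * ln(K - 1) / armSamples).
    val bonus = sqrt(16.0 * variance * ln(armCount - 1.0) / armSamples)
    return mean + alpha * bonus
}
```

With ten arms, an arm that has only 10 pulls is still below the 8 * ln(10) threshold (about 18.4) and receives the infinite score; once it passes that threshold, its score shrinks toward the sample mean as `armSamples` grows.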
Properties
Functions
Allocate a fresh per-arm accumulator from the arm spec. Default delegates to arm.createStat(); override only if the policy needs a non-standard variant.
UCB1Normal
addArm
Hook called when a new arm joins the population. Lets stateful policies fold the new arm's snapshot into their global counters (UCB's total-samples, UCB1Normal's arm count). Default no-op.
alpha
Multiplier applied to the confidence term in the score; larger values favor exploration.
arm
Per-arm cumulator spec; determines the prior pseudo-counts, value encoding, and result shape that evaluate consumes.
evaluate
Score an arm given its current snapshot. Higher scores are preferred by the bandit. step is the global update count (for time-dependent exploration schedules); rng is the bandit's shared com.eignex.kumulant.bandit.Bandit.random (consumed by sampling policies).
removeArm
Hook called when an arm leaves the population. Inverse of addArm; lets stateful policies remove the departing arm's contribution from their global counters. Default no-op.
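The addArm/removeArm pairing can be illustrated with a minimal counter. This is a hedged sketch only: `ArmCountTracker` is a hypothetical stand-in, the real hooks' signatures are not shown on this page, and the parameterless methods here are an assumption made to keep the example self-contained.

```kotlin
// Hypothetical stand-in for a stateful policy's global counter; not the kumulant API.
class ArmCountTracker {
    // Number of arms currently in the population.
    var armCount: Int = 0
        private set

    // Called when an arm joins: fold it into the global counter.
    fun addArm() {
        armCount += 1
    }

    // Called when an arm leaves: undo the contribution made by addArm.
    fun removeArm() {
        check(armCount > 0) { "removeArm called with no arms registered" }
        armCount -= 1
    }
}
```

Keeping the two hooks as exact inverses means the counter always matches the live population, which matters for a policy whose score depends on the arm count.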