Arm
Recipe for the accumulator side of one bandit arm: how to build a freshly seeded SeriesStat for that arm, and how to encode a raw observation before folding it into the stat. Posterior and BanditPolicy implementations pair with arm specs of the same R.
The split keeps each concern in one place:
Sufficient-statistic accumulation: kumulant's SeriesStat families.
Prior pseudo-counts: this spec's createStat seeds the accumulator so that a posterior or UCB formula evaluated before any observation has arrived still returns a well-defined, finite score.
Value transformation: this spec's encode maps a raw observation onto the scale the stat accumulates. Identity for most arms; LogNormalArm overrides with ln so multiplicative rewards are accumulated on a log scale.
Posterior sampling: stateless Posterior; consumes the same snapshot the stat produces.
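The split above can be sketched in a few lines of Kotlin. This is a minimal illustration, not the library's API: TallyStat, TallySnapshot, ArmSketch, BinaryArmSketch, betaMean and observeAll are all hypothetical names, and a plain interface stands in for the sealed, serializable type.

```kotlin
// All names in this sketch are hypothetical stand-ins, not the library's types.

// Snapshot handed from the accumulator to the posterior (plays the role of R).
data class TallySnapshot(val successes: Double, val failures: Double)

// Concern 1: sufficient-statistic accumulation.
class TallyStat(private var successes: Double, private var failures: Double) {
    // value is expected to lie in [0, 1]; 1.0 counts as one full success.
    fun update(value: Double) {
        successes += value
        failures += 1.0 - value
    }

    fun snapshot() = TallySnapshot(successes, failures)
}

// Concerns 2 and 3: prior seeding and value transformation live on the arm spec.
interface ArmSketch {
    fun createStat(): TallyStat
    fun encode(raw: Double): Double = raw // identity by default
}

// A binary-reward arm seeded with a uniform Beta(1, 1) prior (an assumed default).
data class BinaryArmSketch(
    val priorSuccesses: Double = 1.0,
    val priorFailures: Double = 1.0,
) : ArmSketch {
    override fun createStat() = TallyStat(priorSuccesses, priorFailures)
}

// Concern 4: a stateless posterior summary that only reads the snapshot.
fun betaMean(s: TallySnapshot): Double = s.successes / (s.successes + s.failures)

// Glue: seed the stat, encode each raw reward, fold it in, return the snapshot.
fun observeAll(arm: ArmSketch, rewards: List<Double>): TallySnapshot {
    val stat = arm.createStat()
    for (r in rewards) stat.update(arm.encode(r))
    return stat.snapshot()
}
```

In this sketch the posterior holds no state of its own: everything it needs arrives in the snapshot, so swapping the prior only touches the arm spec.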
Sealed + @Serializable so an arm configuration round-trips on the wire alongside the rest of the UnivariateBanditSpec.
Picking an arm
BernoulliArm: binary reward; result is BernoulliSumResult; pairs with BetaPosterior (Thompson) or UCB1.
MeanArm: single-moment likelihoods (Poisson, Geometric, Exponential, GammaScale); result is WeightedMeanResult; pairs with one of the single-moment posteriors (PoissonGammaPosterior, GeometricBetaPosterior, ExponentialGammaPosterior, GammaScalePosterior).
NormalArm: Gaussian reward with unknown mean and variance; result is WeightedVarianceResult; pairs with NormalGammaPosterior (Thompson) or the mean-based policies (Greedy, EpsilonGreedy, EpsilonDecreasing, UniformSelection).
LogNormalArm: multiplicative reward (revenue, latency); result is WeightedVarianceResult on the log scale; pairs with LogNormalGammaPosterior.
MomentsArm: Gaussian with second-moment tracking; result is MomentsResult; pairs with the variance-aware UCB family (UCB1Normal, UCB1Tuned, UcbV).
Functions
createStat
Allocate a fresh per-arm accumulator already seeded with this arm's prior pseudo-counts.
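Why seeding matters can be shown with a small sketch. It assumes a hypothetical mean accumulator (MeanStatSketch), a hypothetical factory (createSeededStat) that plays the role of createStat, and the textbook UCB1 index; none of these are the library's own definitions, and counting pseudo-observations in the denominator is an assumption of this sketch.

```kotlin
import kotlin.math.ln
import kotlin.math.sqrt

// Hypothetical mean accumulator; not the library's SeriesStat.
class MeanStatSketch(private var weight: Double, private var sum: Double) {
    fun update(value: Double) {
        weight += 1.0
        sum += value
    }

    val count: Double get() = weight
    val mean: Double get() = sum / weight
}

// Hypothetical counterpart of createStat: the prior contributes priorWeight
// pseudo-observations whose average is priorMean.
fun createSeededStat(priorWeight: Double, priorMean: Double): MeanStatSketch {
    require(priorWeight > 0.0) { "priorWeight must be positive" }
    return MeanStatSketch(priorWeight, priorWeight * priorMean)
}

// Textbook UCB1 index: mean + sqrt(2 ln(t) / n), with n including pseudo-counts.
fun ucb1(stat: MeanStatSketch, totalPulls: Double): Double =
    stat.mean + sqrt(2.0 * ln(totalPulls) / stat.count)
```

With a seed of one pseudo-observation the index is finite before any real reward arrives, whereas an accumulator started at zero weight evaluates 0/0 and yields NaN.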
encode
Map a raw observation onto the scale the stat accumulates. Identity by default; LogNormalArm overrides with ln so the underlying stat tracks the log-reward and the Normal-Gamma posterior fits the log-normal generative model.
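The effect of the ln override can be illustrated with a Welford-style mean and variance accumulator. VarianceStatSketch, identityEncode, logEncode and accumulate are hypothetical names for this sketch only, and the positivity check in logEncode is an assumption rather than documented behaviour.

```kotlin
import kotlin.math.ln

// Hypothetical Welford accumulator for a running mean and sample variance.
class VarianceStatSketch {
    var n = 0.0
        private set
    var mean = 0.0
        private set
    private var m2 = 0.0

    fun update(x: Double) {
        n += 1.0
        val delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    }

    // Sample variance; undefined (NaN) until two values have been folded in.
    val variance: Double get() = if (n > 1.0) m2 / (n - 1.0) else Double.NaN
}

// Default encoding: the stat sees the raw reward.
fun identityEncode(raw: Double): Double = raw

// Log encoding: the stat sees ln(reward). Rejecting non-positive rewards is
// an assumption made for this sketch.
fun logEncode(raw: Double): Double {
    require(raw > 0.0) { "log-scale rewards must be strictly positive" }
    return ln(raw)
}

// Encode each raw reward, then fold it into a fresh accumulator.
fun accumulate(rewards: List<Double>, encode: (Double) -> Double): VarianceStatSketch {
    val stat = VarianceStatSketch()
    for (r in rewards) stat.update(encode(r))
    return stat
}
```

For the rewards 1, 10 and 100, the identity encoding gives a mean of 37, dominated by the largest value, while the log encoding gives a mean of ln 10, which corresponds to the geometric mean of 10.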