kumulant

Scorable

interface Scorable

Opt-in per-arm scoring for inspection, debugging, and custom selectors. Bandits whose UnivariateBandit.choose is an argmax over independent per-arm scores expose this interface: UCB1, Thompson, epsilon-greedy, etc.

Joint-sampling bandits don't implement Scorable: their selection rule doesn't decompose into a per-arm score. Boltzmann samples from a softmax over arms (the selection probability of any one arm depends on every other arm's score), Top-Two Thompson samples twice and picks conditionally, and Exp3 samples from a weight distribution. For those, use the bandit's Snapshotable.snapshot to read the underlying state directly.
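To illustrate why a softmax rule does not reduce to independent per-arm scores, here is a minimal, self-contained sketch. The softmax helper is hypothetical and written only for this example; it is not kumulant's Boltzmann implementation. It shows that arm 0's selection probability changes when only arm 1's preference changes.

```kotlin
import kotlin.math.exp

// Hypothetical helper (not part of kumulant): softmax selection
// probabilities for the given per-arm preferences.
fun softmax(preferences: DoubleArray): DoubleArray {
    // Subtract the maximum before exponentiating for numerical stability.
    val max = preferences.maxOrNull() ?: return DoubleArray(0)
    val weights = DoubleArray(preferences.size) { exp(preferences[it] - max) }
    val total = weights.sum()
    return DoubleArray(weights.size) { weights[it] / total }
}

fun main() {
    // Arm 0's preference is 1.0 in both cases; only arm 1 changes.
    val before = softmax(doubleArrayOf(1.0, 1.0))
    val after = softmax(doubleArrayOf(1.0, 3.0))
    println("arm 0 probability before: ${before[0]}")
    println("arm 0 probability after:  ${after[0]}")
}
```

Because the normalizing sum runs over every arm, no function of a single arm index alone can reproduce these probabilities, which is consistent with such bandits not exposing evaluate.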

Functions

abstract fun evaluate(armIndex: Int): Double

Score the arm at armIndex under the bandit's current state. The value's interpretation is policy-specific (a UCB upper bound, a Thompson draw, a mean estimate, etc.); it is the quantity the bandit's choose would compare against the other arms' scores.
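As a rough sketch of the inspection and custom-selector use cases, the snippet below declares a local stand-in for Scorable that mirrors the documented evaluate signature, so the example runs on its own. MeanScorer and argmaxArm are hypothetical names introduced for illustration. The arm count is passed explicitly because this page does not document an arm-count accessor.

```kotlin
// Local stand-in mirroring the documented signature; in real code,
// use the Scorable interface from kumulant instead.
interface Scorable {
    fun evaluate(armIndex: Int): Double
}

// Hypothetical scorer for illustration: each arm's score is a fixed mean estimate.
class MeanScorer(private val means: DoubleArray) : Scorable {
    override fun evaluate(armIndex: Int): Double = means[armIndex]
}

// Hypothetical custom selector: index of the highest-scoring arm.
// Ties go to the lowest index. Each arm is scored exactly once.
fun argmaxArm(scorable: Scorable, armCount: Int): Int {
    require(armCount > 0) { "armCount must be positive" }
    var best = 0
    var bestScore = scorable.evaluate(0)
    for (arm in 1 until armCount) {
        val score = scorable.evaluate(arm)
        if (score > bestScore) {
            best = arm
            bestScore = score
        }
    }
    return best
}

fun main() {
    val scorer = MeanScorer(doubleArrayOf(0.2, 0.7, 0.5))
    // Inspection: dump every arm's score.
    for (arm in 0 until 3) {
        println("arm $arm score=${scorer.evaluate(arm)}")
    }
    // Custom selection: pick the arm with the highest score.
    println("selected arm: ${argmaxArm(scorer, 3)}")
}
```

The helper calls evaluate once per arm and reuses the result. For a policy whose score is a random draw (a Thompson draw, for instance), repeated calls for the same arm may well return different values, so comparing scores from a single pass is the safer assumption.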
