KnnContextualBandit

class KnnContextualBandit(val nbrArms: Int, val k: Int = 5, val maxHistoryPerArm: Int = 1024, val coldStartScore: Double = 1.0, val exploration: Double = 1.0, val distance: (VectorView, VectorView) -> Double = ::squaredL2, val random: Random = Random.Default) : ContextualBandit, PerArmBandit<KnnArmResult>, ContextualScorable

Non-parametric contextual bandit: each arm keeps a bounded FIFO history of past (context, reward, weight) observations and is scored at choose time by the empirical mean reward over the k nearest historical contexts, plus an optional UCB-style bonus that decays with the arm's cumulative weight.

  • Distance: defaults to squared L2 between dense vectors; supply distance for custom metrics (Mahalanobis, cosine, kernelised).

  • History cap: each arm's reservoir is bounded to maxHistoryPerArm; when full, new observations overwrite the oldest.

  • Cold start: arms with fewer than k observations score coldStartScore + ucbBonus, so they are explored before more populous arms.
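The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `Observation`, `knnScore`, and the use of `DoubleArray` in place of `VectorView` are hypothetical stand-ins, the exploration bonus is passed in precomputed, and the sketch assumes the mean is weighted by each observation's weight, which the description does not state. It also sorts the whole history for brevity rather than keeping a bounded heap.

```kotlin
// Hypothetical stand-in for one stored observation; the library's own types differ.
data class Observation(val context: DoubleArray, val reward: Double, val weight: Double)

// Squared Euclidean distance between two equal-length dense vectors.
fun squaredL2(a: DoubleArray, b: DoubleArray): Double {
    require(a.size == b.size) { "dimension mismatch" }
    var sum = 0.0
    for (i in a.indices) {
        val d = a[i] - b[i]
        sum += d * d
    }
    return sum
}

// Score one arm: weighted mean reward of the k nearest contexts plus a bonus.
// Assumptions: weights are positive, and arms with fewer than k observations
// fall back to coldStartScore as the cold-start bullet describes.
fun knnScore(
    history: List<Observation>,
    x: DoubleArray,
    k: Int,
    coldStartScore: Double,
    bonus: Double,
): Double {
    if (history.size < k) return coldStartScore + bonus
    val nearest = history.sortedBy { squaredL2(it.context, x) }.take(k)
    val totalWeight = nearest.sumOf { it.weight }
    val mean = nearest.sumOf { it.reward * it.weight } / totalWeight
    return mean + bonus
}
```

With three one-dimensional observations at 0.0, 1.0, and 10.0 and k = 2, a query at 0.2 averages only the rewards of the first two, so a distant outlier does not affect the score.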

Per-arm state is a history rather than a sufficient statistic, so the PerArmBandit snapshot is a KnnArmResult (the bounded reservoir itself) rather than a scalar summary.

Use cases: contextual problems where reward is a smooth function of context but the functional form is unknown; small-to-medium feature dimensions where exact k-NN is affordable; settings where interpretable "similar past contexts" reasoning is valuable.

Arms: contextual with caller-defined feature dimension; nbrArms fixed at construction. Per-arm reservoir is bounded by maxHistoryPerArm.

Memory: O(nbrArms · maxHistoryPerArm · featureSize); bounded per-arm history of context copies plus parallel reward/weight arrays.
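To make the memory bound concrete, the following sketch estimates the raw payload of a full history. The function name `historyBytes` is hypothetical, and the estimate assumes 8-byte doubles while ignoring object headers, list overhead, and boxing, so real usage will be higher.

```kotlin
// Rough lower bound on history payload in bytes.
// Assumes each observation stores featureSize context doubles plus one reward
// and one weight, at 8 bytes per double.
fun historyBytes(nbrArms: Int, maxHistoryPerArm: Int, featureSize: Int): Long {
    val doublesPerObservation = featureSize.toLong() + 2L // context + reward + weight
    return nbrArms.toLong() * maxHistoryPerArm.toLong() * doublesPerObservation * 8L
}
```

For example, 10 arms with the default cap of 1024 and 16 features give 10 · 1024 · 18 · 8 = 1,474,560 bytes, roughly 1.4 MiB.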

Choose: O(nbrArms · maxHistoryPerArm · (featureSize + k)); linear scan over each arm's history with a bounded top-k heap.
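The linear scan with a bounded top-k heap might look like the sketch below. It is an illustration under assumptions, not the library's code: `kNearestIndices` is a hypothetical helper, distances are taken as already computed, and `java.util.PriorityQueue` is used, which ties the sketch to the JVM.

```kotlin
import java.util.PriorityQueue

// Return the indices of the k smallest distances, nearest first, using one
// linear scan and a max-heap that never holds more than k entries.
fun kNearestIndices(distances: DoubleArray, k: Int): List<Int> {
    require(k > 0) { "k must be positive" }
    // Max-heap on distance: the head is the worst candidate kept so far.
    val heap = PriorityQueue<Int>(k, compareByDescending<Int> { distances[it] })
    for (i in distances.indices) {
        if (heap.size < k) {
            heap.add(i)
        } else if (distances[i] < distances[heap.peek()]) {
            heap.poll() // evict the current worst candidate
            heap.add(i)
        }
    }
    return heap.sortedBy { distances[it] }
}
```

Because the heap is capped at k entries, the scan never materialises a sorted copy of the whole history.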

Update: O(featureSize); append context copy and roll the oldest entry off when capped.
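One way to get an append that copies the context and rolls off the oldest entry is a ring buffer over parallel arrays. The class below is a hypothetical sketch: the library is described as using mutable lists, so `BoundedHistory` shows the general technique rather than its actual storage layout.

```kotlin
// Hypothetical bounded FIFO history: keeps at most `capacity` entries and
// overwrites the oldest one once full.
class BoundedHistory(private val capacity: Int, private val featureSize: Int) {
    init {
        require(capacity > 0) { "capacity must be positive" }
    }

    private val contexts = Array(capacity) { DoubleArray(featureSize) }
    private val rewards = DoubleArray(capacity)
    private val weights = DoubleArray(capacity)
    private var next = 0 // slot that the next append writes to
    var size = 0
        private set

    fun append(x: DoubleArray, reward: Double, weight: Double = 1.0) {
        require(x.size == featureSize) { "dimension mismatch" }
        x.copyInto(contexts[next]) // defensive copy of the context
        rewards[next] = reward
        weights[next] = weight
        next = (next + 1) % capacity
        if (size < capacity) size++
    }

    // Rewards ordered from oldest to newest.
    fun rewardsOldestFirst(): List<Double> {
        val start = if (size < capacity) 0 else next
        return List(size) { rewards[(start + it) % capacity] }
    }
}
```

Once the buffer is full, each append overwrites the slot holding the oldest observation, so no elements need to be shifted.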

Randomness: random is held for API uniformity but currently unused; choose is deterministic, breaking ties by lowest arm index.

Concurrency: not thread-safe; the per-arm mutable history lists, the total-weight array, and the step counter are mutated without synchronisation. Serialise choose and update externally for multithreaded use.
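External serialisation can be as simple as routing every call through one lock. The sketch below assumes a JVM target; `ChooseUpdate` is a hypothetical interface standing in for the bandit's choose/update pair, and `DoubleArray` stands in for `VectorView`.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Hypothetical minimal surface; stands in for the bandit's choose/update pair.
interface ChooseUpdate {
    fun choose(x: DoubleArray): Int
    fun update(armIndex: Int, x: DoubleArray, reward: Double, weight: Double = 1.0)
}

// Serialises every call through one lock so choose and update never interleave.
class LockedBandit(private val delegate: ChooseUpdate) : ChooseUpdate {
    private val lock = ReentrantLock()

    override fun choose(x: DoubleArray): Int = lock.withLock { delegate.choose(x) }

    override fun update(armIndex: Int, x: DoubleArray, reward: Double, weight: Double) =
        lock.withLock { delegate.update(armIndex, x, reward, weight) }
}
```

A single lock is the simplest choice; it trades throughput for the guarantee that a scan in choose never observes a half-applied update.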

Constructors

constructor(nbrArms: Int, k: Int = 5, maxHistoryPerArm: Int = 1024, coldStartScore: Double = 1.0, exploration: Double = 1.0, distance: (VectorView, VectorView) -> Double = ::squaredL2, random: Random = Random.Default)

Types

object Companion

Distance-function helpers.

Properties

val coldStartScore: Double

Optimistic value assigned to arms with no history yet; drives initial exploration.

val distance: (VectorView, VectorView) -> Double

Pairwise distance between context vectors; defaults to squared L2.
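As an illustration of a custom metric, the sketch below computes cosine distance on plain arrays. `cosineDistance` is a hypothetical helper; to pass it as the distance parameter it would need to be adapted to `VectorView`, whose accessors are not shown in this page.

```kotlin
import kotlin.math.sqrt

// Cosine distance 1 - cos(a, b); assumes neither vector is all zeros.
fun cosineDistance(a: DoubleArray, b: DoubleArray): Double {
    require(a.size == b.size) { "dimension mismatch" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return 1.0 - dot / (sqrt(normA) * sqrt(normB))
}
```

Orthogonal vectors come out at distance 1 and parallel vectors at distance 0, regardless of their lengths.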

val exploration: Double

UCB-style exploration scale on sqrt(ln(totalSteps) / armWeight); 0.0 disables.
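The bonus term can be written out directly from the formula above. This is a sketch under assumptions: `ucbBonus` is a hypothetical name, and how the library treats an arm with zero cumulative weight or a step count below one is not stated here, so the sketch simply rejects those inputs.

```kotlin
import kotlin.math.ln
import kotlin.math.sqrt

// exploration * sqrt(ln(totalSteps) / armWeight); 0.0 disables the bonus.
// Assumption: inputs with totalSteps < 1 or armWeight <= 0 are rejected,
// because the source does not say how those cases are handled.
fun ucbBonus(exploration: Double, totalSteps: Long, armWeight: Double): Double {
    if (exploration == 0.0) return 0.0
    require(totalSteps >= 1) { "totalSteps must be at least 1" }
    require(armWeight > 0.0) { "armWeight must be positive" }
    return exploration * sqrt(ln(totalSteps.toDouble()) / armWeight)
}
```

Quadrupling an arm's cumulative weight halves its bonus, which is the decay the class description refers to.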

val k: Int

Neighbourhood size used for scoring; capped per-arm by the available history.

val maxHistoryPerArm: Int

Maximum observations retained per arm; older entries roll off via FIFO.

open override val nbrArms: Int

Number of arms.

open override val random: Random

Single source of randomness, reserved for tie-breaking; currently unused because ties are broken deterministically by lowest arm index.

Functions

open fun armResult(armIndex: Int): KnnArmResult

Per-arm snapshot at armIndex. Default implementation reads from the full snapshot; implementations may override to avoid building the entire list when only one arm is needed.

fun armWeight(armIndex: Int): Double

Cumulative observation weight folded into arm armIndex.

open override fun choose(x: VectorView): Int

Argmax over per-arm evaluate scores. Ties broken by lowest index.
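A deterministic argmax with lowest-index tie-breaking needs only a strict comparison. The helper below is a hypothetical sketch operating on precomputed scores rather than calling evaluate.

```kotlin
// Index of the largest score; on ties the earliest index wins.
fun argmaxLowestIndex(scores: DoubleArray): Int {
    require(scores.isNotEmpty()) { "at least one arm is required" }
    var best = 0
    for (i in 1 until scores.size) {
        if (scores[i] > scores[best]) best = i // strict > keeps the earliest maximum
    }
    return best
}
```

Using `>=` instead of `>` would flip the tie-break to the highest index.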

open override fun create(random: Random): KnnContextualBandit

Spawn a fresh bandit with the same configuration; history resets to empty.

open override fun evaluate(armIndex: Int, x: VectorView): Double

Score arm armIndex at context x: k-NN mean reward + UCB bonus.

fun historySize(armIndex: Int): Int

Current number of observations held in the history of arm armIndex.

open override fun merge(other: List<KnnArmResult>)

Fold the state of another replica, other, into this bandit. Most families merge exactly via the underlying stat's parallel-merge formula; SGD-based contextual bandits merge approximately. Each concrete bandit's KDoc documents its merge semantics.

open override fun reset()

Clear every arm's history.

open override fun snapshot(): List<KnnArmResult>

Materialise the current state as a serialisable snapshot. Reads are non-mutating; call as often as needed without affecting decisions. Same snapshot consistency rules as com.eignex.kumulant.core.Stat.read; under com.eignex.kumulant.core.Concurrency.Relaxed coupled cells may drift by ULPs.

open override fun update(armIndex: Int, x: VectorView, reward: Double, weight: Double = 1.0)

Append (x, reward, weight) to arm armIndex's history; oldest entry drops if full.

KnnContextualBandit

constructor(nbrArms: Int, k: Int = 5, maxHistoryPerArm: Int = 1024, coldStartScore: Double = 1.0, exploration: Double = 1.0, distance: (VectorView, VectorView) -> Double = ::squaredL2, random: Random = Random.Default)

armWeight

fun armWeight(armIndex: Int): Double

Cumulative observation weight folded into arm armIndex.

choose

open override fun choose(x: VectorView): Int

Argmax over per-arm evaluate scores. Ties broken by lowest index.

coldStartScore

val coldStartScore: Double

Optimistic value assigned to arms with no history yet; drives initial exploration.

create

open override fun create(random: Random): KnnContextualBandit

Spawn a fresh bandit with the same configuration; history resets to empty.

distance

val distance: (VectorView, VectorView) -> Double

Pairwise distance between context vectors; defaults to squared L2.

evaluate

open override fun evaluate(armIndex: Int, x: VectorView): Double

Score arm armIndex at context x: k-NN mean reward + UCB bonus.

exploration

val exploration: Double

UCB-style exploration scale on sqrt(ln(totalSteps) / armWeight); 0.0 disables.

historySize

fun historySize(armIndex: Int): Int

Current number of observations held in the history of arm armIndex.

k

val k: Int

Neighbourhood size used for scoring; capped per-arm by the available history.

maxHistoryPerArm

val maxHistoryPerArm: Int

Maximum observations retained per arm; older entries roll off via FIFO.

merge

open override fun merge(other: List<KnnArmResult>)

Fold the state of another replica, other, into this bandit. Most families merge exactly via the underlying stat's parallel-merge formula; SGD-based contextual bandits merge approximately. Each concrete bandit's KDoc documents its merge semantics.

nbrArms

open override val nbrArms: Int

Number of arms.

random

open override val random: Random

Single source of randomness, reserved for tie-breaking; currently unused because ties are broken deterministically by lowest arm index.

reset

open override fun reset()

Clear every arm's history.

snapshot

open override fun snapshot(): List<KnnArmResult>

Materialise the current state as a serialisable snapshot. Reads are non-mutating; call as often as needed without affecting decisions. Same snapshot consistency rules as com.eignex.kumulant.core.Stat.read; under com.eignex.kumulant.core.Concurrency.Relaxed coupled cells may drift by ULPs.

update

open override fun update(armIndex: Int, x: VectorView, reward: Double, weight: Double = 1.0)

Append (x, reward, weight) to arm armIndex's history; oldest entry drops if full.