com.eignex.kumulant/bandit/contextual/KnnContextualBandit

KnnContextualBandit

class KnnContextualBandit(val nbrArms: Int, val k: Int = 5, val maxHistoryPerArm: Int = 1024, val coldStartScore: Double = 1.0, val exploration: Double = 1.0, val distance: (VectorView, VectorView) -> Double = ::squaredL2, val random: Random = Random.Default) : ContextualBandit, PerArmBandit<KnnArmResult>, ContextualScorable

Non-parametric contextual bandit: each arm keeps a bounded FIFO history of past (context, reward, weight) observations and is scored at choose time by the empirical mean reward over the k nearest historical contexts, plus an optional UCB-style bonus that decays with the arm's cumulative weight.

Distance: defaults to squared L2 between dense vectors; supply distance for custom metrics (Mahalanobis, cosine, kernelised).
History cap: each arm's reservoir is bounded to maxHistoryPerArm; when full, new observations overwrite the oldest.
Cold start: arms with fewer than k observations score coldStartScore + ucbBonus, so they are explored before more populous arms.
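The scoring rule described above can be sketched as follows. This is a minimal illustration, not the library's code: `Observation`, `squaredL2Sketch`, and `scoreArm` are hypothetical names, contexts are plain `DoubleArray` values rather than `VectorView`, the mean is unweighted (how stored weights enter the mean is not stated on this page), and the exploration bonus is passed in by the caller.

```kotlin
// Hypothetical stand-in for one stored observation; the library's actual
// history layout is not shown on this page.
class Observation(val context: DoubleArray, val reward: Double)

// Squared Euclidean distance between two equal-length dense vectors.
fun squaredL2Sketch(a: DoubleArray, b: DoubleArray): Double {
    require(a.size == b.size) { "dimension mismatch" }
    var sum = 0.0
    for (i in a.indices) {
        val d = a[i] - b[i]
        sum += d * d
    }
    return sum
}

// Score one arm: the cold-start value when the history holds fewer than k
// observations, otherwise the unweighted mean reward of the k nearest
// contexts (an assumption). The bonus is added in both cases.
fun scoreArm(
    history: List<Observation>,
    x: DoubleArray,
    k: Int,
    coldStartScore: Double,
    bonus: Double
): Double {
    if (history.size < k) return coldStartScore + bonus
    val nearest = history.sortedBy { squaredL2Sketch(it.context, x) }.take(k)
    return nearest.map { it.reward }.average() + bonus
}
```

Sorting the whole history keeps the sketch short; it costs O(n log n) per arm, which a bounded heap avoids.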

Per-arm state is a history rather than a sufficient statistic, so the PerArmBandit snapshot is a KnnArmResult (the bounded reservoir itself) rather than a scalar summary.

Use cases: contextual problems where reward is a smooth function of context but the functional form is unknown; small-to-medium feature dimensions where exact k-NN is affordable; settings where interpretable "similar past contexts" reasoning is valuable.

Arms: contextual with caller-defined feature dimension; nbrArms fixed at construction. Per-arm reservoir is bounded by maxHistoryPerArm.

Memory: O(nbrArms · maxHistoryPerArm · featureSize); bounded per-arm history of context copies plus parallel reward/weight arrays.

Choose: O(nbrArms · maxHistoryPerArm · (featureSize + k)); linear scan over each arm's history with a bounded top-k heap.
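A bounded top-k heap of the kind mentioned above might look like the sketch below. It assumes the distances to one arm's history have already been computed into a `DoubleArray`; `kNearestIndices` is a hypothetical helper, not part of the library.

```kotlin
import java.util.PriorityQueue

// Return the indices of the k smallest distances, nearest first.
// A max-heap keyed on distance never holds more than k entries, so each of
// the n history entries costs at most O(log k).
fun kNearestIndices(distances: DoubleArray, k: Int): List<Int> {
    require(k > 0) { "k must be positive" }
    val heap = PriorityQueue<Int>(k, compareByDescending<Int> { distances[it] })
    for (i in distances.indices) {
        if (heap.size < k) {
            heap.add(i)
        } else if (distances[i] < distances[heap.peek()]) {
            heap.poll() // evict the farthest of the current best k
            heap.add(i)
        }
    }
    return heap.sortedBy { distances[it] }
}
```

When the history holds fewer than k entries, the function simply returns every index, which matches the idea of capping the neighbourhood by the available history.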

Update: O(featureSize); append context copy and roll the oldest entry off when capped.
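As an illustration of the append-and-roll-off behaviour, here is a minimal sketch of a bounded FIFO history for a single arm. `BoundedHistory` is a hypothetical class built on `kotlin.collections.ArrayDeque`; the library's internal storage may be organised differently.

```kotlin
// Hypothetical bounded history for one arm. An append costs O(featureSize)
// for the context copy; dropping the oldest entry is O(1).
class BoundedHistory(private val capacity: Int) {
    init {
        require(capacity > 0) { "capacity must be positive" }
    }

    private val contexts = ArrayDeque<DoubleArray>()
    private val rewards = ArrayDeque<Double>()
    private val weights = ArrayDeque<Double>()

    val size: Int get() = rewards.size

    fun add(x: DoubleArray, reward: Double, weight: Double = 1.0) {
        if (rewards.size == capacity) {
            // History is full: roll the oldest observation off.
            contexts.removeFirst()
            rewards.removeFirst()
            weights.removeFirst()
        }
        contexts.addLast(x.copyOf()) // copy so later caller mutation is harmless
        rewards.addLast(reward)
        weights.addLast(weight)
    }

    // Rewards in insertion order, oldest first.
    fun rewardsOldestFirst(): List<Double> = rewards.toList()
}
```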

Randomness: random is held for API uniformity but currently unused; choose is deterministic, breaking ties by lowest arm index.
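Deterministic tie-breaking by lowest arm index can be expressed as a strict-comparison argmax. The helper below is a hypothetical sketch of that rule, not the library's selection code.

```kotlin
// Index of the highest score; on ties the lowest index wins because only a
// strictly greater score replaces the current best.
fun argmaxLowestIndex(scores: DoubleArray): Int {
    require(scores.isNotEmpty()) { "scores must not be empty" }
    var best = 0
    for (i in 1 until scores.size) {
        if (scores[i] > scores[best]) best = i
    }
    return best
}
```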

Concurrency: not thread-safe; the per-arm mutable history lists, the total-weight array, and the step counter are mutated without synchronisation. Serialise choose and update externally for multi-threaded use.
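One way to serialise calls externally is a thin locking wrapper. The sketch below is written against a hypothetical `ChooseUpdate` interface so that it stays self-contained; the `choose` signature in particular is an assumption, since this page does not show it, and contexts are plain `DoubleArray` values rather than `VectorView`.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Hypothetical minimal surface; the real choose/update signatures may differ.
interface ChooseUpdate {
    fun choose(x: DoubleArray): Int
    fun update(armIndex: Int, x: DoubleArray, reward: Double, weight: Double = 1.0)
}

// Routes every call through one lock so choose and update never interleave.
class LockedBandit(private val delegate: ChooseUpdate) : ChooseUpdate {
    private val lock = ReentrantLock()

    override fun choose(x: DoubleArray): Int =
        lock.withLock { delegate.choose(x) }

    override fun update(armIndex: Int, x: DoubleArray, reward: Double, weight: Double) =
        lock.withLock { delegate.update(armIndex, x, reward, weight) }
}
```

A single lock is the simplest choice; because choose scans every arm's history, finer-grained per-arm locking would need more care than this sketch attempts.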

Constructors

KnnContextualBandit

constructor(nbrArms: Int, k: Int = 5, maxHistoryPerArm: Int = 1024, coldStartScore: Double = 1.0, exploration: Double = 1.0, distance: (VectorView, VectorView) -> Double = ::squaredL2, random: Random = Random.Default)

Types

Companion

object Companion

Distance-function helpers.

Properties

coldStartScore

val coldStartScore: Double

Optimistic value assigned to arms with no history yet; drives initial exploration.

distance

val distance: (VectorView, VectorView) -> Double

Pairwise distance between context vectors; defaults to squared L2.
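As an example of a custom metric, a cosine distance could be written as below. The sketch operates on `DoubleArray` because the `VectorView` accessor API is not shown on this page; adapting it to the `(VectorView, VectorView) -> Double` shape is left as an assumption. `cosineDistance` is a hypothetical name, and treating a zero vector as orthogonal to everything is a convention chosen here.

```kotlin
import kotlin.math.sqrt

// Cosine distance: 1 - cos(angle between a and b), in the range [0, 2].
fun cosineDistance(a: DoubleArray, b: DoubleArray): Double {
    require(a.size == b.size) { "dimension mismatch" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    // Convention for this sketch: a zero vector is treated as orthogonal.
    if (normA == 0.0 || normB == 0.0) return 1.0
    return 1.0 - dot / (sqrt(normA) * sqrt(normB))
}
```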

exploration

val exploration: Double

UCB-style exploration scale on sqrt(ln(totalSteps) / armWeight); 0.0 disables.
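The bonus term can be sketched directly from the formula above. `ucbBonus` is a hypothetical helper; the guard on `totalSteps` and the requirement that `armWeight` be positive are assumptions, since the page does not say how the library handles those edge cases.

```kotlin
import kotlin.math.ln
import kotlin.math.sqrt

// exploration * sqrt(ln(totalSteps) / armWeight); shrinks as armWeight grows.
fun ucbBonus(exploration: Double, totalSteps: Long, armWeight: Double): Double {
    if (exploration == 0.0) return 0.0 // 0.0 disables the bonus
    // Assumption: zero-weight arms are handled elsewhere (for example by the
    // cold-start path), so this sketch rejects them.
    require(armWeight > 0.0) { "armWeight must be positive" }
    val steps = maxOf(totalSteps, 1L) // assumption: avoids ln(0)
    return exploration * sqrt(ln(steps.toDouble()) / armWeight)
}
```

Quadrupling an arm's cumulative weight halves its bonus, which is the decay the class description refers to.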

k

val k: Int

Neighbourhood size used for scoring; capped per-arm by the available history.

maxHistoryPerArm

val maxHistoryPerArm: Int

Maximum observations retained per arm; older entries roll off via FIFO.

nbrArms

open override val nbrArms: Int

Number of arms.

random

open override val random: Random

Single source of randomness; held for API uniformity and currently unused, so choose is deterministic and breaks ties by lowest arm index.

Functions

snapshot

Materialise the current state as a serialisable snapshot. Reads are non-mutating; call as often as needed without affecting decisions. Same snapshot consistency rules as com.eignex.kumulant.core.Stat.read; under com.eignex.kumulant.core.Concurrency.Relaxed, coupled cells may drift by ULPs.

update

open override fun update(armIndex: Int, x: VectorView, reward: Double, weight: Double = 1.0)

Append (x, reward, weight) to arm armIndex's history; oldest entry drops if full.

armResult

open fun armResult(armIndex: Int): KnnArmResult

Per-arm snapshot at armIndex. Default implementation reads from the full snapshot; implementations may override to avoid building the entire list when only one arm is needed.
