kumulant

RegressionTree

class RegressionTree(splitCandidates: List<Split>, config: RegressionTreeConfig = RegressionTreeConfig(), concurrency: Concurrency = Concurrency.None, leafArmFactory: () -> SeriesStat<WeightedVarianceResult> = { VarianceStat(concurrency) }, randomSeed: Int = 0)

Online VFDT-style decision tree that partitions context vectors. Each leaf carries a weighted-variance accumulator. Audit leaves additionally track pos/neg sub-arms per candidate split and, every RegressionTreeConfig.splitPeriod observations, evaluate them against the Hoeffding bound to decide whether to convert themselves into a RegressionSplitNode.
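The Hoeffding-bound check can be illustrated with a minimal standalone sketch. It is not the library's implementation: hoeffdingBound, shouldSplit, and the parameters range, delta, and tieThreshold are hypothetical names, and the decision rule shown is the usual VFDT comparison of the best candidate against the runner-up, which is assumed here rather than confirmed by this page.

```kotlin
import kotlin.math.ln
import kotlin.math.sqrt

// Hoeffding bound: with probability 1 - delta, the true mean of a variable
// with the given range lies within epsilon of the mean observed over n samples.
fun hoeffdingBound(range: Double, delta: Double, n: Long): Double {
    require(range > 0.0 && delta > 0.0 && delta < 1.0 && n > 0)
    return sqrt(range * range * ln(1.0 / delta) / (2.0 * n))
}

// Hypothetical decision rule in the usual VFDT style: split when the best
// candidate's merit beats the runner-up by more than epsilon, or when
// epsilon has shrunk below a tie threshold (the candidates are near-equal).
fun shouldSplit(
    bestMerit: Double,
    secondBestMerit: Double,
    range: Double,
    delta: Double,
    n: Long,
    tieThreshold: Double = 0.05 // assumed tie-breaking parameter
): Boolean {
    val epsilon = hoeffdingBound(range, delta, n)
    return (bestMerit - secondBestMerit) > epsilon || epsilon < tieThreshold
}
```

Because epsilon shrinks with the square root of n, evaluating the bound only once per period rather than on every observation loses little: a candidate that is genuinely better will still clear the bound a few periods later.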

Internal split nodes hold no live arm; subtree aggregates (rootSnapshot, the value fields on TreeSplitResult) are derived by combining descendants at snapshot/merge time. The hot update path therefore touches exactly one arm: the leaf the observation routes to. Under Concurrency.Strict this removes the root-arm serialization point that previously bottlenecked multi-threaded throughput.
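Deriving a subtree aggregate from its leaves can be sketched as below. This is an illustration under stated assumptions, not the library's code: VarAgg, Leaf, SplitNode, and aggregate are hypothetical stand-ins, and the combination step uses the standard pairwise formula for merging weighted mean and variance accumulators.

```kotlin
// Hypothetical stand-in for a weighted-variance result: total weight,
// weighted mean, and weighted sum of squared deviations (m2).
data class VarAgg(val weight: Double, val mean: Double, val m2: Double) {
    val variance: Double get() = if (weight > 0.0) m2 / weight else 0.0

    // Fold one weighted observation in (weighted Welford update).
    fun add(value: Double, w: Double = 1.0): VarAgg {
        require(w > 0.0)
        val newWeight = weight + w
        val delta = value - mean
        val newMean = mean + delta * w / newWeight
        return VarAgg(newWeight, newMean, m2 + w * delta * (value - newMean))
    }

    // Combine two aggregates built from disjoint observations.
    fun combine(other: VarAgg): VarAgg {
        if (weight == 0.0) return other
        if (other.weight == 0.0) return this
        val total = weight + other.weight
        val delta = other.mean - mean
        return VarAgg(
            total,
            mean + delta * other.weight / total,
            m2 + other.m2 + delta * delta * weight * other.weight / total
        )
    }

    companion object {
        val EMPTY = VarAgg(0.0, 0.0, 0.0)
    }
}

sealed class Node
class Leaf(var agg: VarAgg = VarAgg.EMPTY) : Node()
class SplitNode(val pos: Node, val neg: Node) : Node()

// Split nodes store no aggregate of their own; it is derived from descendants.
fun aggregate(node: Node): VarAgg = when (node) {
    is Leaf -> node.agg
    is SplitNode -> aggregate(node.pos).combine(aggregate(node.neg))
}
```

The trade-off in this design is that reads pay for a walk over the subtree, while each write touches only the one leaf it routes to.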

Concurrency: leaf-arm updates run lock-free (the arms themselves honour concurrency). The split-conversion path, the only one that mutates tree structure, is serialised by a single per-tree lock that is taken only once every RegressionTreeConfig.splitPeriod observations per audit leaf. Pointer writes on the hot path are skipped when the child reference is unchanged, so the typical update is pure arm arithmetic.
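The locking pattern described here, a lock-free counter on the hot path and a shared lock taken only once per period, can be sketched as follows. AuditLeafGate, treeLock, and tryConvert are hypothetical names chosen for illustration; the library's internals may differ.

```kotlin
import java.util.concurrent.atomic.AtomicLong
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// One gate per audit leaf; the lock is shared by every leaf of one tree.
class AuditLeafGate(
    private val splitPeriod: Long,
    private val treeLock: ReentrantLock
) {
    private val seen = AtomicLong(0)

    // Returns true only when a conversion was attempted and succeeded.
    fun observe(tryConvert: () -> Boolean): Boolean {
        val n = seen.incrementAndGet()
        // Hot path: no lock is taken unless this observation closes a period.
        if (n % splitPeriod != 0L) return false
        // Cold path: structural mutation is serialised across the whole tree.
        return treeLock.withLock { tryConvert() }
    }
}
```

With a period of 3, ten observations reach the locked path three times (on the 3rd, 6th, and 9th), so contention on the lock scales with 1/splitPeriod rather than with the update rate.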

Constructors

constructor(splitCandidates: List<Split>, config: RegressionTreeConfig = RegressionTreeConfig(), concurrency: Concurrency = Concurrency.None, leafArmFactory: () -> SeriesStat<WeightedVarianceResult> = { VarianceStat(concurrency) }, randomSeed: Int = 0)

Properties

nodeCount

Number of internal + leaf nodes currently in the tree.

Functions

findLeaf

Walk to the leaf that row resolves to.

fun merge(other: RegressionTree)

Structurally merge other into this tree. other is consumed (its node references may be grafted into this tree) and must not be used afterwards.

mergeSnapshot

Snapshot merge using only the immutable result. Applies the same rules as merge, but the "other" side is a TreeNodeResult tree of results rather than a live RegressionTree.

predict

Mean of the leaf that row resolves to.

fun prettyPrint(indent: String = ""): String

Render the tree as nested if (split) { ... } else { ... } text.

fun reset()

Reset to a single fresh leaf.

rootNode

Live root node, for snapshotting.

rootSnapshot

Aggregate snapshot at the root, derived by walking leaves and any RegressionSplitNode.carryover aggregates. Under concurrent updates with active growth, pointer races at split time can leak observations into orphaned sub-arms, so this walk is best-effort and may drift by a few ULPs of the configured workload under contention. Single-threaded runs are exact.

fun update(row: VectorView, value: Double, weight: Double = 1.0)

Fold an observation into the tree, possibly growing it.

findLeaf

Walk to the leaf that row resolves to.

mergeSnapshot

Snapshot merge using only the immutable result. Applies the same rules as merge, but the "other" side is a TreeNodeResult tree of results rather than a live RegressionTree.

merge

Structurally merge other into this tree. other is consumed (its node references may be grafted into this tree) and must not be used afterwards.

  • Same split predicate: recurse on both children; internal aggregates are derived from leaves, so no per-split merge step is needed.

  • Both leaves: merge arms directly.

  • Self leaf, other split: adopt other's structure wholesale and fold self's leaf aggregate into the adopted subtree's leftmost leaf.

  • Self split, other leaf or different splits: keep self's structure and fold other's aggregate (recursively combined if other is a split) into self's leftmost leaf.

The "leftmost leaf" rule preserves the merged total weight but biases the un-routable observations into a single bucket. It is an honest fallback for when the structures don't align.
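The four merge rules above can be sketched on a toy tree. Everything here is an assumption made for illustration: Agg is a simplified (weight, sum) aggregate rather than a weighted-variance result, predicates are compared as plain strings, and "leftmost" is taken to mean following the pos child.

```kotlin
data class Agg(val weight: Double = 0.0, val sum: Double = 0.0) {
    operator fun plus(o: Agg) = Agg(weight + o.weight, sum + o.sum)
}

sealed class Node
class Leaf(var agg: Agg = Agg()) : Node()
class SplitNode(val predicate: String, var pos: Node, var neg: Node) : Node()

// Recursively combined aggregate of a subtree.
fun total(n: Node): Agg = when (n) {
    is Leaf -> n.agg
    is SplitNode -> total(n.pos) + total(n.neg)
}

// Assumption: the pos child is the "left" child.
fun leftmostLeaf(n: Node): Leaf = when (n) {
    is Leaf -> n
    is SplitNode -> leftmostLeaf(n.pos)
}

fun foldIntoLeftmost(target: Node, extra: Agg) {
    val leaf = leftmostLeaf(target)
    leaf.agg = leaf.agg + extra
}

// Returns the merged tree; `other` is consumed and may be grafted in.
fun merge(self: Node, other: Node): Node = when {
    // Both leaves: merge arms directly.
    self is Leaf && other is Leaf -> self.also { it.agg = it.agg + other.agg }
    // Self leaf, other split: adopt other's structure wholesale.
    self is Leaf && other is SplitNode -> other.also { foldIntoLeftmost(it, self.agg) }
    // Same split predicate: recurse on both children.
    self is SplitNode && other is SplitNode && self.predicate == other.predicate ->
        self.also {
            it.pos = merge(it.pos, other.pos)
            it.neg = merge(it.neg, other.neg)
        }
    // Self split with other leaf or a different split: keep self's structure.
    else -> self.also { foldIntoLeftmost(it, total(other)) }
}
```

In every branch the total weight of the result equals the sum of the two inputs; only the placement of un-routable mass differs.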

nodeCount

Number of internal + leaf nodes currently in the tree.

predict

Mean of the leaf that row resolves to.
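Routing a row to its leaf and reading the leaf mean can be sketched as follows. The node types, the DoubleArray row, and the lambda-based split test are hypothetical simplifications; the real API takes a VectorView and a list of Split candidates whose shapes are not shown on this page.

```kotlin
sealed class Node
class Leaf(val mean: Double) : Node()
class SplitNode(
    val test: (DoubleArray) -> Boolean, // hypothetical split predicate
    val pos: Node,
    val neg: Node
) : Node()

// Follow the predicate at each split until a leaf is reached.
tailrec fun findLeaf(node: Node, row: DoubleArray): Leaf = when (node) {
    is Leaf -> node
    is SplitNode -> findLeaf(if (node.test(row)) node.pos else node.neg, row)
}

// Prediction is the mean stored at the leaf the row resolves to.
fun predict(root: Node, row: DoubleArray): Double = findLeaf(root, row).mean
```

The cost of a prediction is one predicate evaluation per level, so it grows with tree depth rather than with the number of leaves.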

prettyPrint

fun prettyPrint(indent: String = ""): String

Render the tree as nested if (split) { ... } else { ... } text.
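A renderer of this shape might look like the sketch below. The exact output layout of the library's prettyPrint is not documented here, so the two-space indentation, the string predicate, and printing the leaf mean are all assumptions, as are the type names.

```kotlin
sealed class Node
class Leaf(val mean: Double) : Node()
class SplitNode(val predicate: String, val pos: Node, val neg: Node) : Node()

// Render a split as "if (predicate) { pos } else { neg }", indenting children.
fun prettyPrint(node: Node, indent: String = ""): String = when (node) {
    is Leaf -> "$indent${node.mean}"
    is SplitNode -> listOf(
        "${indent}if (${node.predicate}) {",
        prettyPrint(node.pos, "$indent  "),
        "$indent} else {",
        prettyPrint(node.neg, "$indent  "),
        "$indent}"
    ).joinToString("\n")
}
```

Passing a non-empty indent lets a caller embed the rendering inside a larger report without re-indenting each line.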

reset

fun reset()

Reset to a single fresh leaf.

rootNode

Live root node, for snapshotting.

rootSnapshot

Aggregate snapshot at the root, derived by walking leaves and any RegressionSplitNode.carryover aggregates. Under concurrent updates with active growth, pointer races at split time can leak observations into orphaned sub-arms, so this walk is best-effort and may drift by a few ULPs of the configured workload under contention. Single-threaded runs are exact.

update

fun update(row: VectorView, value: Double, weight: Double = 1.0)

Fold an observation into the tree, possibly growing it.