kumulant

MinHashResult

@Serializable
@SerialName(value = "MinHashResult")
data class MinHashResult(val numHashes: Int, val seed: Long, val signatures: LongArray, val totalSeen: Long) : Result

Snapshot of a MinHash signature. signatures[i] is the running minimum of splitmix64(value xor splitmix64(seed + i)) over all updates, where i is the hash index. Merging two snapshots takes the element-wise min and requires identical numHashes and seed.
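
The update rule described above can be sketched as follows. This is a minimal illustration, not kumulant's implementation: the splitmix64 body is the commonly published 64-bit finalizer, which may differ from the library's own, and the helper name absorb, the Long.MAX_VALUE initial value, and the signed comparison are all assumptions.

```kotlin
// Constants of the widely used splitmix64 mixer, written as unsigned literals.
val GOLDEN_GAMMA: Long = 0x9E3779B97F4A7C15uL.toLong()
val MIX_1: Long = 0xBF58476D1CE4E5B9uL.toLong()
val MIX_2: Long = 0x94D049BB133111EBuL.toLong()

// Assumed splitmix64 variant; the library's definition may differ.
fun splitmix64(x: Long): Long {
    var z = x + GOLDEN_GAMMA
    z = (z xor (z ushr 30)) * MIX_1
    z = (z xor (z ushr 27)) * MIX_2
    return z xor (z ushr 31)
}

// Hypothetical helper: folds one value into every slot of the signature.
fun absorb(signatures: LongArray, seed: Long, value: Long) {
    for (i in signatures.indices) {
        val salt = splitmix64(seed + i)          // per-hash salt derived from the seed
        val h = splitmix64(value xor salt)
        if (h < signatures[i]) signatures[i] = h // signed comparison is an assumption
    }
}

fun main() {
    // Slots start at the largest Long so that any hash can replace them (assumption).
    val signatures = LongArray(4) { Long.MAX_VALUE }
    for (v in longArrayOf(10L, 20L, 30L)) absorb(signatures, seed = 42L, value = v)
    println(signatures.joinToString())
}
```

Because each slot only ever keeps a minimum, the result does not depend on update order or on repeated values.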

Constructors

constructor(numHashes: Int, seed: Long, signatures: LongArray, totalSeen: Long)

Properties

val numHashes: Int

Number of independent hash functions in the signature.

val seed: Long

PRNG seed used to derive the per-hash salts; must match for merge / jaccard.

val signatures: LongArray

Per-hash running minimums; element-wise min produces a valid merge.

val totalSeen: Long

Number of update(value) calls absorbed; informational.
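
The element-wise min merge of two signatures can be sketched as below. Snapshot and mergeSnapshots are hypothetical stand-ins, reduced to the two fields the merge checks; they are not part of the documented API, and the signed minOf comparison is an assumption.

```kotlin
// Hypothetical stand-in for MinHashResult, reduced to the fields a merge needs.
class Snapshot(val seed: Long, val signatures: LongArray)

// Merges two snapshots by taking the smaller value in each slot.
fun mergeSnapshots(a: Snapshot, b: Snapshot): Snapshot {
    require(a.seed == b.seed) { "seed mismatch" }
    require(a.signatures.size == b.signatures.size) { "numHashes mismatch" }
    val merged = LongArray(a.signatures.size) { i ->
        minOf(a.signatures[i], b.signatures[i]) // signed comparison is an assumption
    }
    return Snapshot(a.seed, merged)
}

fun main() {
    val left = Snapshot(7L, longArrayOf(5L, -3L, 9L))
    val right = Snapshot(7L, longArrayOf(2L, 4L, 9L))
    println(mergeSnapshots(left, right).signatures.joinToString()) // prints 2, -3, 9
}
```

The merge is valid because the minimum over a union of sets equals the smaller of the two per-set minimums, provided both sides used the same salts.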

Functions


Estimated Jaccard similarity between the two underlying sets, computed as the fraction of slots where the two signatures agree. Requires matching numHashes and seed.
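
The estimate described above amounts to counting matching slots. A minimal sketch follows, assuming a hypothetical helper estimateJaccard that works on raw signature arrays; it is not the library's function, and checking that both signatures share a seed is left to the caller.

```kotlin
// Hypothetical helper: fraction of slots where two signatures hold the same value.
fun estimateJaccard(a: LongArray, b: LongArray): Double {
    require(a.size == b.size) { "numHashes mismatch" }
    require(a.isNotEmpty()) { "signature must have at least one slot" }
    val matches = a.indices.count { a[it] == b[it] }
    return matches.toDouble() / a.size
}

fun main() {
    val a = longArrayOf(1L, 2L, 3L, 4L)
    val b = longArrayOf(1L, 9L, 3L, 8L)
    println(estimateJaccard(a, b)) // prints 0.5
}
```

More hash functions mean more slots to compare, so a larger numHashes generally gives a lower-variance estimate.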

MinHashResult

constructor(numHashes: Int, seed: Long, signatures: LongArray, totalSeen: Long)

numHashes

Number of independent hash functions in the signature.

seed

PRNG seed used to derive the per-hash salts; must match for merge / jaccard.

signatures

Per-hash running minimums; element-wise min produces a valid merge.

totalSeen

Number of update(value) calls absorbed; informational.