defaultEta
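Default learning rate derived from the EXP4 regret analysis. The horizon-dependent rate sqrt(ln(N) / (T * K)), with N experts, K arms, and horizon T, is collapsed to the horizon-free form sqrt(ln(N) / K) by setting T = 1. Treat it as a starting heuristic rather than a tuned value.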
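To make the formula concrete, here is a minimal Python sketch of how such a default could be computed. The function name `default_eta` and its parameters `num_experts` (N), `num_arms` (K), and `horizon` (T) are hypothetical and chosen for illustration; they are not taken from the actual implementation, which may differ.

```python
import math


def default_eta(num_experts: int, num_arms: int, horizon: int = 1) -> float:
    """Illustrative sketch of sqrt(ln(N) / (T * K)).

    With the default horizon=1 this is the horizon-free form sqrt(ln(N) / K).
    All names here are assumptions, not the library's own API.
    """
    if num_experts < 1 or num_arms < 1 or horizon < 1:
        raise ValueError("num_experts, num_arms, and horizon must be >= 1")
    # ln(1) = 0, so a single expert yields a learning rate of 0.
    return math.sqrt(math.log(num_experts) / (horizon * num_arms))


# Horizon-free default for 8 experts and 2 arms: sqrt(ln(8) / 2)
eta = default_eta(num_experts=8, num_arms=2)
```

When the horizon is known, passing it explicitly shrinks the rate by a factor of sqrt(T) relative to the horizon-free default.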