by Martin Larsson (Carnegie Mellon University), Aaditya Ramdas (Carnegie Mellon University), and Johannes Ruf (London School of Economics)
Under what conditions do optimal bets against a given probabilistic hypothesis exist? Answer: they always do!
John Kelly Jr., a researcher at Bell Labs, asked a simple question in 1956 [1]. Suppose we are invited to bet on successive coin tosses at even odds. Suppose also that the coin is not fair, and that we know the probability of heads equals some value Q (e.g., 0.7). Kelly asked: what gambling strategy maximizes our wealth in the long run?
Let us briefly elaborate on the game rules. Before each toss, we may bet some money on whether it will land heads or tails. If we are right, we get back double the amount that we bet (a net gain equal to the stake); otherwise we lose the amount that we bet. We start the game with one dollar (without loss of generality), and importantly, we can never bet more than we currently have.
Kelly did not use the language of hypothesis testing, but the connection to testing is simple. One can interpret the null hypothesis as a fair coin and the alternative hypothesis as a coin with bias Q. None of the allowed bets (double-or-nothing bets of different amounts) is expected to make money under the null, but they may do so under the alternative.
Kelly realized that if one bets smartly, the wealth can be made to grow exponentially fast in the number of rounds, and suggested maximizing that exponent. This is equivalent to maximizing the expected logarithmic wealth (i.e., the growth rate), now aptly called the Kelly criterion. A simple calculation shows that, for Q at least one half, this log-optimal strategy corresponds to betting a fraction 2Q-1 of your current wealth on heads in every round (equivalently, betting a fraction Q on heads and 1-Q on tails, a form of hedging).
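The calculation can be sketched as follows. The symbols f (the fraction of current wealth staked on heads) and g (the expected log growth per round) are introduced here for illustration, and the sketch assumes Q > 1/2 and that a winning even-odds bet returns the stake plus an equal amount.

```latex
\begin{align*}
g(f) &= Q \log(1+f) + (1-Q)\log(1-f), \qquad 0 \le f < 1,\\
g'(f) &= \frac{Q}{1+f} - \frac{1-Q}{1-f} = 0
  \;\Longrightarrow\; Q(1-f) = (1-Q)(1+f)
  \;\Longrightarrow\; f^\star = 2Q - 1,\\
g(f^\star) &= Q \log(2Q) + (1-Q)\log\bigl(2(1-Q)\bigr).
\end{align*}
```

Since g is strictly concave, the stationary point is the unique maximizer. The hedged form gives the same wealth: staking Q on heads and 1-Q on tails leaves 2Q = 1 + f* after heads and 2(1-Q) = 1 - f* after tails.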
This strategy is intuitive: if Q equals one half, there is no point in betting, and if Q equals one, you should go all-in in every round. The optimal strategy linearly interpolates between these extremes. Kelly also proved that the optimal exponent (rate of wealth growth) equals the Kullback-Leibler divergence, or relative entropy, between a coin with bias Q and a fair coin.
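Both claims can be checked numerically. The following is a minimal Python sketch, not taken from Kelly's paper: the function names are hypothetical, logarithms are natural (so rates are in nats), and the search is a plain grid search over the staked fraction.

```python
import math

def expected_log_growth(f, q):
    """Expected log wealth per round when staking fraction f on heads at even odds."""
    return q * math.log(1 + f) + (1 - q) * math.log(1 - f)

def kl_to_fair_coin(q):
    """KL divergence (nats) between a coin with bias q and a fair coin, 0 < q < 1."""
    return q * math.log(2 * q) + (1 - q) * math.log(2 * (1 - q))

def best_fraction_on_grid(q, steps=10_000):
    """Brute-force maximizer of the expected log growth over f in [0, 1)."""
    grid = [i / steps for i in range(steps)]
    return max(grid, key=lambda f: expected_log_growth(f, q))

q = 0.7  # the example bias used in the article
f_star = best_fraction_on_grid(q)
print("grid maximizer:", f_star, "vs 2Q-1:", 2 * q - 1)
print("growth at 2Q-1:", expected_log_growth(2 * q - 1, q))
print("KL to fair coin:", kl_to_fair_coin(q))
```

The grid maximizer should agree with 2Q-1 up to the grid resolution, and the growth rate at 2Q-1 should match the relative entropy up to floating-point error.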
This justified the title of his paper, which linked gambling to what was then the newly developed information theory of Claude Shannon. In the following decades, Kelly’s ideas were generalized by Leo Breiman to settings with more outcomes and varied odds, by Krichevsky and Trofimov to handle unknown Q, and by Thomas Cover in his famous work on universal portfolios.
Recently, we asked a natural question that fully generalizes Kelly’s. Suppose we are constrained to make a bet that is fair under every member of a general class of distributions P. Does there still exist a log-optimal bet (against some alternative Q)?
To elaborate, suppose we start with a dollar, and our wealth after one bet is called B. Then B must have two properties: it must be nonnegative (since we cannot lose more than that starting dollar), and its expected value under every distribution in the class P must be at most one (we should not make money under the null hypothesis). Such a bet B is nowadays called an e-value.
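On a finite outcome space with a finite null class, the two defining properties can be checked directly. The sketch below is illustrative only: the function names, the dictionary representation of bets and distributions, and the numerical tolerance are all assumptions made here.

```python
def expectation(bet, dist):
    """Expected payoff of `bet` under `dist`; both map outcomes to numbers."""
    return sum(dist[x] * bet[x] for x in dist)

def is_e_value(bet, null_class, tol=1e-12):
    """Check nonnegativity and expected value at most one under every null distribution."""
    nonnegative = all(v >= 0 for v in bet.values())
    fair_under_null = all(expectation(bet, p) <= 1 + tol for p in null_class)
    return nonnegative and fair_under_null

fair_coin = {"H": 0.5, "T": 0.5}

# Kelly's bet for Q = 0.7: stake 0.4 of one dollar on heads at even odds.
kelly_bet = {"H": 1.4, "T": 0.6}
# Pays too much on heads: expected value 1.05 under the fair coin.
too_generous = {"H": 1.5, "T": 0.6}
# Fair in expectation but can go negative, i.e. it risks more than the dollar.
overdrawn = {"H": 2.2, "T": -0.2}

print(is_e_value(kelly_bet, [fair_coin]))     # True
print(is_e_value(too_generous, [fair_coin]))  # False
print(is_e_value(overdrawn, [fair_coin]))     # False
```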
Remarkably, with absolutely no assumptions on P, we showed that a log-optimal bet against Q always exists [2]. Moreover, this bet, which we call the numeraire, is Q-almost surely unique. There is also a strong duality between the expected logarithm of the numeraire under Q and the Kullback-Leibler divergence of Q to a special sub-distribution P*, which is the reverse information projection of Q onto P (the closest element of P to Q in the information divergence). In fact, the numeraire is a likelihood ratio of Q to P*, the latter object lying in the effective null hypothesis of P (an enlargement of the convex hull).
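A toy case makes the likelihood-ratio form concrete. The sketch below is not the construction of [2]; it assumes a hypothetical composite null consisting of all coins with heads probability in [0.4, 0.6], takes Q = 0.7, and finds the log-optimal bet by brute force. In this simple case the projection P* is an ordinary distribution (the endpoint 0.6), so no genuine sub-distribution arises.

```python
import math

Q = 0.7                      # alternative: probability of heads
NULL_LO, NULL_HI = 0.4, 0.6  # hypothetical null: heads probability in [0.4, 0.6]

def largest_tails_payoff(b_heads):
    """Largest tails payoff with E_p[B] <= 1 for every p in the null interval.
    The expectation is linear in p, so checking the two endpoints suffices."""
    return min((1 - p * b_heads) / (1 - p) for p in (NULL_LO, NULL_HI))

def expected_log(b_heads, b_tails):
    """Expected log payoff under the alternative Q."""
    return Q * math.log(b_heads) + (1 - Q) * math.log(b_tails)

# Grid search over the heads payoff; beyond 1/NULL_HI the tails payoff is negative.
steps = 100_000
upper = 1 / NULL_HI
candidates = [(i / steps) * upper for i in range(1, steps)]
best_heads = max(candidates, key=lambda b: expected_log(b, largest_tails_payoff(b)))
best_tails = largest_tails_payoff(best_heads)

# Likelihood ratio of Q to the closest null coin (heads probability NULL_HI).
lr_heads = Q / NULL_HI
lr_tails = (1 - Q) / (1 - NULL_HI)
kl_q_to_pstar = Q * math.log(lr_heads) + (1 - Q) * math.log(lr_tails)

print("grid optimum:    ", best_heads, best_tails)
print("likelihood ratio:", lr_heads, lr_tails)
print("growth vs KL:    ", expected_log(best_heads, best_tails), kl_q_to_pstar)
```

Up to the grid resolution, the brute-force optimum should coincide with the likelihood ratio, and its expected logarithm under Q should match the Kullback-Leibler divergence of Q to the closest null coin, which is the duality described above in its simplest form.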
Perhaps surprisingly, if one plays a repeated game of P against Q, the optimal gambling strategy does not simply repeat the above single-round numeraire bet. Instead, we recently showed [3] that after betting the numeraire, one should then play the numeraire bet based on the next two observations taken together, then on the next four observations together, and so on. This ensures that the wealth eventually grows at the optimal rate (calculated over all possible games that one could construct), which is larger than the Kullback-Leibler divergence between Q and P* but in general smaller than the infimum of the Kullback-Leibler divergence between Q and the elements of P.
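The schedule of block sizes described here (one observation, then two, then four, and so on) can be sketched in a few lines of Python. This illustrates the grouping of observations only; the generator name is hypothetical, and computing the numeraire bet for each block, which is the substantive part of [3], is not shown.

```python
from itertools import islice

def doubling_blocks():
    """Yield (start, end) index pairs, end exclusive, for blocks of size 1, 2, 4, ..."""
    start, size = 0, 1
    while True:
        yield (start, start + size)
        start += size
        size *= 2

# Observations 0; 1-2; 3-6; 7-14 would each be bet on as a single batch.
first_blocks = list(islice(doubling_blocks(), 4))
print(first_blocks)  # [(0, 1), (1, 3), (3, 7), (7, 15)]
```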
When Q is unknown, one can still design betting wealth processes (called e-processes) that are asymptotically relatively growth-rate optimal. For example, the authors of [3] showed that in a Polish space, if P is compact in the usual weak topology (a very weak assumption, not even requiring convexity), then one can design a single betting strategy that attains the asymptotically optimal growth rate for every Q in the complement of P. This in turn implies that there exist power-one sequential tests for weakly compact P against its complement, a remarkably general fact. While the authors of [2] have recently completely characterized when nontrivial fixed-sample tests exist for arbitrary P versus Q, such a complete characterization for sequential testing remains open at the time of writing.
Martin Larsson acknowledges support from NSF grant DMS-2510965.
References:
[1] J. L. Kelly Jr., “A new interpretation of information rate,” Bell System Technical Journal, vol. 35, no. 4, pp. 917–926, Jul. 1956.
[2] M. Larsson, A. Ramdas, and J. Ruf, “The numeraire e-variable and reverse information projection,” Annals of Statistics, vol. 53, no. 3, pp. 1467–1493, 2025.
[3] A. Ram and A. Ramdas, “The optimal betting wealth growth rate,” arXiv preprint arXiv:2604.25280, 2026.
Please contact:
Aaditya Ramdas, Carnegie Mellon University, USA

