Joachim Breitner's Homepage
Do surprises get larger?
The setup
Imagine you are living on a riverbank. Every now and then, the river swells and you have high water. The first few times this may come as a surprise, but soon you learn that such floods are a recurring occurrence at that river, and you make suitable preparation. Let’s say you feel well-prepared against any flood that is no higher than the highest one observed so far. The more floods you have seen, the higher that mark is, and the better prepared you are. But of course, eventually a higher flood will occur that surprises you.
Of course such new record floods are happening rarer and rarer as you have seen more of them. I was wondering though: By how much do the new records exceed the previous high mark? Does this excess decrease or increase over time?
A priori both could be. When the high mark is already rather high, maybe new record floods will just barley pass that mark? Or maybe, simply because new records are so rare events, when they do occur, they can be surprisingly bad?
This post is a leisurely mathematical investigating of this question, which of course isn’t restricted to high waters; it could be anything that produces a measurement repeatedly and (mostly) independently – weather events, sport results, dice rolls.
The answer of course depends on the distribution of results: How likely is each possible results.
Dice are simple
With dice rolls the answer is rather simple. Let our measurement be how often you can roll a die until it shows a 6. This simple game we can repeat many times, and keep track of our record. Let’s say the record happens to be 7 rolls. If in the next run we roll the die 7 times, and it still does not show a 6, then we know that we have broken the record, and every further roll increases by how much we beat the old record.
But note that how often we will now roll the die is completely independent of what happened before!
So for this game the answer is: The excess with which the record is broken is always the same.
Mathematically speaking this is because the distribution of “rolls until the die shows a 6” is memoryless. Such distributions are rather special, its essentially just the example we gave (a geometric distribution), or its continuous analogue (the exponential distributions, for example the time until a radioactive particle decays).
Mathematical formulation
With this out of the way, let us look at some other distributions, and for that, introduce some mathematical notations. Let X be a random variable with probability density function φ(x) and cumulative distribution function Φ(x), and a be the previous record. We are interested in the behavior of
Y(a) = X − a ∣ X > x
i.e. by how much X exceeds a under the condition that it did exceed a. How does Y change as a increases? In particular, how does the expected value of the excess e(a) = E(Y(a)) change?
Uniform distribution
If X is uniformly distributed between, say, 0 and 1, then a new record will appear uniformly distributed between a and 1, and as that range gets smaller, the excess must get smaller as well. More precisely,
e(a) = E(X − a ∣ X > a) = E(X ∣ X > a) − a = (1 − a)/2
This not very interesting linear line is plotted in blue in this diagram:
The orange line with the logarithmic scale on the right tries to convey how unlikely it is to surpass the record value a: it shows how many attempts we expect before the record is broken. This can be calculated by n(a) = 1/(1 − Φ(a)).
Normal distribution
For the normal distribution (with median 0 and standard derivation 1, to keep things simple), we can look up the expected value of the one-sided truncated normal distribution and obtain
e(a) = E(X ∣ X > a) − a = φ(a)/(1 − Φ(a)) − a
Now is this growing or shrinking? We can plot this an have a quick look:
Indeed it is, too, a decreasing function!
(As a sanity check we can see that e(0) = √(2/π), which is the expected value of the half-normal distribution, as it should.)
Could it be any different?
This settles my question: It seems that each new surprisingly high water will tend to be less surprising than the previously – assuming high waters were uniformly or normally distributed, which is unlikely to be helpful.
This does raise the question, though, if there are probability distributions for which e(a) is be increasing?
I can try to construct one, and because it’s a bit easier, I’ll consider a discrete distribution on the positive natural numbers, and consider at g(0) = E(X) and g(1) = E(X − 1 ∣ X > 1). What does it take for g(1) > g(0)? Using E(X) = p + (1 − p)E(X ∣ X > 1) for p = P(X = 1) we find that in order to have g(1) > g(0), we need E(X) > 1/p.
This is plausible because we get equality when E(X) = 1/p, as it precisely the case for the geometric distribution. And it is also plausible that it helps if p is large (so that the next first record is likely just 1) and if, nevertheless, E(X) is large (so that if we do get an outcome other than 1, it’s much larger).
Starting with the geometric distribution, where P(X > n ∣ X ≥ n) = pn = p (the probability of again not rolling a six) is constant, it seems that these pn is increasing, we get the desired behavior. So let p1 < p2 < pn < … be an increasing sequence of probabilities, and define X so that P(X = n) = p1 ⋅ ⋯ ⋅ pn − 1 ⋅ (1 − pn) (imagine the die wears off and the more often you roll it, the less likely it shows a 6). Then for this variation of the game, every new record tends to exceed the previous more than previous records. As the p increase, we get a flatter long end in the probability distribution.
Gamma distribution
To get a nice plot, I’ll take the intuition from this and turn to continuous distributions. The Wikipedia page for the exponential distribution says it is a special case of the gamma distribution, which has an additional shape parameter α, and it seems that it could influence the shape of the distribution to be and make the probability distribution have a longer end. Let’s play around with β = 2 and α = 0.5, 1 and 1.5:
For α = 1 (dotted) this should just be the exponential distribution, and we see that e(a) is flat, as predicted earlier.
For larger α (dashed) the graph does not look much different from the one for the normal distribution – not a surprise, as for α → ∞, the gamma distribution turns into the normal distribution.
For smaller α (solid) we get the desired effect: e(a) is increasing. This means that new records tend to break records more impressively.
The orange line shows that this comes at a cost: for a given old record a, new records are harder to come by with smaller α.
Conclusion
As usual, it all depends on the distribution. Otherwise, not much, it’s late.
Have something to say? You can post a comment by sending an e-Mail to me at <mail@joachim-breitner.de>, and I will include it here.