What is a probability?

Epistemic Status Reasonably confident. Limits of knowability are discussed throughout, and research cited.

I have a friend who maintains that probabilities are a Stoppardian conspiracy of mathematicians. The chance that any given event will occur, she says, is fifty-fifty. Lottery victories? Asteroid impacts? Spontaneous combustion? All (apparently) decided on a metaphorical coin toss.

This line of reasoning doesn’t mesh with any of the common conceptions of probability, but that doesn’t immediately render it flawed, and I can sympathise with the joke’s premise. There is something deeply unintuitive and forced about thinking probabilistically: trying to quantify the instantaneous implosion of numerous possibilities into singular fact. When learning the maths behind statistics or random processes, we are more-or-less cornered into accepting a pragmatic definition. So long as the methods are useful in making predictions or understanding data, it’s best not to worry too much about whether they correspond to how uncertainty works in the real world.

Mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true.

Bertrand Russell, Mysticism and Logic (1910)

Does it really matter how uncertainty works in the real world? Not really. A remarkable feature of the maths of probability is that it is apathetic to your definition thereof: it is useful despite us not really knowing what we are talking about. But the mechanics of uncertainty do matter if we care about sating metaphysical curiosity which I, for one, do.

Set aside for a moment the vast philosophical literature on conceptions of probability, and even the notion of numerical probabilities altogether. Probabilities are stand-ins for the more nebulous concept of uncertainty which, as I see it, can have two components.

The first component is uncertainty arising in human minds due to a lack of knowledge about the state of the reality. This locates uncertainty ‘in our heads’ or—more precisely—in the discrepancy between reality and our brains’ mental model thereof.

The second component is fundamental, real indeterminism. It is debatable whether or not this component exists. If not, then reality is a clockwork, with the current positions and momenta of subatomic particles fully determining the future. If it does exist then there are random number generators built into the fabric of the universe, with the future to be determined by the randomness spat out along the way. A prima facie reading of modern quantum mechanics suggests that this is the case. Some uncertainty would thus originate ‘in the real world’, in the laws of nature.

I am not the first to find this dichotomy natural. The two components are less-colloquially known as ‘epistemic’ uncertainty and ‘ontic’ uncertainty, respectively. Characteristics, proponents and detractors are summarised in the following table.

This two-way classification deals with a different aspect of the topic than the standard roll-call of probability interpretations^[1]. The Classical, Logical, Subjective, Frequency, Propensity & Best-System conceptions all dance around the issue of where the uncertainty comes from, instead trying to articulate an interpretation of what numerical probabilities (eg. \(30%\) chance of rain in Coburg on Thursday) mean in a manner that will work for all probabilistic statements. This is an unattainable goal because—while superficially similar—numerical probabilities are used in fundamentally different ways in different contexts. Casinos will do fine with a mostly classical interpretation, pundits use subjective probabilities, and statistians aim to find best-system models. All interpretations (particularly frequency and propensity) gloss over the underlying source of the uncertainty, which numerical probability can only ever quantify imperfectly.

In this essay I argue that epistemic uncertainty is unavoidable, contemplate the existence of ontic uncertainty, and discuss some limitations of numerical probability.

Epistemic Uncertainty

That humans (both individually and collectively) can never have perfect knowledge of the world may seem patently true, but there is value in documenting the reasons why this is the case. The first is that there are regions of spacetime about which it is impossible to know anything, either due to the laws of physics or the limits of human observation and communication capabilities. Consider the following diagram.

This is a stylised map of everything and everywhere^[2]. You are at the origin, with the past on the left, the future on the right and physical distance on the vertical axis. The slope of the diagonal boundaries is the speed of information transition throughout the universe. (Simplistically drawn here as if it were constant, but of course reality is more complex.) This speed can be no larger than the speed of light, and in reality is much slower due to difficulties in the logistics of communication and data-collection.

The future cannot be known with certainty, but the consequence of the upper limit is that some of the past (the parts that fall in regions Y and Z) also cannot be known with certainty; the information originating there has not yet had time to reach us. This is most blatant in the domain of astrophysics, where the time taken for light to travel between points of interest is non-negligble, but the concept is applicable to all information. Supernovae, political misappropriation, and hiking-sock leeches share the property that we find out about them after the fact. When we get to f it will become possible to know what happened in the chevron Y, but there will be a new set of historical unknowables.

You might argue that we can, in principle, extrapolate our current, local models of reality into unknown regions of spacetime. Even if we ignore Hume’s problem of induction, we run up against computational intractability fairly quickly when trying to do this. Many of the domains we are interested in are chaotic systems, so small errors in our specification of the initial conditions would cause large flaws in our predictions. It is for this reason that we cannot predict weather more than a few weeks in advance, or human behaviour with any meaningful detail. One needs a universe to compute a universe.

And even for events that are theoretically knowable, there are fundamental limits to our knowledge of them, because the data upon which we are basing our inference will be unavoidably incomplete. All measuring equipment, including biological sensory perception, has upper limits on the precision it is capable of. The human eye, for example, is known to be limited in its ability to distinguish in colour, position and simultaneity. Similarly, digital detectors can record only discrete approximations of continuous phenomena. The precision of devices can be increased through improved design, but there will always be some margin of error.

All these arguments point to a universal insufficiency of human knowledge. When you get down in the weeds, we can’t be completely, epistemically certain of anything.

Ontic Uncertainty

Assume, temporarily, that quantum mechanics describes reality accurately, and that when not being measured particles are not particles but probability density functions describing their likely physical properties (position, momentum etcetera). The Heisenburg Uncertainty Principle states that the uncertainty in these physical properties cannot all be negligible at the same time. When the most likely superimposed positions of a particle converge, the spread of likely superimposed momenta diverges; there is a fundamental limit to how precisely position and momentum can be specified at the same time. When we measure a property of the particle, we get a single observation from the current probability distribution of that property.

Is it possible for randomness in the position and momentum of particles to ‘trickle up’ into randomness in macroscopic, everyday phenomena that humans are interested in? I asked someone who’d know, and

The short answer is no. Quantum effects such as superposition and indeterminacy wash away as the size of the system increases, from atoms to molecules to footballs; it also is compromised by the interactions of the system with the environment.

Marcelo Gleiser

This phenomena is known as decoherence. When you interact with a system, your perception of its macroscopic properties constrains the set of microscopic configurations that are possible. This amounts to an indirect observation of the particles involved, collapsing their probabilistic wavefunction. Somewhere between micro and macro there is a soft threshold where quantum indeterminacy goes from playing a role, to not.

Are there any ways in which microscopic ontic uncertainty could survive or bypass decoherence and be magnified into macroscopic indeterminism? One possible magnifier might be human brains, which at least one scientist-of-repute^[3] has conjectured to rely on quantum effects. Others^[4] have countered that the timescale over which particles would decohere in the brain (\(10^{-13}\sim 10^{-20}\) seconds) is much shorter than the timescales at which known neuronal processes operate (\(10^{-3}\sim 10^{-1}\) seconds), and so is unlikely to impact upon them.

Efforts to use quantum mechanics and its weirdness to explain consciousness or even lower brain functions must deal with the fact that the brain is a warm and wet environment, and thus a hard place to maintain any kind of quantum entanglement.

Marcelo Gleiser^[5]

At present, neither view is conclusive. But there exists another possible means of magnification, because the fact that we know about quantum indeterminacy demonstrates that the phenomenon has in some way influenced the human-scale world. At some point, truly random numbers popped up on a physicist’s screen, representing a measurement of a particle. In a chaotic system such as the human mind—even if it is completely deterministic—a small change in visual perception (a different number) might cause a significant change in behaviour down the track. In this way, randomness might be introduced into the course of history. That said, I find it hard to believe the universe stopped being deterministic when humans conducted the first double-slit experiment.

Of course, to take a step back, quantum mechanics is just a model that describes our observations to date, and our ability to observe elementary particles is imperfect. If you squint at any system from far enough away, its behaviour may appear random, because the few observable features might be determined by some as-yet unobservable mechanism^[6]. This point is not an emphatic denial of quantum mechanics, à la Einstein’s quip that “God does not play dice with the universe”^[7]. Rather it is a plea to not let quantum theory’s coziness with uncertainty discourage the search for unseen causes.

Quantum physicists have only probability laws because for two generations we have been indoctrinated not to believe in causes – and so we have stopped looking for them. Indeed, any attempt to search for the causes of microphenomena is met with scorn and a charge of professional incompetence and ‘obsolete mechanistic materialism’. Therefore, to explain the indeterminacy in current quantum theory we need not suppose there is any indeterminacy in Nature; the mental attitude of quantum physicists is already sufficient to guarantee it. … [Present quantum theory] cannot, as a matter of principle, answer any question of the form: ‘What is really happening when …?’ [This mathematical formalism], like Orwellian newspeak, does not even provide the vocabulary in which one could ask such a question.

Edwin Thompson Jaynes, Probability Theory: The Logic of Science (2003)

Further, it is possible to view the probabilistic laws of quantum mechanics as simply encapsulating our current epistemic uncertainty.

In current quantum theory, probabilities express our own ignorance due to our failure to search for the real causes of physical phenomena; and, worse, our failure even to think seriously about the problem. This ignorance may be unavoidable in practice, but in our present state of knowledge we do not know whether it is unavoidable in principle; the ‘central dogma’ simply asserts this, and draws the conclusion that belief in causes, and searching for them, is philosophically naïve. If everybody accepted this and abided by it, no further advances in understanding physical law would ever be made.

Edwin Thompson Jaynes, Probability Theory: The Logic of Science (2003)

In short, if you accept that epistemic uncertainty is unavoidable, then we can never have certainty about the existence of ontic uncertainty. There is a strange interplay between the two:

Epistemic is the gap between knowledge and reality
But Ontic says reality is not well-defined
So neither is the gap, or Ontic. Oh, [profanity]!
Whether the gap’s defined or not is uncertain, in kind.

This leads us to the limitations of numerical probabilities. If the gap between knowledge and reality might not be well-defined, how can we quantify it? The standard approach is to simply assume that reality is accurately described by a model we define in the mathematics of probability. While often useful, this is merely an assumption, a point clearly demonstrated by the Bertrand Paradox^[8]. Consider a circle containing an equilateral triangle.

Choose a chord at random. What is the probability that the chord is longer than the sides of the triangle?

Many possible answers can be justified, depending on the method by which you formalise the random chord-choice. Here are three possible methods, each giving a different solution.

Pick two points on the circumference of the circle uniformly at random, and join them to create a chord. Align one of the triangle’s vertices with the first point chosen. For the chord to be longer than a triangle side, the second point chosen needs to be between the opposite two vertices, on a segment of the circumference that is one-third of the total. The point is chosen uniformly, so the probability in this case is \(1/3\).
Pick a radius uniformly at random, then pick a midpoint for a chord uniformly at random along that radius. (A chord is uniquely specified by its midpoint.) Midpoints on the peripheral half of the radius will produce a chord shorter than the triangle side; those on the central half of the radius will produce a chord longer than the triangle side. Thus, the probability in this case is \(1/2\).
Pick a midpoint uniformly at random. Chords will be longer than the side of the triangle if their midpoint falls within a circle with half the radius of the original circle. The area of this circle is one-quarter the area of the original circle, so the probability in this case is \(1/4\).

The paradox only exists if you think there is a single true, numerical value for the probability, independent of the mechanism by which the chord is generated. In practice, we always need to impose a mathematical model of some form or another on the randomness we observe in order to quantify it in a well-defined way. We can choose models that seem to fit better than others, but can never claim our chosen model to be true, as that would contradict the ever-presence of epistemic uncertainty. The map is not the territory, and all that.

The choice of model, then, is necessarily subjective. This subjectivity is compounded by the ‘soft’ nature of probabilistic predictions, which (unless \(0\) or \(1\)) leave room for any outcome to occur. In practice, this makes it impossible to categorically rule out models that do a poor job of explaining reality, preventing us from narrowing the pool of candidates. A contrived example: consider the following three probabilistic predictions for a set of six events.

Model A	Model B	Model C
\(1/2\)	\(1/3\)	\(0\)
\(1/2\)	\(1/3\)	\(1\)
\(1/2\)	\(1/3\)	\(0\)
\(1/2\)	\(2/3\)	\(1\)
\(1/2\)	\(2/3\)	\(0\)
\(1/2\)	\(2/3\)	\(1\)

The observed outcomes were possible under—and so are compatible with—all three probability assignments. There are thus multiple models that could describe the phenomena in question^[9]. To distinguish between them we might consider additional criteria, such as how likely the model makes the observed outcomes, or how well the model generalises to new data. But we cannot definitively say which model is the best description of reality, without making further normative choices about how to define “best”. Though this point holds for any finite set of probabilities^[10], it is most commonly known as the ‘problem of the single case’. When an event only happens once, whose to say that it happened with probability \(1\) or probability \(1/1000\)? Improbable events happen all the time.

So what is a probability? A subjective, imperfect quantification of uncertainty, which is predominantly epistemic and a little bit ontic. But don’t take my word for it, I’m probably wrong.

Hájek, Alan, “Interpretations of Probability”, The Stanford Encyclopedia of Philosophy (Winter 2012 Edition), Edward N. Zalta (ed.), URL = <https://plato.stanford.edu/archives/win2012/entries/probability-interpret/>. ↩︎
Inspired by a similar diagram that appears in the final quarter of A Curious Incident of the Dog in the Nighttime by Mark Haddon. ↩︎
Sir Roger Penrose, a collaborator of Stephen Hawking, is the one I have in mind. A good summary of his quantum consciousness theory, and the reception to it, is given in this Nautilus article. ↩︎
See Max Tegmark’s paper The Importance of Quantum Decoherence in Brain Processes. ↩︎
I highly recommend Marcelo Gleiser’s book The Island of Knowledge: The Limits of Science and the Search for Meaning, from which this quote was taken. In particular, there is a very helpful Socratic dialogue on the fundamentals of quantum mechanics in Chapter 23. ↩︎
More spritually: particles are to humans as zero is to infinity. See Ouspensky, 1949. ↩︎
See Wikiquote. ↩︎
The Wikipedia article explains things further. ↩︎
Sometimes referred to as the Rashomon Effect, because the film entitled Rashomon presents the same event from different perspectives. ↩︎
One definition for the true probability of an event is the limit of the proportion of times the event occurs as the number of ‘trials’ approaches infinity. But all the events to which we assign probabilities only occur a finite number of times – or if they occur infinitely often, can only be observed finitely many times. (If we describe an event well enough, such the ‘Australian federal election, 1972’, it only occurs once.) So we can never be certain of the veracity of probabilistic models; there will always be multiple candidates that are compatible with reality. ↩︎

Thank you to Marcelo Gleiser, Harry Power, Nic Roumeliotis, Edmund Lau Tiew Hong & Yao-ban Chan for their correspondence and conversations on this topic.