Brief History of Probability

Here is a brief history of probability from a Bayesian perspective.  The word "probability" is derived from the word "probity".  Prior to the 1600's, legal evidence had greater weight when it had probity, which was a measure of authority.  A person with more authority had better evidence.  Today, probability may be loosely defined as the chance that an event occurs.

Probability theory began in 1654 with two French mathematicians, Pierre de Fermat (1601-1665) and Blaise Pascal (1623-1662), who took up a question about the profitability of betting in a popular dice game.  Their exchange of letters contained the first fundamental principles of probability.  The Dutch scientist Christian Huygens (1629-1695), a teacher of Leibniz (1646-1716), learned of this correspondence and published the first book on probability in 1657.

Thomas Bayes (1702-1761) introduced what is now known as Bayes' Theorem, which was published posthumously in 1764.  Pierre Simon Laplace (1749-1827) introduced a general version of the theorem, which came to be called inverse probability: inverse because it reasons from the results back to the parameter, from the effect back to the cause.  Bayesian inference was widely used and taught in the 1800's, until it was attacked by Ronald Fisher (1890-1962) and Jerzy Neyman (1894-1981).
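
The idea of inverse probability, reasoning from an observed effect back to its possible causes, can be illustrated with a small calculation.  The following is a minimal sketch in Python, not taken from Bayes or Laplace: the two coins, their biases, the equal prior weights, and the function name posterior are all hypothetical choices made for illustration.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Apply Bayes' theorem: P(cause | effect) for each candidate cause."""
    # Joint probability of each cause together with the observed effect.
    joint = {cause: prior[cause] * likelihood[cause] for cause in prior}
    # Total probability of the observed effect across all causes.
    evidence = sum(joint.values())
    return {cause: joint[cause] / evidence for cause in joint}


# Hypothetical setup: one of two coins was picked with equal probability.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}

# Observed effect: three heads in three flips.
# The fair coin lands heads with probability 1/2, the biased coin with 3/4
# (both values are assumptions for this example).
likelihood = {"fair": Fraction(1, 2) ** 3, "biased": Fraction(3, 4) ** 3}

post = posterior(prior, likelihood)
print(post["fair"], post["biased"])  # prints: 8/35 27/35
```

Starting from equal prior weights, the observed run of heads shifts belief toward the biased coin, which is the cause-from-effect direction that gave inverse probability its name.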

Debates arose over whether probability should be regarded as objective or subjective.  In the early 1920's, John Maynard Keynes (1883-1946) proposed the idea that probability should be interpreted as a subjective degree of belief in a proposition.  The earlier approach of Laplace came to be considered objectivist.  The subjectivists tackled problems with the frequentist definition of probability.

Harold Jeffreys (1891-1989) published his Theory of Probability in 1939, and is credited with beginning the revival of Bayesian inference.  During World War II, Alan Turing (1912-1954) invented a Bayesian codebreaking technique termed Banburismus, to assist in breaking messages enciphered by the German Enigma machine.  It was an early form of Bayesian network used to infer information about the settings of the Enigma machine.

Richard T. Cox (1898-1991) demonstrated in 1946 that the rules of Bayesian inference have a well-formulated axiomatic basis, unlike frequentist inference, and may be derived from a simple set of desiderata.  He showed that Bayesian inference is the only inferential approach that is logically consistent.  In the 1950's, Leonard Jimmie Savage (1917-1971) further popularized subjective probability.

Taking a step back to 1906, Andrey Markov (1856-1922) introduced what are now called Markov chains.  Stanislaw Ulam (1909-1984) and John von Neumann (1903-1957) developed the Monte Carlo method, which uses random numbers to solve numerical problems.  The first publication appeared in the Journal of the American Statistical Association in 1949, co-written by Nicholas Metropolis (1915-1999).  Metropolis and others introduced Markov chain Monte Carlo (MCMC) in the Journal of Chemical Physics in 1953.  It was later adopted by statisticians, appearing in Biometrika in 1970.  The name MCMC gained popularity around 1990.  MCMC is a class of algorithms for sampling from probability distributions.
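
To make the idea of sampling with a Markov chain concrete, here is a minimal sketch of a random-walk Metropolis sampler in Python.  It is an illustration in the spirit of the 1953 algorithm, not a reproduction of the original: the function name metropolis, the Gaussian proposal, the step size, the seed, and the standard normal target are all assumptions chosen for the example.

```python
import math
import random


def metropolis(log_density, x0, n_samples, step=1.0, seed=0):
    """Draw samples from a density known only up to a constant factor.

    log_density returns the log of the (unnormalized) target density.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        # Propose a move from a symmetric Gaussian random walk.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p_new / p); otherwise stay put.
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples


def log_std_normal(x):
    """Log density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


samples = metropolis(log_std_normal, x0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, variance)  # should be roughly 0 and 1 for this target
```

Each new state depends only on the current one, which makes the sequence a Markov chain, and the accept-or-reject step is what steers the chain so that its long-run distribution matches the target.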

Since the 1980's, Bayesian inference has gained popularity as increases in computer speed and memory have made MCMC practical to use.