Identifiability

It is generally possible to solve more complicated problems with Bayesian inference than with frequentist inference. One area where this matters is the identifiability of each parameter in a statistical model. Identifiability can be thought of as the ease with which a statistical method solves for each parameter. When a method struggles to solve for a parameter, the cause may be poor identifiability: the method has difficulty identifying a unique solution.

Suppose your goal is to model the relationship between public news events and the closing price of a particular stock. Also, suppose this particular statistical model is complicated. If the parameter for public news events is solved with frequentist inference, then the algorithm used will most likely search for the single parameter value with the highest likelihood, given the observed news events and closing prices. At each iteration, the algorithm evaluates the likelihood at a particular point in the range of the parameter, along with the slope of the likelihood at that point. If the slope is negative, the algorithm moves to the left in the next iteration, and if the slope is positive, it moves to the right. When the slope becomes essentially flat, the algorithm is considered to have arrived at a maximum of the likelihood.
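The search described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the method any particular software uses: the data are simulated, the model is a normal distribution with a single unknown mean `mu`, and the names `slope` and `gradient_ascent`, the step size, and the tolerance are all hypothetical choices.

```python
import numpy as np

def log_likelihood(mu, data, sigma=1.0):
    # Normal log-likelihood in mu, up to an additive constant
    return -0.5 * np.sum((data - mu) ** 2) / sigma**2

def slope(mu, data, sigma=1.0):
    # Derivative of the log-likelihood with respect to mu
    return np.sum(data - mu) / sigma**2

def gradient_ascent(data, start=0.0, step=0.01, tol=1e-8, max_iter=10_000):
    mu = start
    for iteration in range(max_iter):
        g = slope(mu, data)
        if abs(g) < tol:
            # Slope is essentially flat: declare convergence
            break
        # Positive slope moves mu to the right, negative slope to the left
        mu += step * g
    return mu, iteration

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=50)  # simulated data (assumption)
mu_hat, n_iter = gradient_ascent(data)
print(mu_hat, data.mean(), n_iter)
```

For this simple model the maximum likelihood estimate is the sample mean, so the search result can be checked against `data.mean()`.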

Problems occur when the frequentist algorithm explores an area of the likelihood that is essentially flat. In this case, the algorithm is likely to get stuck, being unable to mathematically decide on a direction to go, or worse, to declare convergence when it should not. It is also possible that the likelihood, across the range of possible parameter values, is a complicated function with several highs and lows. The frequentist goal is to arrive at the "maximum" likelihood, but the algorithm may find a "local" maximum rather than the "global" maximum.
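Both failure modes can be reproduced with a toy example. The sketch below assumes an artificial likelihood with two peaks (a lower one near -2 and a higher one near 2) and a hypothetical hill-climbing routine named `climb`; none of these specifics come from a real model.

```python
import numpy as np

def likelihood(x):
    # Artificial likelihood: lower peak near -2, higher peak near 2 (assumption)
    return (0.3 * np.exp(-0.5 * ((x + 2.0) / 0.5) ** 2)
            + 0.7 * np.exp(-0.5 * ((x - 2.0) / 0.5) ** 2))

def slope(x):
    # Analytic derivative of the likelihood above
    return (0.3 * np.exp(-0.5 * ((x + 2.0) / 0.5) ** 2) * (-(x + 2.0) / 0.25)
            + 0.7 * np.exp(-0.5 * ((x - 2.0) / 0.5) ** 2) * (-(x - 2.0) / 0.25))

def climb(start, step=0.1, tol=1e-8, max_iter=5_000):
    x = start
    for _ in range(max_iter):
        g = slope(x)
        if abs(g) < tol:
            # Flat slope is treated as convergence, whether or not it is a peak
            break
        x += step * g
    return x

# Same algorithm, three starting points
local_peak, global_peak, stuck = [climb(s) for s in (-3.0, 3.0, 10.0)]
print(local_peak, global_peak, stuck)
```

Starting at -3 the search settles on the lower peak, starting at 3 it finds the higher one, and starting at 10 it reports convergence immediately because the likelihood there is flat, even though no maximum is nearby.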

Bayesian inference, on the other hand, most commonly uses Markov chain Monte Carlo (MCMC) to explore values for a parameter. Instead of searching for a maximum value, MCMC explores what is called the target distribution, and reports a probability distribution. MCMC also uses an algorithm to explore the target distribution, and it can also get (temporarily) stuck in undesirable areas, but Bayesian inference additionally includes prior probabilities. When the prior probability distribution for a parameter is essentially flat, identifiability can still be a problem. However, the statistician can alter the prior probability for problematic target distributions, creating a slight slope in areas that would otherwise be flat. Altering the prior probabilities facilitates the exploration of the target distribution, but the MCMC algorithm still explores on its own. How it explores, possibly with a random walk, depends on the specific MCMC algorithm, of which there are many.
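As a rough illustration of how a prior adds slope where the likelihood has none, consider a deliberately non-identifiable toy model in which the data depend only on the sum `a + b` of two parameters, so the likelihood is flat along the direction `a - b`. Everything in the sketch is an assumption made for illustration: the simulated data, the normal priors with standard deviation 10, and the hand-written random-walk Metropolis sampler, which is only one of many MCMC algorithms.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(loc=3.0, scale=1.0, size=100)  # simulated data; true a + b = 3

def log_likelihood(theta):
    a, b = theta
    # Only a + b enters the likelihood, so it is flat along a - b
    return -0.5 * np.sum((y - (a + b)) ** 2)

def log_prior(theta, prior_sd=10.0):
    # Weak normal priors give the flat direction a slight slope
    return -0.5 * np.sum(theta ** 2) / prior_sd**2

def log_posterior(theta):
    return log_likelihood(theta) + log_prior(theta)

def random_walk_metropolis(log_target, start, n_iter=20_000, proposal_sd=0.3):
    theta = np.asarray(start, dtype=float)
    current = log_target(theta)
    samples = np.empty((n_iter, theta.size))
    accepted = 0
    for i in range(n_iter):
        proposal = theta + rng.normal(scale=proposal_sd, size=theta.size)
        candidate = log_target(proposal)
        # Accept with probability min(1, target ratio)
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
            accepted += 1
        samples[i] = theta
    return samples, accepted / n_iter

samples, acceptance_rate = random_walk_metropolis(log_posterior, start=[0.0, 0.0])
kept = samples[2_000:]  # discard early iterations as burn-in
total = kept[:, 0] + kept[:, 1]
difference = kept[:, 0] - kept[:, 1]
print(total.mean(), total.std(), difference.std(), acceptance_rate)
```

The sampled values of `a + b` concentrate near the sample mean of the data, while `a - b` stays widely spread: the data say nothing about it, and only the prior keeps it bounded.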

The Laplace Approximation is an alternative to MCMC that is gaining popularity. As a Bayesian numerical approximation method, it also incorporates prior probabilities and can successfully navigate complicated likelihoods. Handled correctly, identifiability is less problematic for Bayesian inference, and it is not a permanent barrier to more complicated models, as it can be in frequentist inference.

Frequentist inference may provide a single point estimate of the optimal parameter value for public news events as they relate to the closing price of the stock, assuming the frequentist algorithm does not get stuck or converge on the wrong maximum. Bayesian inference with MCMC instead provides a probability distribution over parameter values across the range of the converged target distribution. Alternatively, Bayesian inference with the Laplace Approximation estimates moments of the posterior distributions, and samples may be taken from these distributions. In either case, the statistician receives much more information about the parameter of interest with Bayesian inference than with frequentist inference.
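A minimal sketch of the Laplace Approximation idea follows, assuming a one-parameter problem chosen only because its answer can be checked by hand: 30 successes in 100 trials with a Beta(2, 2) prior on the success probability `p`. The Newton iteration, the starting value, and all variable names are illustrative assumptions rather than the procedure of any specific package.

```python
import numpy as np

successes, trials = 30, 100      # hypothetical data
prior_a, prior_b = 2.0, 2.0      # hypothetical Beta prior

# Exponents of the unnormalized posterior p**a * (1 - p)**b
a = prior_a + successes - 1.0
b = prior_b + (trials - successes) - 1.0

def log_posterior(p):
    return a * np.log(p) + b * np.log1p(-p)

def gradient(p):
    return a / p - b / (1.0 - p)

def curvature(p):
    return -a / p**2 - b / (1.0 - p) ** 2

# Newton's method to locate the posterior mode
p = 0.5
for _ in range(50):
    step = gradient(p) / curvature(p)
    p -= step
    if abs(step) < 1e-12:
        break

mode = p
variance = -1.0 / curvature(mode)  # inverse of the negative curvature at the mode

# Samples may then be drawn from the approximating normal distribution
rng = np.random.default_rng(2)
draws = rng.normal(loc=mode, scale=np.sqrt(variance), size=5_000)
print(mode, variance, draws.mean())
```

The approximation summarizes the posterior by its mode and the curvature at the mode, so it is cheap to compute but only as good as the normal shape it assumes.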

Neither Bayesian inference nor frequentist inference guarantees that the algorithm converges upon the optimal target distribution or point estimate, respectively, merely because convergence has been estimated. However, theory states that, under standard regularity conditions, MCMC is guaranteed to arrive at the target distribution with infinite iterations of the algorithm, while the Laplace Approximation and frequentist algorithms have no such theoretical guarantee. Although the Laplace Approximation does not offer this guarantee, it has other advantages over the frequentist algorithms, such as the inclusion of prior probabilities. Therefore, Bayesian inference is preferred to frequentist inference regarding model identifiability.