Model Fit

Model fit refers to how well a statistical model fits the data, though the direction of this definition depends on the type of statistical inference. Frequentist model fit refers to how well the data fit the model, while Bayesian model fit refers to how well the model fits the data. The distinction arises because frequentist inference treats the model estimates as fixed point estimates and the data as random, in the sense that frequentist probability describes long-run frequencies over hypothetical repeated samples of the data. In contrast, Bayesian inference treats the model estimates as random quantities with probability distributions, and treats the data as fixed. On this view, Bayesian inference is preferred as the logically coherent approach, while frequentist inference has potential problems due to its different interpretation of probability.

Model fit is assessed with a variety of statistics. Assume that you are fitting a model to a data set in order to predict whether or not each observation (here, each individual) has cancer. First, you select a statistical methodology. For this data set, appropriate choices include a logit model, a probit model, or a model with a complementary log-log link. To be thorough, one model of each type is built. Frequentist inference does not generally allow model fit statistics to be compared across different types of models. Bayesian inference does allow such comparisons, and the DIC (Deviance Information Criterion) is the most popular Bayesian model fit statistic.
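As an illustration of how the three candidate model types differ, the sketch below writes each as an inverse link function that maps a linear predictor to a probability of cancer. The function names are hypothetical, and the third link is taken here to be the complementary log-log link, which is an assumption about what the log-log link in the text denotes.

```python
import math


def inv_logit(eta):
    """Inverse logit link: logistic function of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Inverse probit link: standard normal CDF of the linear predictor."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_cloglog(eta):
    """Inverse complementary log-log link (assumed form of the log-log link)."""
    return 1.0 - math.exp(-math.exp(eta))


# Compare the predicted probabilities at a few values of the linear predictor.
for eta in (-2.0, 0.0, 2.0):
    print(eta, inv_logit(eta), inv_probit(eta), inv_cloglog(eta))
```

The logit and probit links are symmetric around a linear predictor of zero, where both give a probability of 0.5, while the complementary log-log link is asymmetric. This is one reason the three model types can fit the same data differently.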

Within each of these three types of models, both Bayesian inference and frequentist inference allow model fit statistics to be compared, for example, across different combinations of predictor variables. However, if any of these models is complicated, in the sense of using a hierarchical or multilevel structure, then frequentist model fit statistics become harder to interpret and may no longer be comparable. In contrast, Bayesian inference still allows valid comparisons within such complicated models, because the DIC includes a component that estimates the effective number of parameters, regardless of the actual number of parameters, and therefore accounts for model complexity better than frequentist model fit statistics.
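The role of the effective number of parameters can be sketched with one common definition of the DIC: the posterior mean deviance plus pD, where pD is the posterior mean deviance minus the deviance at the posterior mean of the parameters. The example below is a minimal sketch under stated assumptions: a Bernoulli model with a single probability parameter, made-up data, and a hypothetical list of posterior draws standing in for real sampler output. Other definitions of pD exist, and software may use a different one.

```python
import math


def deviance(y, p):
    """Deviance, -2 * log-likelihood, of Bernoulli data y at probability p."""
    loglik = sum(math.log(p) if yi == 1 else math.log(1.0 - p) for yi in y)
    return -2.0 * loglik


def dic(y, posterior_draws):
    """Return (DIC, pD) from posterior draws of the Bernoulli probability."""
    n = len(posterior_draws)
    # Posterior mean of the deviance.
    d_bar = sum(deviance(y, p) for p in posterior_draws) / n
    # Deviance evaluated at the posterior mean of the parameter.
    d_hat = deviance(y, sum(posterior_draws) / n)
    # Effective number of parameters.
    p_d = d_bar - d_hat
    return d_bar + p_d, p_d


# Illustrative data and hypothetical posterior draws (not from a real fit).
y = [1, 0, 1, 1, 0]
draws = [0.5, 0.6, 0.7]
dic_value, p_d = dic(y, draws)
print("DIC:", dic_value, "pD:", p_d)
```

When comparing candidate models fitted to the same data, the model with the smaller DIC is preferred. Because pD is estimated from the posterior rather than counted, the same calculation applies to hierarchical models where the nominal number of parameters overstates the model's complexity.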

Bayesian inference is therefore preferable to frequentist inference for assessing model fit: Bayesian model fit statistics such as the DIC allow a better assessment of fit within complicated models, and they allow multiple models built with different methodologies to be compared.