**frequocentrist**. Yeah, I just made up that word. Let's pronounce it "freak-quo-centrist." It refers to using frequentist criteria and standards to evaluate Bayesian arguments.

To show that frequocentric arguments are lacking, I am going to do the reverse here. I am going to evaluate p-values with a Bayescentric simulation.

I created a set of 40,000 replicate experiments of 10 observations each. Half of these sets were from the null model; half were from an alternative model with a true effect size of .4. Let's suppose you picked one of these 40,000 and asked if it were from the null model or from the effect model. If you ignore the observations entirely, then you would rightly think it is a 50-50 proposition. The question is how much do you gain from looking at the data.
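A setup like this is easy to sketch. Here is a minimal version, assuming the observations are normal with known unit variance, so the observed effect size of a replicate is just its sample mean; the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)  # illustrative seed
n, reps = 10, 20_000            # 10 observations per replicate, 20,000 per model

# Half the replicates from the null model, half from the effect model (.4)
null_es = rng.normal(0.0, 1.0, (reps, n)).mean(axis=1)
alt_es = rng.normal(0.4, 1.0, (reps, n)).mean(axis=1)
```

Each array holds 20,000 observed effect sizes; pooling them and drawing one at random reproduces the 50-50 starting point described above.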

Figure 1A shows the histograms of observed effect sizes for each model. The top histogram (salmon) is for the effect model; the bottom, downward going histogram (blue) is for the null model. I drew it downward to reduce clutter.

The arrows highlight the bin between .5 and .6. Suppose we had observed an effect size there. According to the simulation, 2,221 of the 20,000 replicates under the alternative model are in this bin. And 599 of the 20,000 replicates under the null model are in this bin. If we had observed an effect size in this bin, then the proportion of times it comes from the null model is 599/(2,221+599) = .21. So, with this observed effect size, the probability goes from 50-50 to 20-80. Figure 1B shows the proportion of replicates from the null model, and the dark point is for the highlighted bin. As a rule, the proportion of replicates from the null decreases with effect size.
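The arithmetic behind the highlighted bin is just a base-rate update. Using the counts quoted above:

```python
# Replicates landing in the .5-.6 bin, from the text
null_count, alt_count = 599, 2221

# Proportion of bin members that came from the null model
prop_null = null_count / (null_count + alt_count)  # about .21
```

Because both models start with 20,000 replicates, the 50-50 prior cancels and the proportion in the bin is the posterior probability of the null, about .21, i.e., the 20-80 split in the text.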

We can see how well p-values match these probabilities. The dark red solid line shows the one-tailed p-values, and these are miscalibrated. They clearly overstate the evidence against the null and for an effect. Bayes factors, in contrast, get this problem exactly right---it is the problem they are designed to solve. The dashed lines show the probabilities derived from the Bayes factors, and they are spot on. Of course, we didn't need simulations to show this concordance. It follows directly from the law of conditional probability.
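The contrast can be checked analytically for a single point on the curve. The sketch below again assumes normal observations with known unit variance and a point alternative at .4; the observed effect size of .55 (the middle of the highlighted bin) is illustrative.

```python
import numpy as np
from scipy.stats import norm

n = 10
d_obs = 0.55               # an observed effect size in the highlighted bin
z = d_obs * np.sqrt(n)     # test statistic under known unit variance

# One-tailed p-value under the null
p = norm.sf(z)

# Bayes factor: predictive density of z under the point alternative
# (true effect .4) versus under the null
bf10 = norm.pdf(z, loc=0.4 * np.sqrt(n)) / norm.pdf(z)

# Posterior probability of the null at equal prior odds
post_null = 1 / (1 + bf10)
```

Here `post_null` comes out near .2, in line with the simulated proportion for that bin, while the one-tailed p-value is around .04, far smaller than the actual probability that the null generated the data.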

Some of you might find this demonstration unhelpful because it misses the point of what a p-value is and what it does. I get it. It's exactly how I feel about others' simulations of Bayes factors.

This blog post is based on my recent PBR paper, "Optional Stopping: No Problem for Bayesians." It shows that Bayes factors solve the problem they are designed to solve even in the presence of optional stopping.