Bayesian hypothesis testing and decision theory



I've been doing a lot of learning at the new job. Not because people here are teaching me stuff, but more because I'm in a good position to spend a significant portion of my day learning about stuff that will help me do my job. (Which is great, and fun, and further reinforces what I know about myself by now—I'm a great self-directed learner and a very poor externally-directed learner.)

One of the things I've learned is that when it comes to statistics, I'm a Bayesian. And all the crap I learned about things like hypothesis testing and maximum likelihood estimation in my stats classes now seems horribly clunky and old-fashioned to me.

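Let's take hypothesis testing as an example. In the classical/frequentist world, you pick an arbitrary "small enough" probability (typically 5%), find the sampling distribution of your statistic under your null hypothesis, and work out the probability of seeing a value at least as extreme as the one you observed (the p-value). If the p-value is below your threshold, you reject the null; otherwise you don't.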
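To make the classical recipe concrete, here is a minimal sketch in Python. It assumes the simplest possible setup, a two-sided z-test on Normal data with a known standard deviation; the function names and the default numbers are made up for illustration.

```python
import math

def z_test_p_value(data, mu0, sigma):
    """Two-sided p-value for H0: mean == mu0.

    Assumes the data are Normal with known standard deviation sigma
    (an illustrative simplification, not a general-purpose test).
    """
    n = len(data)
    mean = sum(data) / n
    z = (mean - mu0) / (sigma / math.sqrt(n))
    # P(|Z| >= |z|) for a standard Normal Z, via the complementary error function
    return math.erfc(abs(z) / math.sqrt(2))

def classical_decision(data, mu0=0.0, sigma=1.0, alpha=0.05):
    """Reject H0 when the p-value falls below the (arbitrary) threshold alpha."""
    p = z_test_p_value(data, mu0, sigma)
    return "reject" if p < alpha else "fail to reject"
```

Note that nothing in this code knows what the alternative hypothesis is, or what either kind of mistake would cost you.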

Here are some things that are wrong/bad with that approach: the 5% threshold is completely arbitrary, the sampling distribution under the alternative hypothesis is not taken into consideration (i.e. you only control the type I error rate), and you don't have any way to balance the cost of type I vs type II errors. (Never mind the fact that people ALWAYS just use t-tests and ignore the fact that their datapoints are not actually Normally distributed with a common mean and variance. That, at least, I can tell you how to fix.)

Compare this with the Bayesian decision theory version of hypothesis testing: you assign a cost to each of the two types of error, calculate the posterior probability of each hypothesis based on the observations (incorporating prior knowledge if you have any), work out the probability threshold that minimizes your expected cost, and accept or reject based on that. Doesn't that just make more sense?
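Here is a rough sketch of that recipe in Python, assuming the easiest case: two point hypotheses about the mean of Normal data with known standard deviation. The function names, the costs, and the hypotheses are all assumptions for illustration. The threshold comes from comparing expected costs: rejecting H0 costs cost_type1 * P(H0 | data) on average, accepting it costs cost_type2 * P(H1 | data), so you reject when P(H1 | data) > cost_type1 / (cost_type1 + cost_type2).

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log density of a Normal(mu, sigma^2) distribution at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

def posterior_h1(data, mu0, mu1, sigma, prior_h1=0.5):
    """Posterior probability of H1 (mean == mu1) versus H0 (mean == mu0).

    Assumes two simple hypotheses and a known sigma; prior_h1 is P(H1)
    before seeing any data.
    """
    log_l0 = sum(normal_logpdf(x, mu0, sigma) for x in data)
    log_l1 = sum(normal_logpdf(x, mu1, sigma) for x in data)
    # Posterior log-odds = prior log-odds + log likelihood ratio
    log_odds = math.log(prior_h1 / (1 - prior_h1)) + log_l1 - log_l0
    # Numerically stable logistic function
    if log_odds >= 0:
        return 1 / (1 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1 + e)

def bayes_decision(p_h1, cost_type1, cost_type2):
    """Pick the action with the lower expected cost.

    cost_type1: cost of rejecting H0 when H0 is true.
    cost_type2: cost of accepting H0 when H1 is true.
    """
    threshold = cost_type1 / (cost_type1 + cost_type2)
    action = "reject H0" if p_h1 > threshold else "accept H0"
    return action, threshold
```

With a posterior of 0.3 for H1, bayes_decision(0.3, 1, 9) rejects H0 (missing a real effect is nine times as costly, so the threshold drops to 0.1), while bayes_decision(0.3, 9, 1) accepts it (threshold 0.9). Same data, different costs, different decision, which is the whole point.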

I highly recommend the book Bayesian Computation with R. (Although it doesn't actually talk about decision theory!) It has an associated blog: LearnBayes.

Other things to look at: William H. Jefferys's Stats 295 class materials (especially these slides, which I'm still working my way through), and his blog for the class.
