Bayesian Dark Knowledge
| Authors | |
|---|---|
| Publication date | 2015 |
| Host editors | |
| Book title | 29th Annual Conference on Neural Information Processing Systems 2015 |
| Book subtitle | Montreal, Canada, 7-12 December 2015 |
| ISBN | |
| Series | Advances in Neural Information Processing Systems |
| Event | Neural Information Processing Systems (NIPS 2015) |
| Volume | |
| Issue number | 4 |
| Pages (from-to) | 3438-3446 |
| Publisher | Red Hook, NY: Curran Associates |
| Organisations | |
| Abstract |
We consider the problem of Bayesian parameter estimation for deep neural networks, which is important in problem settings where we may have little data, and/or where we need accurate posterior predictive densities p(y|x, D), e.g., for applications involving bandits or active learning. One simple approach to this is to use online Monte Carlo methods, such as SGLD (stochastic gradient Langevin dynamics). Unfortunately, such a method needs to store many copies of the parameters (which wastes memory), and needs to make predictions using many versions of the model (which wastes time). We describe a method for "distilling" a Monte Carlo approximation to the posterior predictive density into a more compact form, namely a single deep neural network. We compare to two very recent approaches to Bayesian neural networks, namely an approach based on expectation propagation [HLA15] and an approach based on variational Bayes [BCKW15]. Our method performs better than both of these, is much simpler to implement, and uses less computation at test time.
|
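The abstract's pipeline (SGLD sampling, Monte Carlo posterior predictive, distillation into one "student" model) can be sketched on a toy problem. This is a minimal illustration, not the paper's implementation: it assumes a small logistic-regression model, a Gaussian prior, and a linear student, all chosen here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-class data (assumed setup; not from the paper's experiments).
N, D = 200, 2
X = rng.normal(size=(N, D))
w_true = np.array([2.0, -1.0])
y = (rng.random(N) < 1 / (1 + np.exp(-X @ w_true))).astype(float)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# --- SGLD: minibatch gradient steps on the log posterior (N(0, I) prior)
# --- plus Gaussian noise with variance equal to the step size.
eta, B = 0.01, 32
w = np.zeros(D)
samples = []
for t in range(2000):
    idx = rng.integers(0, N, size=B)
    grad_lik = (N / B) * X[idx].T @ (y[idx] - sigmoid(X[idx] @ w))
    grad_post = grad_lik - w                     # add prior gradient
    w = w + 0.5 * eta * grad_post + np.sqrt(eta) * rng.normal(size=D)
    if t >= 1000:                                # discard burn-in
        samples.append(w.copy())

# Monte Carlo posterior predictive: average predictions over samples.
# Storing `samples` is exactly the memory cost the paper wants to avoid.
p_mc = np.mean([sigmoid(X @ s) for s in samples], axis=0)

# --- Distillation: fit a single student model to the MC predictive
# --- by gradient descent on the cross-entropy to the soft targets p_mc.
v = np.zeros(D)
for _ in range(2000):
    v += 0.1 * X.T @ (p_mc - sigmoid(X @ v)) / N

p_student = sigmoid(X @ v)
gap = np.abs(p_student - p_mc).mean()
print(gap)  # mean distillation gap; small if the student is expressive enough
```

At test time only `v` is kept, so predictions need one forward pass instead of one per posterior sample; in the paper the student is itself a deep network rather than the linear model used in this sketch.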
| Document type | Conference contribution |
| Language | English |
| Published at | http://papers.nips.cc/paper/5965-bayesian-dark-knowledge |
| Downloads | 5965-bayesian-dark-knowledge (Accepted author manuscript) |
| Permalink to this page | |