Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

Yu-Xiang Wang, Stephen Fienberg, Alex Smola
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:2493-2502, 2015.

Abstract

We consider the problem of Bayesian learning on sensitive datasets and present two simple but somewhat surprising results that connect Bayesian learning to “differential privacy”, a cryptographic approach to protecting individual-level privacy while permitting database-level utility. Specifically, we show that under standard assumptions, getting one sample from a posterior distribution is differentially private “for free”, and this sample as a statistical estimator is often consistent, near optimal, and computationally tractable. Similarly but separately, we show that a recent line of work that uses stochastic gradients for Hybrid Monte Carlo (HMC) sampling also preserves differential privacy with minor or no modifications of the algorithmic procedure. These observations lead to an “anytime” algorithm for Bayesian learning under privacy constraints. We demonstrate that it performs much better than the state-of-the-art differentially private methods on synthetic and real datasets.
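To make the first claim concrete, here is a minimal sketch (not from the paper) of releasing a single posterior sample for a Beta–Bernoulli model. The one assumption doing the work is a bounded log-likelihood: restricting θ to [δ, 1−δ] bounds |log p(x_i|θ)| by B = log(1/δ), which lets exact posterior sampling be read as an instance of the exponential mechanism, with a privacy parameter proportional to B. The function names, the prior, and the truncation constant are illustrative choices, not the paper's.

```python
import numpy as np
from scipy import stats

def one_posterior_sample(x, a0=1.0, b0=1.0, delta=0.05, rng=None):
    """Release one sample from the Beta posterior of a Bernoulli
    parameter theta, restricted to [delta, 1 - delta].

    The restriction bounds the per-record log-likelihood by
    B = log(1/delta); under such a bound, a single exact posterior
    sample is differentially private with a privacy parameter
    proportional to B -- no noise is added on top of the sampling.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    a = a0 + x.sum()                 # posterior Beta shape parameters
    b = b0 + len(x) - x.sum()
    # Inverse-CDF sampling from Beta(a, b) truncated to [delta, 1 - delta];
    # the truncation is what enforces the bounded-likelihood assumption.
    lo, hi = stats.beta.cdf([delta, 1.0 - delta], a, b)
    return stats.beta.ppf(rng.uniform(lo, hi), a, b)

# Example: one private estimate of a coin's bias from sensitive flips.
flips = np.random.default_rng(0).integers(0, 2, size=500)
print(one_posterior_sample(flips))
```

For the second claim, the sketch below shows one update of stochastic gradient Langevin dynamics (SGLD), the simplest member of the stochastic-gradient MCMC family the abstract refers to: the Gaussian noise the sampler injects for its own dynamics is what can be repurposed for privacy. The paper's guarantee additionally requires conditions (e.g., bounded or clipped gradients and suitable step sizes) that this sketch does not enforce.

```python
import numpy as np

def sgld_step(theta, minibatch, grad_log_prior, grad_log_lik, N, step, rng):
    """One SGLD update: theta' = theta + (step/2) * g + Normal(0, step),
    where g is the minibatch estimate of the gradient of the log posterior
    over all N records. All arguments are caller-supplied arrays/callables."""
    n = len(minibatch)
    g = grad_log_prior(theta) + (N / n) * sum(
        grad_log_lik(theta, xi) for xi in minibatch)
    noise = rng.normal(0.0, np.sqrt(step), size=np.shape(theta))
    return theta + 0.5 * step * g + noise
```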

Cite this Paper

BibTeX
@InProceedings{pmlr-v37-wangg15,
  title     = {Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo},
  author    = {Wang, Yu-Xiang and Fienberg, Stephen and Smola, Alex},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning},
  pages     = {2493--2502},
  year      = {2015},
  editor    = {Bach, Francis and Blei, David},
  volume    = {37},
  series    = {Proceedings of Machine Learning Research},
  address   = {Lille, France},
  month     = {07--09 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v37/wangg15.pdf},
  url       = {https://proceedings.mlr.press/v37/wangg15.html}
}
EndNote
%0 Conference Paper
%T Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo
%A Yu-Xiang Wang
%A Stephen Fienberg
%A Alex Smola
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-wangg15
%I PMLR
%P 2493--2502
%U https://proceedings.mlr.press/v37/wangg15.html
%V 37
RIS
TY - CPAPER
TI - Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo
AU - Yu-Xiang Wang
AU - Stephen Fienberg
AU - Alex Smola
BT - Proceedings of the 32nd International Conference on Machine Learning
DA - 2015/06/01
ED - Francis Bach
ED - David Blei
ID - pmlr-v37-wangg15
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 37
SP - 2493
EP - 2502
L1 - http://proceedings.mlr.press/v37/wangg15.pdf
UR - https://proceedings.mlr.press/v37/wangg15.html
ER -
APA
Wang, Y., Fienberg, S. & Smola, A. (2015). Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:2493-2502. Available from https://proceedings.mlr.press/v37/wangg15.html.
