JMLR Workshop and Conference Proceedings

Volume 48: Proceedings of The 33rd International Conference on Machine Learning

Editors: Maria Florina Balcan, Kilian Q. Weinberger


Accepted Papers

No Oops, You Won’t Do It Again: Mechanisms for Self-correction in Crowdsourcing

Nihar Shah, Dengyong Zhou

Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues

Nihar Shah, Sivaraman Balakrishnan, Aditya Guntuboyina, Martin Wainwright

Uprooting and Rerooting Graphical Models

Adrian Weller

A Deep Learning Approach to Unsupervised Ensemble Learning

Uri Shaham, Xiuyuan Cheng, Omer Dror, Ariel Jaffe, Boaz Nadler, Joseph Chang, Yuval Kluger

Revisiting Semi-Supervised Learning with Graph Embeddings

Zhilin Yang, William Cohen, Ruslan Salakhutdinov

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization

Chelsea Finn, Sergey Levine, Pieter Abbeel

Diversity-Promoting Bayesian Learning of Latent Variable Models

Pengtao Xie, Jun Zhu, Eric Xing

Additive Approximations in High Dimensional Nonparametric Regression via the SALSA

Kirthevasan Kandasamy, Yaoliang Yu

Hawkes Processes with Stochastic Excitations

Young Lee, Kar Wai Lim, Cheng Soon Ong

Data-driven Rank Breaking for Efficient Rank Aggregation

Ashish Khetan, Sewoong Oh

Dropout distillation

Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder

Metadata-conscious anonymous messaging

Giulia Fanti, Peter Kairouz, Sewoong Oh, Kannan Ramchandran, Pramod Viswanath

The Teaching Dimension of Linear Learners

Ji Liu, Xiaojin Zhu, Hrag Ohannessian

Truthful Univariate Estimators

Ioannis Caragiannis, Ariel Procaccia, Nisarg Shah

Why Regularized Auto-Encoders learn Sparse Representation?

Devansh Arpit, Yingbo Zhou, Hung Ngo, Venu Govindaraju

k-variates++: more pluses in the k-means++

Richard Nock, Raphael Canyasse, Roksana Boreli, Frank Nielsen

Multi-Player Bandits – a Musical Chairs Approach

Jonathan Rosenski, Ohad Shamir, Liran Szlak

The Information Sieve

Greg Ver Steeg, Aram Galstyan

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, JingDong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Erich Elsen, Jesse Engel, Linxi Fan, Christopher Fougner, Awni Hannun, Billy Jun, Tony Han, Patrick LeGresley, Xiangang Li, Libby Lin, Sharan Narang, Andrew Ng, Sherjil Ozair, Ryan Prenger, Sheng Qian, Jonathan Raiman, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Chong Wang, Yi Wang, Zhiqian Wang, Bo Xiao, Yan Xie, Dani Yogatama, Jun Zhan, Zhenyao Zhu

On the Consistency of Feature Selection With Lasso for Non-linear Targets

Yue Zhang, Weihong Guo, Soumya Ray

Minimum Regret Search for Single- and Multi-Task Optimization

Jan Hendrik Metzen

CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy

Ran Gilad-Bachrach, Nathan Dowlin, Kim Laine, Kristin Lauter, Michael Naehrig, John Wernsing

The Variational Nystrom method for large-scale spectral problems

Max Vladymyrov, Miguel Carreira-Perpinan

Multi-Bias Non-linear Activation in Deep Neural Networks

Hongyang Li, Wanli Ouyang, Xiaogang Wang

Asymmetric Multi-task Learning Based on Task Relatedness and Loss

Giwoong Lee, Eunho Yang, Sung Ju Hwang

Accurate Robust and Efficient Error Estimation for Decision Trees

Lixin Fan

Fast Stochastic Algorithms for SVD and PCA: Convergence Properties and Convexity

Ohad Shamir

Convergence of Stochastic Gradient Descent for PCA

Ohad Shamir

Dealbreaker: A Nonlinear Latent Variable Model for Educational Data

Andrew Lan, Tom Goldstein, Richard Baraniuk, Christoph Studer

A Kernelized Stein Discrepancy for Goodness-of-fit Tests

Qiang Liu, Jason Lee, Michael Jordan

Variable Elimination in the Fourier Domain

Yexiang Xue, Stefano Ermon, Ronan Le Bras, Carla Gomes, Bart Selman

Low-Rank Matrix Approximation with Stability

Dongsheng Li, Chao Chen, Qin Lv, Junchi Yan, Li Shang, Stephen Chu

Linking losses for density ratio and class-probability estimation

Aditya Menon, Cheng Soon Ong

Stochastic Variance Reduction for Nonconvex Optimization

Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, Alex Smola

Hierarchical Variational Models

Rajesh Ranganath, Dustin Tran, David Blei

Hierarchical Span-Based Conditional Random Fields for Labeling and Segmenting Events in Wearable Sensor Data Streams

Roy Adams, Nazir Saleheen, Edison Thomaz, Abhinav Parate, Santosh Kumar, Benjamin Marlin

Binary embeddings with structured hashed projections

Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann LeCun

A Variational Analysis of Stochastic Gradient Algorithms

Stephan Mandt, Matthew Hoffman, David Blei

Adaptive Sampling for SGD by Exploiting Side Information

Siddharth Gopal

Learning from Multiway Data: Simple and Efficient Tensor Regression

Rose Yu, Yan Liu

A Distributed Variational Inference Framework for Unifying Parallel Sparse Gaussian Process Regression Models

Trong Nghia Hoang, Quang Minh Hoang, Bryan Kian Hsiang Low

Online Stochastic Linear Optimization under One-bit Feedback

Lijun Zhang, Tianbao Yang, Rong Jin, Yichi Xiao, Zhi-Hua Zhou

Adaptive Algorithms for Online Convex Optimization with Long-term Constraints

Rodolphe Jenatton, Jim Huang, Cedric Archambeau

Actively Learning Hemimetrics with Applications to Eliciting User Preferences

Adish Singla, Sebastian Tschiatschek, Andreas Krause

Learning Simple Algorithms from Examples

Wojciech Zaremba, Tomas Mikolov, Armand Joulin, Rob Fergus

Learning Physical Intuition of Block Towers by Example

Adam Lerer, Sam Gross, Rob Fergus

Structure Learning of Partitioned Markov Networks

Song Liu, Taiji Suzuki, Masashi Sugiyama, Kenji Fukumizu

Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient

Tianbao Yang, Lijun Zhang, Rong Jin, Jinfeng Yi

Beyond CCA: Moment Matching for Multi-View Models

Anastasia Podosinnikova, Francis Bach, Simon Lacoste-Julien

Fast methods for estimating the Numerical rank of large matrices

Shashanka Ubaru, Yousef Saad

Unsupervised Deep Embedding for Clustering Analysis

Junyuan Xie, Ross Girshick, Ali Farhadi

Efficient Private Empirical Risk Minimization for High-dimensional Learning

Shiva Prasad Kasiviswanathan, Hongxia Jin

Parameter Estimation for Generalized Thurstone Choice Models

Milan Vojnovic, Seyoung Yun

Large-Margin Softmax Loss for Convolutional Neural Networks

Weiyang Liu, Yandong Wen, Zhiding Yu, Meng Yang

A Random Matrix Approach to Echo-State Neural Networks

Romain Couillet, Gilles Wainrib, Hafiz Tiomoko Ali, Harry Sevi

Supervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings

Rie Johnson, Tong Zhang

Optimality of Belief Propagation for Crowdsourced Classification

Jungseul Ok, Sewoong Oh, Jinwoo Shin, Yung Yi

Stability of Controllers for Gaussian Process Forward Models

Julia Vinogradska, Bastian Bischoff, Duy Nguyen-Tuong, Anne Romer, Henner Schmidt, Jan Peters

Learning privately from multiparty data

Jihun Hamm, Yingjun Cao, Mikhail Belkin

Network Morphism

Tao Wei, Changhu Wang, Yong Rui, Chang Wen Chen

A Kronecker-factored approximate Fisher matrix for convolution layers

Roger Grosse, James Martens

Experimental Design on a Budget for Sparse Linear Models and Applications

Sathya Narayanan Ravi, Vamsi Ithapu, Sterling Johnson, Vikas Singh

Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs

Anton Osokin, Jean-Baptiste Alayrac, Isabella Lukasewitz, Puneet Dokania, Simon Lacoste-Julien

Exact Exponent in Optimal Rates for Crowdsourcing

Chao Gao, Yu Lu, Dengyong Zhou

Augmenting Supervised Neural Networks with Unsupervised Objectives for Large-scale Image Classification

Yuting Zhang, Kibok Lee, Honglak Lee

Online Low-Rank Subspace Clustering by Basis Dictionary Pursuit

Jie Shen, Ping Li, Huan Xu

A Self-Correcting Variable-Metric Algorithm for Stochastic Optimization

Frank Curtis

Stochastic Quasi-Newton Langevin Monte Carlo

Umut Simsekli, Roland Badeau, Taylan Cemgil, Gaël Richard

Doubly Robust Off-policy Value Evaluation for Reinforcement Learning

Nan Jiang, Lihong Li

Fast Rate Analysis of Some Stochastic Optimization Algorithms

Chao Qu, Huan Xu, Chong Jin Ong

Fast k-Nearest Neighbour Search via Dynamic Continuous Indexing

Ke Li, Jitendra Malik

Smooth Imitation Learning for Online Sequence Prediction

Hoang Le, Andrew Kang, Yisong Yue, Peter Carr

Community Recovery in Graphs with Locality

Yuxin Chen, Govinda Kamath, Changho Suh, David Tse

Variance Reduction for Faster Non-Convex Optimization

Zeyuan Allen-Zhu, Elad Hazan

Loss factorization, weakly supervised learning and label noise robustness

Giorgio Patrini, Frank Nielsen, Richard Nock, Marcello Carioni

Analysis of Deep Neural Networks with Extended Data Jacobian Matrix

Shengjie Wang, Abdel-rahman Mohamed, Rich Caruana, Jeff Bilmes, Matthai Philipose, Matthew Richardson, Krzysztof Geras, Gregor Urban, Ozlem Aslan

Doubly Decomposing Nonparametric Tensor Regression

Masaaki Imaizumi, Kohei Hayashi

Hyperparameter optimization with approximate gradient

Fabian Pedregosa

SDCA without Duality, Regularization, and Individual Convexity

Shai Shalev-Shwartz

Heteroscedastic Sequences: Beyond Gaussianity

Oren Anava, Shie Mannor

A Neural Autoregressive Approach to Collaborative Filtering

Yin Zheng, Bangsheng Tang, Wenkui Ding, Hanning Zhou

On the Quality of the Initial Basin in Overspecified Neural Networks

Itay Safran, Ohad Shamir

Primal-Dual Rates and Certificates

Celestine Dünner, Simone Forte, Martin Takac, Martin Jaggi

Minimizing the Maximal Loss: How and Why

Shai Shalev-Shwartz, Yonatan Wexler

The Information-Theoretic Requirements of Subspace Clustering with Missing Data

Daniel Pimentel-Alarcon, Robert Nowak

Online Learning with Feedback Graphs Without the Graphs

Alon Cohen, Tamir Hazan, Tomer Koren

PAC learning of Probabilistic Automaton based on the Method of Moments

Hadrien Glaude, Olivier Pietquin

Estimating Structured Vector Autoregressive Models

Igor Melnyk, Arindam Banerjee

Mixing Rates for the Alternating Gibbs Sampler over Restricted Boltzmann Machines and Friends

Christopher Tosh

Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms

Mathieu Blondel, Masakazu Ishihata, Akinori Fujino, Naonori Ueda

A New PAC-Bayesian Perspective on Domain Adaptation

Pascal Germain, Amaury Habrard, François Laviolette, Emilie Morvant

Correlation Clustering and Biclustering with Locally Bounded Errors

Gregory Puleo, Olgica Milenkovic

PAC Lower Bounds and Efficient Algorithms for The Max \(K\)-Armed Bandit Problem

Yahel David, Nahum Shimkin

A Comparative Analysis and Study of Multiview CNN Models for Joint Object Categorization and Pose Estimation

Mohamed Elhoseiny, Tarek El-Gaaly, Amr Bakry, Ahmed Elgammal

BASC: Applying Bayesian Optimization to the Search for Global Minima on Potential Energy Surfaces

Shane Carr, Roman Garnett, Cynthia Lo

On the Iteration Complexity of Oblivious First-Order Optimization Algorithms

Yossi Arjevani, Ohad Shamir

Stochastic Variance Reduced Optimization for Nonconvex Sparse Learning

Xingguo Li, Tuo Zhao, Raman Arora, Han Liu, Jarvis Haupt

Analysis of Variational Bayesian Factorizations for Sparse and Low-Rank Estimation

David Wipf

Fast k-means with accurate bounds

James Newling, Francois Fleuret

Boolean Matrix Factorization and Noisy Completion via Message Passing

Siamak Ravanbakhsh, Barnabas Poczos, Russell Greiner

Convolutional Rectifier Networks as Generalized Tensor Decompositions

Nadav Cohen, Amnon Shashua

Low-rank Solutions of Linear Matrix Equations via Procrustes Flow

Stephen Tu, Ross Boczar, Max Simchowitz, Mahdi Soltanolkotabi, Ben Recht

Anytime Exploration for Multi-armed Bandits using Confidence Information

Kwang-Sung Jun, Robert Nowak

Structured Prediction Energy Networks

David Belanger, Andrew McCallum

L1-regularized Neural Networks are Improperly Learnable in Polynomial Time

Yuchen Zhang, Jason D. Lee, Michael I. Jordan

Compressive Spectral Clustering

Nicolas Tremblay, Gilles Puy, Remi Gribonval, Pierre Vandergheynst

Low-rank tensor completion: a Riemannian manifold preconditioning approach

Hiroyuki Kasai, Bamdev Mishra

Provable Non-convex Phase Retrieval with Outliers: Median Truncated Wirtinger Flow

Huishuai Zhang, Yuejie Chi, Yingbin Liang

Estimating Maximum Expected Value through Gaussian Approximation

Carlo D’Eramo, Marcello Restelli, Alessandro Nuara

Representational Similarity Learning with Application to Brain Networks

Urvashi Oswal, Christopher Cox, Matthew Lambon-Ralph, Timothy Rogers, Robert Nowak

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

Yarin Gal, Zoubin Ghahramani

Generative Adversarial Text to Image Synthesis

Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee

Dirichlet Process Mixture Model for Correcting Technical Variation in Single-Cell Gene Expression Data

Sandhya Prabhakaran, Elham Azizi, Ambrose Carr, Dana Pe’er

Improved SVRG for Non-Strongly-Convex or Sum-of-Non-Convex Objectives

Zeyuan Allen-Zhu, Yang Yuan

Sparse Parameter Recovery from Aggregated Data

Avradeep Bhowmik, Joydeep Ghosh, Oluwasanmi Koyejo

Deep Structured Energy Based Models for Anomaly Detection

Shuangfei Zhai, Yu Cheng, Weining Lu, Zhongfei Zhang

Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling

Zeyuan Allen-Zhu, Zheng Qu, Peter Richtarik, Yang Yuan

Unitary Evolution Recurrent Neural Networks

Martin Arjovsky, Amar Shah, Yoshua Bengio

Markov Latent Feature Models

Aonan Zhang, John Paisley

The Knowledge Gradient for Sequential Decision Making with Stochastic Binary Feedbacks

Yingfei Wang, Chu Wang, Warren Powell

A Simple and Provable Algorithm for Sparse Diagonal CCA

Megasthenis Asteris, Anastasios Kyrillidis, Oluwasanmi Koyejo, Russell Poldrack

Quadratic Optimization with Orthogonality Constraints: Explicit Lojasiewicz Exponent and Linear Convergence of Line-Search Methods

Huikang Liu, Weijie Wu, Anthony Man-Cho So

Normalization Propagation: A Parametric Technique for Removing Internal Covariate Shift in Deep Networks

Devansh Arpit, Yingbo Zhou, Bhargava Kota, Venu Govindaraju

Learning to Generate with Memory

Chongxuan Li, Jun Zhu, Bo Zhang

Learning End-to-end Video Classification with Rank-Pooling

Basura Fernando, Stephen Gould

Learning to Filter with Predictive State Inference Machines

Wen Sun, Arun Venkatraman, Byron Boots, J. Andrew Bagnell

A Subspace Learning Approach for High Dimensional Matrix Decomposition with Efficient Column/Row Sampling

Mostafa Rahmani, George Atia

DCM Bandits: Learning to Rank with Multiple Clicks

Sumeet Katariya, Branislav Kveton, Csaba Szepesvari, Zheng Wen

Train faster, generalize better: Stability of stochastic gradient descent

Moritz Hardt, Ben Recht, Yoram Singer

Copeland Dueling Bandit Problem: Regret Lower Bound, Optimal Algorithm, and Computationally Efficient Algorithm

Junpei Komiyama, Junya Honda, Hiroshi Nakagawa

Contextual Combinatorial Cascading Bandits

Shuai Li, Baoxiang Wang, Shengyu Zhang, Wei Chen

Conservative Bandits

Yifan Wu, Roshan Shariff, Tor Lattimore, Csaba Szepesvari

Variance-Reduced and Projection-Free Stochastic Optimization

Elad Hazan, Haipeng Luo

Factored Temporal Sigmoid Belief Networks for Sequence Learning

Jiaming Song, Zhe Gan, Lawrence Carin

False Discovery Rate Control and Statistical Quality Assessment of Annotators in Crowdsourced Ranking

QianQian Xu, Jiechao Xiong, Xiaochun Cao, Yuan Yao

Strongly-Typed Recurrent Neural Networks

David Balduzzi, Muhammad Ghifary

Distributed Clustering of Linear Bandits in Peer to Peer Networks

Nathan Korda, Balazs Szorenyi, Shuai Li

Collapsed Variational Inference for Sum-Product Networks

Han Zhao, Tameem Adel, Geoff Gordon, Brandon Amos

On the Analysis of Complex Backup Strategies in Monte Carlo Tree Search

Piyush Khandelwal, Elad Liebman, Scott Niekum, Peter Stone

Benchmarking Deep Reinforcement Learning for Continuous Control

Yan Duan, Xi Chen, Rein Houthooft, John Schulman, Pieter Abbeel

\(K\)-Means Clustering with Distributed Dimensions

Hu Ding, Yu Liu, Lingxiao Huang, Jian Li

Texture Networks: Feed-forward Synthesis of Textures and Stylized Images

Dmitry Ulyanov, Vadim Lebedev, Andrea Vedaldi, Victor Lempitsky

Fast Constrained Submodular Maximization: Personalized Data Summarization

Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, Amin Karbasi

On the Statistical Limits of Convex Relaxations

Zhaoran Wang, Quanquan Gu, Han Liu

Ask Me Anything: Dynamic Memory Networks for Natural Language Processing

Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, Richard Socher

Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions

Igor Colin, Aurelien Bellet, Joseph Salmon, Stéphan Clémençon

Solving Ridge Regression using Sketched Preconditioned SVRG

Alon Gonen, Francesco Orabona, Shai Shalev-Shwartz

Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control

Prashanth L.A., Cheng Jie, Michael Fu, Steve Marcus, Csaba Szepesvari

Estimating Accuracy from Unlabeled Data: A Bayesian Approach

Emmanouil Antonios Platanios, Avinava Dubey, Tom Mitchell

Non-negative Matrix Factorization under Heavy Noise

Chiranjib Bhattacharya, Navin Goyal, Ravindran Kannan, Jagdeep Pani

Extreme F-measure Maximization using Sparse Probability Estimates

Kalina Jasinska, Krzysztof Dembczynski, Robert Busa-Fekete, Karlson Pfannschmidt, Timo Klerx, Eyke Hullermeier

Auxiliary Deep Generative Models

Lars Maaløe, Casper Kaae Sønderby, Søren Kaae Sønderby, Ole Winther

Importance Sampling Tree for Large-scale Empirical Expectation

Olivier Canevet, Cijo Jose, Francois Fleuret

Starting Small - Learning with Adaptive Sample Sizes

Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann

Deep Gaussian Processes for Regression using Approximate Expectation Propagation

Thang Bui, Daniel Hernandez-Lobato, Jose Miguel Hernandez-Lobato, Yingzhen Li, Richard Turner

DR-ABC: Approximate Bayesian Computation with Kernel-Based Distribution Regression

Jovana Mitrovic, Dino Sejdinovic, Yee-Whye Teh

Predictive Entropy Search for Multi-objective Bayesian Optimization

Daniel Hernandez-Lobato, Jose Miguel Hernandez-Lobato, Amar Shah, Ryan Adams

Rich Component Analysis

Rong Ge, James Zou

Black-Box Alpha Divergence Minimization

Jose Miguel Hernandez-Lobato, Yingzhen Li, Mark Rowland, Thang Bui, Daniel Hernandez-Lobato, Richard Turner

One-Shot Generalization in Deep Generative Models

Danilo Rezende, Shakir Mohamed, Ivo Danihelka, Karol Gregor, Daan Wierstra

Optimal Classification with Multivariate Losses

Nagarajan Natarajan, Oluwasanmi Koyejo, Pradeep Ravikumar, Inderjit Dhillon

A ranking approach to global optimization

Cedric Malherbe, Emile Contal, Nicolas Vayatis

Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms

Yu-Xiang Wang, Veeranjaneyulu Sadhanala, Wei Dai, Willie Neiswanger, Suvrit Sra, Eric Xing

Autoencoding beyond pixels using a learned similarity metric

Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther

Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling

Christopher De Sa, Chris Re, Kunle Olukotun

Simultaneous Safe Screening of Features and Samples in Doubly Sparse Modeling

Atsushi Shibagaki, Masayuki Karasuyama, Kohei Hatano, Ichiro Takeuchi

Anytime optimal algorithms in stochastic multi-armed bandits

Rémy Degenne, Vianney Perchet

Bounded Off-Policy Evaluation with Missing Data for Course Recommendation and Curriculum Design

William Hoiles, Mihaela van der Schaar

On collapsed representation of hierarchical Completely Random Measures

Gaurav Pandey, Ambedkar Dukkipati

From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification

Andre Martins, Ramon Astudillo

Black-box Optimization with a Politician

Sebastien Bubeck, Yin Tat Lee

Gaussian process nonparametric tensor estimator and its minimax optimality

Heishiro Kanagawa, Taiji Suzuki, Hayato Kobayashi, Nobuyuki Shimizu, Yukihiro Tagami

No-Regret Algorithms for Heavy-Tailed Linear Bandits

Andres Munoz Medina, Scott Yang

Extended and Unscented Kitchen Sinks

Edwin Bonilla, Daniel Steinberg, Alistair Reid

Matrix Eigen-decomposition via Doubly Stochastic Riemannian Optimization

Zhiqiang Xu, Peilin Zhao, Jianneng Cao, Xiaoli Li

Recommendations as Treatments: Debiasing Learning and Evaluation

Tobias Schnabel, Adith Swaminathan, Ashudeep Singh, Navin Chandak, Thorsten Joachims

ForecastICU: A Prognostic Decision Support System for Timely Prediction of Intensive Care Unit Admission

Jinsung Yoon, Ahmed Alaa, Scott Hu, Mihaela van der Schaar

An optimal algorithm for the Thresholding Bandit Problem

Andrea Locatelli, Maurilio Gutzeit, Alexandra Carpentier

Fast Parameter Inference in Nonlinear Dynamical Systems using Iterative Gradient Matching

Mu Niu, Simon Rogers, Maurizio Filippone, Dirk Husmeier

Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors

Christos Louizos, Max Welling

Learning Granger Causality for Hawkes Processes

Hongteng Xu, Mehrdad Farajtabar, Hongyuan Zha

Neural Variational Inference for Text Processing

Yishu Miao, Lei Yu, Phil Blunsom

Dictionary Learning for Massive Matrix Factorization

Arthur Mensch, Julien Mairal, Bertrand Thirion, Gael Varoquaux

Pixel Recurrent Neural Networks

Aaron Van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu

Why Most Decisions Are Easy in Tetris—And Perhaps in Other Sequential Decision Problems, As Well

Ozgur Simsek, Simon Algorta, Amit Kothiyal

Gaussian quadrature for matrix inverse forms with applications

Chengtao Li, Suvrit Sra, Stefanie Jegelka

Train and Test Tightness of LP Relaxations in Structured Prediction

Ofer Meshi, Mehrdad Mahdavi, Adrian Weller, David Sontag

Stochastic Optimization for Multiview Representation Learning using Partial Least Squares

Raman Arora, Poorya Mianjy, Teodor Marinov

Hierarchical Compound Poisson Factorization

Mehmet Basbug, Barbara Engelhardt

Opponent Modeling in Deep Reinforcement Learning

He He, Jordan Boyd-Graber, Kevin Kwok, Hal Daumé III

No penalty no tears: Least squares in high-dimensional linear models

Xiangyu Wang, David Dunson, Chenlei Leng

SDNA: Stochastic Dual Newton Ascent for Empirical Risk Minimization

Zheng Qu, Peter Richtarik, Martin Takac, Olivier Fercoq

On Graduated Optimization for Stochastic Non-Convex Problems

Elad Hazan, Kfir Yehuda Levy, Shai Shalev-Shwartz

Meta-Learning with Memory-Augmented Neural Networks

Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap

The knockoff filter for FDR control in group-sparse and multitask regression

Ran Dai, Rina Barber

Softened Approximate Policy Iteration for Markov Games

Julien Pérolat, Bilal Piot, Matthieu Geist, Bruno Scherrer, Olivier Pietquin

Stochastic Block BFGS: Squeezing More Curvature out of Data

Robert Gower, Donald Goldfarb, Peter Richtarik

Differential Geometric Regularization for Supervised Learning of Classifiers

Qinxun Bai, Steven Rosenberg, Zheng Wu, Stan Sclaroff

Exploiting Cyclic Symmetry in Convolutional Neural Networks

Sander Dieleman, Jeffrey De Fauw, Koray Kavukcuoglu

Graying the black box: Understanding DQNs

Tom Zahavy, Nir Ben-Zrihem, Shie Mannor

The Sum-Product Theorem: A Foundation for Learning Tractable Models

Abram Friesen, Pedro Domingos

Pareto Frontier Learning with Expensive Correlated Objectives

Amar Shah, Zoubin Ghahramani

Asynchronous Methods for Deep Reinforcement Learning

Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu

A Simple and Strongly-Local Flow-Based Method for Cut Improvement

Nate Veldt, David Gleich, Michael Mahoney

Nonlinear Statistical Learning with Truncated Gaussian Graphical Models

Qinliang Su, Xuejun Liao, Changyou Chen, Lawrence Carin

Barron and Cover’s Theory in Supervised Learning and its Application to Lasso

Masanori Kawakita, Jun’ichi Takeuchi

Nonparametric Canonical Correlation Analysis

Tomer Michaeli, Weiran Wang, Karen Livescu

BISTRO: An Efficient Relaxation-Based Method for Contextual Bandits

Alexander Rakhlin, Karthik Sridharan

Associative Long Short-Term Memory

Ivo Danihelka, Greg Wayne, Benigno Uria, Nal Kalchbrenner, Alex Graves

Dueling Network Architectures for Deep Reinforcement Learning

Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas

Persistence weighted Gaussian kernel for topological data analysis

Genki Kusano, Yasuaki Hiraoka, Kenji Fukumizu

Learning Convolutional Neural Networks for Graphs

Mathias Niepert, Mohamed Ahmed, Konstantin Kutzkov

Persistent RNNs: Stashing Recurrent Weights On-Chip

Greg Diamos, Shubho Sengupta, Bryan Catanzaro, Mike Chrzanowski, Adam Coates, Erich Elsen, Jesse Engel, Awni Hannun, Sanjeev Satheesh

Recurrent Orthogonal Networks and Long-Memory Tasks

Mikael Henaff, Arthur Szlam, Yann LeCun

The Arrow of Time in Multivariate Time Series

Stefan Bauer, Bernhard Schölkopf, Jonas Peters

Mixture Proportion Estimation via Kernel Embeddings of Distributions

Harish Ramaswamy, Clayton Scott, Ambuj Tewari

Fast DPP Sampling for Nystrom with Application to Kernel Methods

Chengtao Li, Stefanie Jegelka, Suvrit Sra

Complex Embeddings for Simple Link Prediction

Théo Trouillon, Johannes Welbl, Sebastian Riedel, Eric Gaussier, Guillaume Bouchard

Interactive Bayesian Hierarchical Clustering

Sharad Vikram, Sanjoy Dasgupta

A Convolutional Attention Network for Extreme Summarization of Source Code

Miltiadis Allamanis, Hao Peng, Charles Sutton

How to Fake Multiply by a Gaussian Matrix

Michael Kapralov, Vamsi Potluru, David Woodruff

Differentially Private Chi-Squared Hypothesis Testing: Goodness of Fit and Independence Testing

Marco Gaboardi, Hyun Lim, Ryan Rogers, Salil Vadhan

Pliable Rejection Sampling

Akram Erraqabi, Michal Valko, Alexandra Carpentier, Odalric Maillard

Differentially Private Policy Evaluation

Borja Balle, Maziar Gomrokchi, Doina Precup

Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning

Philip Thomas, Emma Brunskill

Discrete Deep Feature Extraction: A Theory and New Architectures

Thomas Wiatowski, Michael Tschannen, Aleksandar Stanic, Philipp Grohs, Helmut Boelcskei

Efficient Algorithms for Adversarial Contextual Learning

Vasilis Syrgkanis, Akshay Krishnamurthy, Robert Schapire

Training Deep Neural Networks via Direct Loss Minimization

Yang Song, Alexander Schwing, Richard Zemel, Raquel Urtasun

Sequence to Sequence Training of CTC-RNNs with Partial Windowing

Kyuyeon Hwang, Wonyong Sung

Variational Inference for Monte Carlo Objectives

Andriy Mnih, Danilo Rezende

Hierarchical Decision Making In Electricity Grid Management

Gal Dalal, Elad Gilboa, Shie Mannor

Learning Sparse Combinatorial Representations via Two-stage Submodular Maximization

Eric Balkanski, Baharan Mirzasoleiman, Andreas Krause, Yaron Singer

Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units

Wenling Shang, Kihyuk Sohn, Diogo Almeida, Honglak Lee

Isotonic Hawkes Processes

Yichen Wang, Bo Xie, Nan Du, Le Song

Cross-Graph Learning of Multi-Relational Associations

Hanxiao Liu, Yiming Yang

Markov-modulated Marked Poisson Processes for Check-in Data

Jiangwei Pan, Vinayak Rao, Pankaj Agarwal, Alan Gelfand

Beyond Parity Constraints: Fourier Analysis of Hash Functions for Inference

Tudor Achim, Ashish Sabharwal, Stefano Ermon

On the Power and Limits of Distance-Based Learning

Periklis Papakonstantinou, Jia Xu, Guang Yang

A Convex Atomic-Norm Approach to Multiple Sequence Alignment and Motif Discovery

Ian En-Hsu Yen, Xin Lin, Jiong Zhang, Pradeep Ravikumar, Inderjit Dhillon

Generalized Direct Change Estimation in Ising Model Structure

Farideh Fazayeli, Arindam Banerjee

Robust Principal Component Analysis with Side Information

Kai-Yang Chiang, Cho-Jui Hsieh, Inderjit Dhillon

Towards Faster Rates and Oracle Property for Low-Rank Matrix Estimation

Huan Gui, Jiawei Han, Quanquan Gu

Early and Reliable Event Detection Using Proximity Space Representation

Maxime Sangnier, Jerome Gauthier, Alain Rakotomamonjy

Stratified Sampling Meets Machine Learning

Edo Liberty, Kevin Lang, Konstantin Shmakov

Efficient Multi-Instance Learning for Activity Recognition from Time Series Data Using an Auto-Regressive Hidden Markov Model

Xinze Guan, Raviv Raich, Weng-Keen Wong

Generalization Properties and Implicit Regularization for Multiple Passes SGM

Junhong Lin, Raffaello Camoriano, Lorenzo Rosasco

Principal Component Projection Without Principal Component Analysis

Roy Frostig, Cameron Musco, Christopher Musco, Aaron Sidford

Recovery guarantee of weighted low-rank approximation via alternating minimization

Yuanzhi Li, Yingyu Liang, Andrej Risteski

Deconstructing the Ladder Network Architecture

Mohammad Pezeshki, Linxi Fan, Philemon Brakel, Aaron Courville, Yoshua Bengio

Generalization and Exploration via Randomized Value Functions

Ian Osband, Benjamin Van Roy, Zheng Wen

Evasion and Hardening of Tree Ensemble Classifiers

Alex Kantchelian, J. D. Tygar, Anthony Joseph

Dynamic Memory Networks for Visual and Textual Question Answering

Caiming Xiong, Stephen Merity, Richard Socher

Estimating Cosmological Parameters from the Dark Matter Distribution

Siamak Ravanbakhsh, Junier Oliva, Sebastian Fromenteau, Layne Price, Shirley Ho, Jeff Schneider, Barnabas Poczos

Learning Population-Level Diffusions with Generative RNNs

Tatsunori Hashimoto, David Gifford, Tommi Jaakkola

Expressiveness of Rectifier Networks

Xingyuan Pan, Vivek Srikumar

Discrete Distribution Estimation under Local Privacy

Peter Kairouz, Keith Bonawitz, Daniel Ramage

Square Root Graphical Models: Multivariate Generalizations of Univariate Exponential Families that Permit Positive Dependencies

David Inouye, Pradeep Ravikumar, Inderjit Dhillon

A Box-Constrained Approach for Hard Permutation Problems

Cong Han Lim, Steve Wright

Geometric Mean Metric Learning

Pourya Zadeh, Reshad Hosseini, Suvrit Sra

Sparse Nonlinear Regression: Parameter Estimation under Nonconvexity

Zhuoran Yang, Zhaoran Wang, Han Liu, Yonina Eldar, Tong Zhang

Conditional Bernoulli Mixtures for Multi-label Classification

Cheng Li, Bingyu Wang, Virgil Pavlu, Javed Aslam

Scalable Discrete Sampling as a Multi-Armed Bandit Problem

Yutian Chen, Zoubin Ghahramani

Recycling Randomness with Structure for Sublinear time Kernel Expansions

Krzysztof Choromanski, Vikas Sindhwani

Bidirectional Helmholtz Machines

Jorg Bornschein, Samira Shabanian, Asja Fischer, Yoshua Bengio

Faster Convex Optimization: Simulated Annealing with an Efficient Universal Barrier

Jacob Abernethy, Elad Hazan

Preconditioning Kernel Matrices

Kurt Cutajar, Michael Osborne, John Cunningham, Maurizio Filippone

Greedy Column Subset Selection: New Bounds and Distributed Algorithms

Jason Altschuler, Aditya Bhaskara, Gang Fu, Vahab Mirrokni, Afshin Rostamizadeh, Morteza Zadimoghaddam

Dynamic Capacity Networks

Amjad Almahairi, Nicolas Ballas, Tim Cooijmans, Yin Zheng, Hugo Larochelle, Aaron Courville

Pricing a Low-regret Seller

Hoda Heidari, Mohammad Mahdian, Umar Syed, Sergei Vassilvitskii, Sadra Yazdanbod

Estimation from Indirect Supervision with Linear Moments

Aditi Raghunathan, Roy Frostig, John Duchi, Percy Liang

Speeding up k-means by approximating Euclidean distances via block vectors

Thomas Bottesch, Thomas Bühler, Markus Kächele

Learning and Inference via Maximum Inner Product Search

Stephen Mussmann, Stefano Ermon

A Superlinearly-Convergent Proximal Newton-type Method for the Optimization of Finite Sums

Anton Rodomanov, Dmitry Kropotov

A Kernel Test of Goodness of Fit

Kacper Chwialkowski, Heiko Strathmann, Arthur Gretton

Interacting Particle Markov Chain Monte Carlo

Tom Rainforth, Christian Naesseth, Fredrik Lindsten, Brooks Paige, Jan-Willem van de Meent, Arnaud Doucet, Frank Wood

Faster Eigenvector Computation via Shift-and-Invert Preconditioning

Dan Garber, Elad Hazan, Chi Jin, Sham Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

A Theory of Generative ConvNet

Jianwen Xie, Yang Lu, Song-Chun Zhu, Yingnian Wu

Efficient Learning with a Family of Nonconvex Regularizers by Redistributing Nonconvexity

Quanming Yao, James Kwok

Computationally Efficient Nyström Approximation using Fast Transforms

Si Si, Cho-Jui Hsieh, Inderjit Dhillon

Gromov-Wasserstein Averaging of Kernel and Distance Matrices

Gabriel Peyré, Marco Cuturi, Justin Solomon

Robust Monte Carlo Sampling using Riemannian Nosé-Poincaré Hamiltonian Dynamics

Anirban Roychowdhury, Brian Kulis, Srinivasan Parthasarathy

The Segmented iHMM: A Simple, Efficient Hierarchical Infinite HMM

Ardavan Saeedi, Matthew Hoffman, Matthew Johnson, Ryan Adams

Meta-Gradient Boosted Decision Tree Model for Weight and Target Learning

Yury Ustinovskiy, Valentina Fedorova, Gleb Gusev, Pavel Serdyukov

Discriminative Embeddings of Latent Variable Models for Structured Data

Hanjun Dai, Bo Dai, Le Song

Robust Random Cut Forest Based Anomaly Detection on Streams

Sudipto Guha, Nina Mishra, Gourav Roy, Okke Schrijvers

Training Neural Networks Without Gradients: A Scalable ADMM Approach

Gavin Taylor, Ryan Burmeister, Zheng Xu, Bharat Singh, Ankit Patel, Tom Goldstein

Clustering High Dimensional Categorical Data via Topographical Features

Chao Chen, Novi Quadrianto

Efficient Algorithms for Large-scale Generalized Eigenvector Computation and Canonical Correlation Analysis

Rong Ge, Chi Jin, Sham Kakade, Praneeth Netrapalli, Aaron Sidford

Algorithms for Optimizing the Ratio of Submodular Functions

Wenruo Bai, Rishabh Iyer, Kai Wei, Jeff Bilmes

Model-Free Imitation Learning with Policy Optimization

Jonathan Ho, Jayesh Gupta, Stefano Ermon

ADIOS: Architectures Deep In Output Space

Moustapha Cisse, Maruan Al-Shedivat, Samy Bengio

Conditional Dependence via Shannon Capacity: Axioms, Estimators and Applications

Weihao Gao, Sreeram Kannan, Sewoong Oh, Pramod Viswanath

Control of Memory, Active Perception, and Action in Minecraft

Junhyuk Oh, Valliappa Chockalingam, Satinder Singh, Honglak Lee

The Label Complexity of Mixed-Initiative Classifier Training

Jina Suh, Xiaojin Zhu, Saleema Amershi

Bayesian Poisson Tucker Decomposition for Learning the Structure of International Relations

Aaron Schein, Mingyuan Zhou, David Blei, Hanna Wallach

Tensor Decomposition via Joint Matrix Schur Decomposition

Nicolo Colombo, Nikos Vlassis

Continuous Deep Q-Learning with Model-based Acceleration

Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

Domain Adaptation with Conditional Transferable Components

Mingming Gong, Kun Zhang, Tongliang Liu, Dacheng Tao, Clark Glymour, Bernhard Schölkopf

Fixed Point Quantization of Deep Convolutional Networks

Darryl Lin, Sachin Talathi, Sreekanth Annapureddy

Provable Algorithms for Inference in Topic Models

Sanjeev Arora, Rong Ge, Frederic Koehler, Tengyu Ma, Ankur Moitra

Epigraph projections for fast general convex programming

Po-Wei Wang, Matt Wytock, Zico Kolter

Fast Algorithms for Segmented Regression

Jayadev Acharya, Ilias Diakonikolas, Jerry Li, Ludwig Schmidt

Energetic Natural Gradient Descent

Philip Thomas, Bruno Castro da Silva, Christoph Dann, Emma Brunskill

Partition Functions from Rao-Blackwellized Tempered Sampling

David Carlson, Patrick Stinson, Ari Pakman, Liam Paninski

Learning Mixtures of Plackett-Luce Models

Zhibing Zhao, Peter Piech, Lirong Xia

Near Optimal Behavior via Approximate State Abstraction

David Abel, David Hershkowitz, Michael Littman

Power of Ordered Hypothesis Testing

Lihua Lei, William Fithian

PHOG: Probabilistic Model for Code

Pavol Bielik, Veselin Raychev, Martin Vechev

Shifting Regret, Mirror Descent, and Matrices

Andras Gyorgy, Csaba Szepesvari

Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters

Jelena Luketina, Tapani Raiko, Mathias Berglund, Klaus Greff

Model-Free Trajectory Optimization for Reinforcement Learning

Riad Akrour, Gerhard Neumann, Hany Abdulsamad, Abbas Abdolmaleki

Controlling the distance to a Kemeny consensus without computing it

Yunlong Jiao, Anna Korba, Eric Sibony

Horizontally Scalable Submodular Maximization

Mario Lucic, Olivier Bachem, Morteza Zadimoghaddam, Andreas Krause

Group Equivariant Convolutional Networks

Taco Cohen, Max Welling

Stochastic Discrete Clenshaw-Curtis Quadrature

Nico Piatkowski, Katharina Morik

Correcting Forecasts with Multifactor Neural Attention

Matthew Riemer, Aditya Vempaty, Flavio Calmon, Fenno Heath, Richard Hull, Elham Khabiri

Learning Representations for Counterfactual Inference

Fredrik Johansson, Uri Shalit, David Sontag

Automatic Construction of Nonparametric Relational Regression Models for Multiple Time Series

Yunseong Hwang, Anh Tong, Jaesik Choi

Inference Networks for Sequential Monte Carlo in Graphical Models

Brooks Paige, Frank Wood

Slice Sampling on Hamiltonian Trajectories

Benjamin Bloem-Reddy, John Cunningham

Noisy Activation Functions

Caglar Gulcehre, Marcin Moczulski, Misha Denil, Yoshua Bengio

PD-Sparse: A Primal and Dual Sparse Approach to Extreme Multiclass and Multilabel Classification

Ian En-Hsu Yen, Xiangru Huang, Pradeep Ravikumar, Kai Zhong, Inderjit Dhillon