Value-Aware Loss Function for Model-based Reinforcement Learning

We consider the problem of estimating the transition probability kernel to be used by a model-based reinforcement learning (RL) algorithm. We argue that estimating a generative model that minimizes...
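The value-aware idea can be illustrated with a toy calculation (a hedged sketch; the distributions and value function below are invented for illustration, not taken from the paper): a model error that a distributional distance penalizes may be harmless once weighted by the values of the states involved.

```python
import numpy as np

# p: true next-state distribution, q: estimated model, V: a value function.
# A value-aware objective penalizes the gap in predicted values |(p - q) @ V|
# rather than the raw distributional distance such as the l1 error.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
V = np.array([10.0, 10.0, -5.0])  # states 0 and 1 happen to have equal value

l1_error = np.abs(p - q).sum()    # 0.2: the model looks wrong distributionally
value_gap = abs((p - q) @ V)      # ~0: the error does not affect predicted value
print(l1_error, value_gap)
```

The point of the sketch: the probability mass the model misplaces moves between two states of equal value, so a value-aware loss would not penalize it, while a value-agnostic likelihood or l1 loss would.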

SGD: General Analysis and Improved Rates

We propose a general yet simple theorem describing the convergence of SGD under the arbitrary sampling paradigm. Our theorem describes the convergence of an infinite array of variants of SGD, each ...
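The arbitrary-sampling viewpoint covers, for example, plain minibatch SGD in which each step draws a random index set and descends the subsampled gradient. A minimal sketch on a noiseless least-squares problem (problem sizes, step size, and the uniform-without-replacement sampling scheme are illustrative choices, one instance among the many the theorem would cover):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least squares: minimize f(x) = (1/n) * sum_i (a_i^T x - b_i)^2.
n, d = 200, 10
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true  # noiseless, so the minimizer interpolates all equations

def sgd(step=0.01, batch=8, iters=2000):
    """Minibatch SGD: each iteration samples a random index set S and
    steps along the gradient of the subsampled objective."""
    x = np.zeros(d)
    for _ in range(iters):
        S = rng.choice(n, size=batch, replace=False)
        grad = 2.0 * A[S].T @ (A[S] @ x - b[S]) / batch
        x -= step * grad
    return x

x_hat = sgd()
print(np.linalg.norm(x_hat - x_true))  # small: SGD converges in this regime
```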

Validation of Approximate Likelihood and Emulator Models for Computationally Intensive Simulations

Complex phenomena in engineering and the sciences are often modeled with computationally intensive feed-forward simulations for which a tractable analytic likelihood does not exist. In these cases,...

Proceedings of Machine Learning Research

Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
Held online on 26-28 August 2020

Published as Volume 108 by the Proceedings of Machine Learning Research on 03 June 2020.
Volume Edited by:
...

Conservative Exploration in Reinforcement Learning

While learning in an unknown Markov Decision Process (MDP), an agent should trade off exploration to discover new information about the MDP, and exploitation of the current knowledge to maximize th...

Parallel Gibbs Sampling: From Colored Fields to Thin Junction Trees

We explore the task of constructing a parallel Gibbs sampler, to improve both mixing and the exploration of high-likelihood states. Recent work in parallel Gibbs sampling has focused on update sche...
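A standard ingredient in this line of work is the chromatic sampler: color the Markov random field so that neighboring variables get different colors, then update all variables of one color simultaneously, since they are conditionally independent given the rest. A minimal sketch on an Ising grid, where a checkerboard two-coloring suffices (lattice size and inverse temperature below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

L, beta = 16, 0.4
spins = rng.choice([-1, 1], size=(L, L))
color = np.add.outer(np.arange(L), np.arange(L)) % 2  # checkerboard 2-coloring

def half_sweep(spins, c):
    """Update every site of color c in one vectorized (parallelizable) step;
    same-colored sites are conditionally independent given the other color."""
    # Sum of the four neighbors with periodic boundaries.
    nbr = (np.roll(spins, 1, 0) + np.roll(spins, -1, 0)
           + np.roll(spins, 1, 1) + np.roll(spins, -1, 1))
    p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * nbr))  # P(spin = +1 | neighbors)
    draw = np.where(rng.random(spins.shape) < p_up, 1, -1)
    return np.where(color == c, draw, spins)  # touch only one color class

for _ in range(100):
    spins = half_sweep(spins, 0)
    spins = half_sweep(spins, 1)
print(spins.mean())
```

One full sweep is two half-sweeps, and each half-sweep could be distributed across processors with no coordination beyond a barrier, which is the appeal of the coloring approach.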

Continual Reinforcement Learning with Complex Synapses

Unlike humans, who are capable of continual learning over their lifetimes, artificial neural networks have long been known to suffer from a phenomenon known as catastrophic forgetting, whereby new ...

Policy Consolidation for Continual Reinforcement Learning

We propose a method for tackling catastrophic forgetting in deep reinforcement learning that is agnostic to the timescale of changes in the distribution of experiences, does not require knowledge o...

Parameter-Efficient Transfer Learning for NLP

Fine-tuning large pretrained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is requir...
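One parameter-efficient alternative explored in this line of work is inserting small bottleneck "adapter" modules into an otherwise frozen backbone, so that each new task trains only the adapter weights. A minimal sketch (the dimensions, ReLU nonlinearity, and near-identity initialization below are illustrative assumptions, not the paper's exact recipe):

```python
import numpy as np

rng = np.random.default_rng(2)

def adapter(h, W_down, W_up):
    """Bottleneck adapter: down-project, nonlinearity, up-project, plus a
    residual connection so a zero-initialized adapter is exactly the identity."""
    z = np.maximum(h @ W_down, 0.0)  # ReLU bottleneck
    return h + z @ W_up

d, r = 768, 64  # hidden size and (much smaller) bottleneck size
W_down = rng.normal(scale=0.02, size=(d, r))
W_up = np.zeros((r, d))  # near-identity init: adapter output starts at h

h = rng.normal(size=(4, d))  # a batch of token representations
out = adapter(h, W_down, W_up)
print(np.allclose(out, h))   # True at initialization

# Per task, only the 2*d*r adapter parameters would be trained --
# a small fraction of the frozen backbone's parameters.
```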

Interpretable Cascade Classifiers with Abstention

In many prediction tasks, such as medical diagnostics, sequential decisions are crucial to providing optimal individual treatment. The budget in real-life applications is always limited, and it can repre...
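A cascade with abstention can be sketched as a chain of classifiers in which each stage either commits to a prediction or abstains and defers the example to the next, costlier stage. The confidence-threshold rule and the stand-in models below are hypothetical, chosen only to make the control flow concrete:

```python
import numpy as np

rng = np.random.default_rng(3)

def cascade_predict(x, stage1, stage2, tau=0.8):
    """Two-stage cascade: the cheap first stage abstains (defers) whenever
    its confidence falls below tau, passing the example to stage 2."""
    p = stage1(x)
    if p.max() >= tau:
        return int(p.argmax()), 1  # decided early, cost of one stage
    p = stage2(x)
    return int(p.argmax()), 2      # deferred, full cost

# Hypothetical stand-in models: softmax over random linear scores.
softmax = lambda s: np.exp(s - s.max()) / np.exp(s - s.max()).sum()
W1, W2 = rng.normal(size=(5, 3)), rng.normal(size=(5, 3))
stage1 = lambda x: softmax(x @ W1)
stage2 = lambda x: softmax(x @ W2)

x = rng.normal(size=5)
label, stages_used = cascade_predict(x, stage1, stage2)
print(label, stages_used)
```

Raising tau makes the first stage abstain more often, trading budget for accuracy; that trade-off is what a budget-constrained cascade must tune.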

CompILE: Compositional Imitation Learning and Execution

We introduce Compositional Imitation Learning and Execution (CompILE): a framework for learning reusable, variable-length segments of hierarchically-structured behavior from demonstration data. Com...

Unreproducible Research is Reproducible

The apparent contradiction in the title is a wordplay on the different meanings attributed to the word reproducible across different scientific fields. What we imply is that unreproducible findings...

kernelPSI: a Post-Selection Inference Framework for Nonlinear Variable Selection

Model selection is an essential task for many applications in scientific discovery. The most common approaches rely on univariate linear measures of association between each feature and the outcome...

Simple Regression Models

Developing theories of when and why simple predictive models perform well is a key step in understanding decisions of cognitively bounded humans and intelligent machines. We are interested in ho...

Convex envelopes of complexity controlling penalties: the case against premature envelopment

Convex envelopes of the cardinality and rank functions, the l_1 and nuclear norms, have gained immense popularity due to their sparsity-inducing properties. This gave rise to a natural approach to buil...
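A concrete consequence of using the l_1 norm, the convex envelope of cardinality on the unit l-infinity ball, is that its proximal operator is soft-thresholding, which zeroes small coefficients outright. A minimal sketch:

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal operator of lam * ||.||_1: shrink each entry toward zero
    by lam, setting entries with |v_i| <= lam exactly to zero."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

v = np.array([3.0, -0.5, 0.2, -2.0])
x = soft_threshold(v, 1.0)
print(x)  # small entries are zeroed, large ones shrunk by 1.0
```

This sparsity-at-the-prox behavior is exactly the "sparsity inducing" property the teaser refers to, and it is what makes the envelope substitution so convenient in algorithms.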