Using a single network architecture and a fixed set of hyper-parameters, Recurrent Replay Distributed DQN (R2D2) quadruples the previous state of the art on Atari-57 and matches the state of the art on DMLab-30. It is the first agent to exceed human-level performance in 52 of the 57 Atari games.
Our new ICLR paper on LSTM training with distributed replay is camera-ready! Great work on R2D2, with SOTA results on Atari-57 and DMLab-30:
With all the usual caveats (evaluation conditions vary and aren't well standardized, the results aren't all directly comparable, this hasn't been peer-reviewed or replicated, etc.), the median Atari-57 trendline seems to have just been broken. ICLR paper out today, plus a slide from May.
"Recurrent Experience Replay in Distributed Reinforcement Learning": Better Atari/DMLab performance by (simplifying a bit) slapping an RNN on it. Also, one interesting thing I didn't notice before is the use of common hyperparams across DMLab and Atari.