NeurIPS 2019 Deep RL Workshop
Hear directly from presenters at the NeurIPS 2019 Deep RL Workshop on their work!
Thank you to all the presenters who participated. I covered as many as I could given the time and crowds. If you were not included and wish to be, please email
More details on the official NeurIPS Deep RL Workshop site.
- 0:23 Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms; Matthia Sabatelli (University of Liège); Gilles Louppe (University of Liège); Pierre Geurts (University of Liège); Marco Wiering (University of Groningen) [external pdf link]
- 4:16 Single Deep Counterfactual Regret Minimization; Eric Steinberger (University of Cambridge).
- 5:38 On the Convergence of Episodic Reinforcement Learning Algorithms at the Example of RUDDER; Markus Holzleitner (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); José Arjona-Medina (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Marius-Constantin Dinu (LIT AI Lab / University Linz); Sepp Hochreiter (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria).
- 9:33 Objective Mismatch in Model-based Reinforcement Learning; Nathan Lambert (UC Berkeley); Brandon Amos (Facebook); Omry Yadan (Facebook); Roberto Calandra (Facebook).
- 10:51 Option Discovery using Deep Skill Chaining; Akhil Bagaria (Brown University); George Konidaris (Brown University).
- 13:44 Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware; Kirill Polzounov (University of Calgary); Ramitha Sundar (Blue River Technology); Lee Reden (Blue River Technology).
- 14:52 LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games; Leonard Adolphs (ETHZ); Thomas Hofmann (ETH Zurich).
- 16:30 Accelerating Training in Pommerman with Imitation and Reinforcement Learning; Hardik Meisheri (TCS Research); Omkar Shelke (TCS Research); Richa Verma (TCS Research); Harshad Khadilkar (TCS Research).
- 17:27 Dream to Control: Learning Behaviors by Latent Imagination; Danijar Hafner (Google); Timothy Lillicrap (DeepMind); Jimmy Ba (University of Toronto); Mohammad Norouzi (Google Brain) [external pdf link].
- 20:48 Adaptive Temperature Tuning for Mellowmax in Deep Reinforcement Learning; Seungchan Kim (Brown University); George Konidaris (Brown).
- 22:05 Meta-learning curiosity algorithms; Ferran Alet (MIT); Martin Schneider (MIT); Tomas Lozano-Perez (MIT); Leslie Kaelbling (MIT).
- 24:09 Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards; Xingyu Lu (Berkeley); Stas Tiomkin (BAIR, UC Berkeley); Pieter Abbeel (UC Berkeley).
- 25:44 Swarm-inspired Reinforcement Learning via Collaborative Inter-agent Knowledge Distillation; Zhang-Wei Hong (Preferred Networks); Prabhat Nagarajan (Preferred Networks); Guilherme Maeda (Preferred Networks).
- 26:35 Multiplayer AlphaZero; Nicholas Petosa (Georgia Institute of Technology); Tucker Balch (Ga Tech) [external pdf link].
- 27:43 Prioritized Sequence Experience Replay; Marc Brittain (Iowa State University); Joshua Bertram (Iowa State University); Xuxi Yang (Iowa State University); Peng Wei (Iowa State University) [external pdf link].
- 29:14 Recurrent neural-linear posterior sampling for non-stationary bandits; Paulo Rauber (IDSIA); Aditya Ramesh (USI); Jürgen Schmidhuber (IDSIA - Lugano).
- 29:36 Improving Evolutionary Strategies With Past Descent Directions; Asier Mujika (ETH Zurich); Florian Meier (ETH Zurich); Marcelo Matheus Gauy (ETH Zurich); Angelika Steger (ETH Zurich) [external pdf link].
- 31:40 ZPD Teaching Strategies for Deep Reinforcement Learning from Demonstrations; Daniel Seita (University of California, Berkeley); David Chan (University of California, Berkeley); Roshan Rao (UC Berkeley); Chen Tang (UC Berkeley); Mandi Zhao (UC Berkeley); John Canny (UC Berkeley) [external pdf link].
- 33:05 Bottom-Up Meta-Policy Search; Luckeciano Melo (Aeronautics Institute of Technology); Marcos Máximo (Aeronautics Institute of Technology); Adilson Cunha (Aeronautics Institute of Technology) [external pdf link].
- 33:37 MERL: Multi-Head Reinforcement Learning; Yannis Flet-Berliac (University of Lille / Inria); Philippe Preux (INRIA) [external pdf link].
- 35:30 Emergent Tool Use from Multi-Agent Autocurricula; Bowen Baker (OpenAI); Ingmar Kanitscheider (OpenAI); Todor Markov (OpenAI); Yi Wu (UC Berkeley); Glenn Powell (OpenAI); Bob McGrew (OpenAI); Igor Mordatch.
- 37:09 Learning an off-policy predictive state representation for deep reinforcement learning for vision-based steering in autonomous driving; Daniel Graves (Huawei)
- 39:37 Multi-Task Reinforcement Learning without Interference; Tianhe Yu (Stanford University); Saurabh Kumar (Stanford); Abhishek Gupta (UC Berkeley); Karol Hausman (Google Brain); Sergey Levine (UC Berkeley); Chelsea Finn (UC Berkeley).
- 40:52 Behavior-Regularized Offline Reinforcement Learning; Yifan Wu (Carnegie Mellon University); George Tucker (Google Brain); Ofir Nachum (Google) [external pdf link].
- 42:36 If MaxEnt RL is the Answer, What is the Question?; Ben Eysenbach (Carnegie Mellon University); Sergey Levine (UC Berkeley) [external pdf link].
- 43:30 Receiving Uncertainty-Aware Advice in Deep Reinforcement Learning; Felipe Leno da Silva (University of Sao Paulo); Pablo Hernandez-Leal (Borealis AI); Bilal Kartal (Borealis AI); Matthew Taylor (Borealis AI).
- 45:03 Striving for Simplicity in Off-Policy Deep Reinforcement Learning; Rishabh Agarwal (Google Research, Brain Team); Dale Schuurmans (Google / University of Alberta); Mohammad Norouzi (Google Brain) [external pdf link].
- 45:32 Interactive Fiction Games: A Colossal Adventure; Matthew Hausknecht (Microsoft Research); Prithviraj Ammanabrolu (Georgia Institute of Technology); Marc-Alexandre Côté (Microsoft Research); Xingdi Yuan (Microsoft Research) [external pdf link].
- 52:20 rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch; Adam Stooke (UC Berkeley); Pieter Abbeel (UC Berkeley) [ Repo: ]
- 53:49 Learning to Drive using Waypoints; Tanmay Agarwal, Hitesh Arora, Tanvir Parhar, Shubhankar V Deshpande, Jeff Schneider - from the NeurIPS 2019 Workshop on Machine Learning for Autonomous Driving
Creators and Guests
Robin Ranjit Singh Chauhan
🌱 Head of Eng @AgFunder 🧠 AI:Reinforcement Learning/ML/DL/NLP🎙️Host @TalkRLPodcast 💳 ex-@Microsoft ecomm PgmMgr 🤖 @UWaterloo CompEng 🇨🇦 🇮🇳