ICML 2019 Deep Reinforcement Learning Paper Roundup
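This article first appeared on the WeChat official account 深度強化學習演算法 (Deep Reinforcement Learning Algorithms) and is reposted by AI研習社 with permission. Readers are welcome to follow the 深度強化學習演算法 WeChat official account and the AI研習社 blog column.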
Deep Reinforcement Learning: Report
Source: ICML 2019 conference
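Reinforcement learning (RL) is a general paradigm for learning, prediction, and decision making. RL provides solution methods for sequential decision-making problems, as well as for problems that can be transformed into sequential ones. RL is deeply connected with optimization, statistics, game theory, causal inference, and sequential experimentation, overlaps substantially with approximate dynamic programming and optimal control, and has broad applications in science, engineering, and the arts.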
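RL has recently made steady progress in academia, for example in Atari games, AlphaGo, and visuomotor policies for robots. RL has also been applied to real-world scenarios such as recommender systems and neural architecture search. See a recent collection of RL applications. The hope is for RL systems that work in the real world and deliver real benefits. However, RL still faces many issues, such as generalization, sample efficiency, and the exploration-versus-exploitation dilemma. As a result, RL is far from being widely deployed. The common, critical, and pressing questions for the RL community are therefore: Will RL see wide deployment? What are the issues? How can they be solved?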
Methods papers
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning
Quantifying Generalization in Reinforcement Learning
Policy Certificates: Towards Accountable Reinforcement Learning
Neural Logic Reinforcement Learning
Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning
Few-Shot Intent Inference via Meta-Inverse Reinforcement Learning
Calibrated Model-Based Deep Reinforcement Learning
Information-Theoretic Considerations in Batch Reinforcement Learning
Taming MAML: Control variates for unbiased meta-reinforcement learning gradient estimation
Option Discovery for Solving Sparse Reward Reinforcement Learning Problems
Optimization papers
Fingerprint Policy Optimisation for Robust Reinforcement Learning
Collaborative Evolutionary Reinforcement Learning
Composing Value Functions in Reinforcement Learning
Task-Agnostic Dynamics Priors for Deep Reinforcement Learning
Policy Consolidation for Continual Reinforcement Learning
Exploration-exploitation and model parameters
Exploration Conscious Reinforcement Learning Revisited
Dynamic Weights in Multi-Objective Deep Reinforcement Learning
Control Regularization for Reduced Variance Reinforcement Learning
Dead-ends and Secure Exploration in Reinforcement Learning
Off-Policy Deep Reinforcement Learning without Exploration
Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
On the Generalization Gap in Reparameterizable Reinforcement Learning
Multi-agent
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning
CURIOUS: Intrinsically Motivated Multi-Task, Multi-Goal Reinforcement Learning
Finite-Time Analysis of Distributed TD(0) with Linear Function Approximation on Multi-Agent Reinforcement Learning
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
Multi-Agent Adversarial Inverse Reinforcement Learning
Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
Graphical-model reinforcement learning
TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning
SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
Distributional reinforcement learning
Statistics and Samples in Distributional Reinforcement Learning
Distributional Reinforcement Learning for Efficient Exploration
Applications
Action Robust Reinforcement Learning and Applications in Continuous Control
Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation
Learning Action Representations for Reinforcement Learning
The Value Function Polytope in Reinforcement Learning
Generative Adversarial User Model for Reinforcement Learning Based Recommendation System
Other
Kernel-Based Reinforcement Learning in Robust Markov Decision Processes
A Deep Reinforcement Learning Perspective on Internet Congestion Control
Reinforcement Learning in Configurable Continuous Environments
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Note: some of these papers are not yet on arXiv; for those, you will need to search Google yourself.
PDF versions of the papers (download)