Journal article
On Many-Actions Policy Gradient
dc.abstract.en | We study the variance of stochastic policy gradients (SPGs) with many action samples per state. We derive a many-actions optimality condition, which determines when many-actions SPG yields lower variance as compared to a single-action agent with proportionally extended trajectory. We propose Model-Based Many-Actions (MBMA), an approach leveraging dynamics models for many-actions sampling in the context of SPG. MBMA addresses issues associated with existing implementations of many-actions SPG and yields lower bias and comparable variance to SPG estimated from states in model-simulated rollouts. We find that MBMA bias and variance structure matches that predicted by theory. As a result, MBMA achieves improved sample efficiency and higher returns on a range of continuous action environments as compared to model-free, many-actions, and model-based on-policy SPG baselines. |
dc.affiliation | Uniwersytet Warszawski |
dc.conference.country | United States |
dc.conference.datefinish | 2023-07-29 |
dc.conference.datestart | 2023-07-23 |
dc.conference.place | Honolulu |
dc.conference.series | International Conference on Machine Learning |
dc.conference.seriesshortcut | ICML |
dc.conference.shortcut | ICML 2023 |
dc.conference.weblink | https://icml.cc/Conferences/2023/Dates |
dc.contributor.author | Cygan, Marek |
dc.contributor.author | Nauman, Michal |
dc.date.accessioned | 2024-01-25T15:45:23Z |
dc.date.available | 2024-01-25T15:45:23Z |
dc.date.issued | 2023 |
dc.description.finance | No-cost publication |
dc.identifier.uri | https://repozytorium.uw.edu.pl//handle/item/114719 |
dc.identifier.weblink | https://proceedings.mlr.press/v202/nauman23a.html |
dc.language | eng |
dc.pbn.affiliation | computer and information sciences |
dc.relation.pages | 202:25769-25789 |
dc.rights | ClosedAccess |
dc.sciencecloud | nosend |
dc.title | On Many-Actions Policy Gradient |
dc.type | JournalArticle |
dspace.entity.type | Publication |