Attention-based recurrence for multi-agent reinforcement learning under stochastic partial observability

Thomy Phan, Fabian Ritz, Philipp Altmann, Maximilian Zorn, Jonas Nüßlein, Michael Kölle, Thomas Gabor and Claudia Linnhoff-Popien

DOI: 10.5555/3618408.3619565

Abstract: Stochastic partial observability poses a major challenge for decentralized coordination in multi-agent reinforcement learning but is largely neglected in state-of-the-art research due to a strong focus on state-based centralized training for decentralized execution (CTDE) and benchmarks that lack sufficient stochasticity, such as the StarCraft Multi-Agent Challenge (SMAC). In this paper, we propose Attention-based Embeddings of Recurrence In multi-Agent Learning (AERIAL) to approximate value functions under stochastic partial observability. AERIAL replaces the true state with a learned representation of multi-agent recurrence, considering more accurate information about decentralized agent decisions than state-based CTDE. We then introduce MessySMAC, a modified version of SMAC with stochastic observations and higher variance in initial states, to provide a more general and configurable benchmark regarding stochastic partial observability. We evaluate AERIAL in Dec-Tiger as well as in a variety of SMAC and MessySMAC maps, and compare the results with state-based CTDE. Furthermore, we evaluate the robustness of AERIAL and state-based CTDE against various stochasticity configurations in MessySMAC.

Proceedings of the 40th International Conference on Machine Learning (2023)

Citation:

Thomy Phan, Fabian Ritz, Philipp Altmann, Maximilian Zorn, Jonas Nüßlein, Michael Kölle, Thomas Gabor, Claudia Linnhoff-Popien. “Attention-based recurrence for multi-agent reinforcement learning under stochastic partial observability”. Proceedings of the 40th International Conference on Machine Learning, 2023. DOI: 10.5555/3618408.3619565

Bibtex:

@inproceedings{Phan2023AttentionBasedRecurrence,
  author    = {Phan, Thomy and Ritz, Fabian and Altmann, Philipp and Zorn, Maximilian and Nüßlein, Jonas and Kölle, Michael and Gabor, Thomas and Linnhoff-Popien, Claudia},
  title     = {{Attention-based recurrence for multi-agent reinforcement learning under stochastic partial observability}},
  year      = {2023},
  publisher = {JMLR.org},
  abstract  = {{Stochastic partial observability poses a major challenge for decentralized coordination in multi-agent reinforcement learning but is largely neglected in state-of-the-art research due to a strong focus on state-based centralized training for decentralized execution (CTDE) and benchmarks that lack sufficient stochasticity, such as the StarCraft Multi-Agent Challenge (SMAC). In this paper, we propose Attention-based Embeddings of Recurrence In multi-Agent Learning (AERIAL) to approximate value functions under stochastic partial observability. AERIAL replaces the true state with a learned representation of multi-agent recurrence, considering more accurate information about decentralized agent decisions than state-based CTDE. We then introduce MessySMAC, a modified version of SMAC with stochastic observations and higher variance in initial states, to provide a more general and configurable benchmark regarding stochastic partial observability. We evaluate AERIAL in Dec-Tiger as well as in a variety of SMAC and MessySMAC maps, and compare the results with state-based CTDE. Furthermore, we evaluate the robustness of AERIAL and state-based CTDE against various stochasticity configurations in MessySMAC.}},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  articleno = {1157},
  doi       = {10.5555/3618408.3619565},
  numpages  = {14},
  location  = {Honolulu, Hawaii, USA},
  series    = {ICML'23},
  preprint  = {https://arxiv.org/abs/2301.01649},
  code      = {https://github.com/thomyphan/messy_smac},
  pdf       = {https://proceedings.mlr.press/v202/phan23a/phan23a.pdf},
  url       = {https://dl.acm.org/doi/abs/10.5555/3618408.3619565}
}