EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents

Cheng Qian; Peixuan Han; Qinyu Luo; Bingxiang He; Xiusi Chen; Yuji Zhang; Hongyi Du; Jiarui Yao; Xiaocheng Yang; Denghui Zhang; Yunzhu Li; Heng Ji

doi:10.18653/v1/2025.acl-long.39

Abstract

Language model agents excel in long-session planning and reasoning, but existing benchmarks primarily focus on goal-oriented tasks with explicit objectives, neglecting creative adaptation in unfamiliar environments. To address this, we introduce EscapeBench, a benchmark suite of room escape game environments designed to challenge agents with creative reasoning, unconventional tool use, and iterative problem-solving to uncover implicit goals. Our results show that current language models, despite employing working memory and Chain-of-Thought reasoning, achieve only 15% average progress without hints, highlighting their limitations in creativity. To bridge this gap, we propose EscapeAgent, a framework designed to enhance creative reasoning through Foresight (innovative tool use) and Reflection (identifying unsolved tasks). Experiments show that EscapeAgent can execute action chains over 1,000 steps while maintaining logical coherence. It navigates and completes games with up to 40% fewer steps and hints, performs robustly across difficulty levels, and achieves higher action success rates with more efficient and innovative puzzle-solving strategies.

Anthology ID: 2025.acl-long.39
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 798–820
URL: https://aclanthology.org/2025.acl-long.39/
DOI: 10.18653/v1/2025.acl-long.39
Cite (ACL): Cheng Qian, Peixuan Han, Qinyu Luo, Bingxiang He, Xiusi Chen, Yuji Zhang, Hongyi Du, Jiarui Yao, Xiaocheng Yang, Denghui Zhang, Yunzhu Li, and Heng Ji. 2025. EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 798–820, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents (Qian et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.39.pdf
