This is the first time I submitted a paper and got reviews with both 1 (very strong reject) and 8 (strong accept). I alway believe 1 is the highest praise for a paper!
Our team is hiring! We have been pushing the frontier of operation research by machine learning. @hanjundai and Dale are not only excellent researchers but also great mentors. Welcome to apply the position :)
We are exciting to present AdaPlanner. We show that the long-term decision making ability of a LLM-agent can be significantly improved by iterative closed-loop planning with code-style prompt and skill discovery.
Excited to introduce AdaPlanner, our LLM agent for solving embodied tasks via closed-loop planning.
Key features:
1) Adaptively refines LLM-generated plan from environment feedback, with both in-plan and out-of-plan refining strategies
2) A code-style LLM prompt structure to
A new and beautiful (and practical!) technique for computing confidence intervals of policy value in RL! arxiv.org/abs/2010.11652
This is a problem that I & collaborators have been thinking about for ~1 year. At the beginning, I didn't think such a nice result was possible... 1/
We propose a unified view to summarize the existing (DICE, MWL/MQL, LSTDQ, etc) and design new OPE estimators, under which a comprehensive examination in terms of the tradeoff between statistical and optimization has been conducted.
The schedule and accepted papers are released: optrl2019.github.io. Congratulations to all the recipients of the travel awards. We thank all the invited speakers, panelists and authors. Thanks to our sponsors @GoogleAI and @DeepMindAI. See you in Vancouver next week.