PaperList

2024 Jan Analyze/Optimize Reflection
  AutoPlan: uses no human demonstrations; collects feedback from the environment and generates reflections from it

2023 Dec Improve Planning Abilities Using RL
  RetroFormer: freezes the base LLM and trains a separate model with policy-gradient reinforcement learning to refine reflections
  Adapting LLM Agents through Communication: applies PPO training directly to an open-source LLM, based on feedback and agent exploration trajectories

2023 Dec Analyze/Optimize Reflection
  Text2Reward: transforms feedback into code to reduce feedback ambiguity
  ExpeL (LLM Agents Are Experiential Learners): uses inter-task feedback from both successes and failures to improve model learning
  Aligning Language Models with Judgments: creates contrasting samples of correct and incorrect predictions, paired with feedback, to train the LLM for better alignment

2023 Nov Agent Framework
  Reflexion: uses reflection to improve performance, and an oracle to decide when reasoning should stop
  RAP: uses Monte Carlo tree search (MCTS), with the LLM acting as the world model
  RATS: combines MCTS, reward evaluation, and reflection in the LLM search process

2023 Oct Improve CoT Capabilities

2023 Oct Open Domain QA Datasets

2023 Oct Open Domain QA Methods

2023 Sep Tool Use