Alignment
2026
5
- Reward Hacking: When Optimizers Reverse-Search the Reward Signal
- The Evolution of Reward Design: From RLHF to RLVR
- Reinforcement Learning in LLM Alignment: From Reward Signals to Advantage Estimation
- The Essence of LLM Reasoning and Training: From Surrogates to Reinforcement Learning Geometry
- What Does the Loss Landscape of LLMs Look Like?
2025
4