Publications
* indicates equal contribution or alphabetical ordering.
2025
Minimax Optimal Regret Bound for Reinforcement Learning with Trajectory Feedback
Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon S. Du, Ruosong Wang
International Conference on Machine Learning (ICML) 2025
2024
An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models
Gantavya Bhatt*, Yifang Chen*, Arnav M. Das*, Jifan Zhang*, Sang T. Truong, Stephen Mussmann, Yinglun Zhu, Jeffrey Bilmes, Simon S. Du, Kevin Jamieson, Jordan T. Ash, Robert D. Nowak
Annual Meeting of the Association for Computational Linguistics (ACL) 2024, Findings
2023
2022
2021
2020
What Can Neural Networks Reason About?
Keyulu Xu, Jingling Li, Mozhi Zhang, Simon S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka
International Conference on Learning Representations (ICLR) 2020 (Spotlight)
2019
2018
2017
2016
2015
Preprints and Technical Reports
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang, Qing Yang, Zhiyuan Zeng, Liliang Ren, Lucas Liu, Baolin Peng, Hao Cheng, Xuehai He, Kuan Wang, Jianfeng Gao, Weizhu Chen, Shuohang Wang*, Simon S. Du*, Yelong Shen*