Publications

* indicates equal contribution or alphabetical ordering.

2025

Anytime Acceleration of Gradient Descent
Zihan Zhang, Jason D. Lee, Simon S. Du, Yuxin Chen
Conference on Learning Theory (COLT) 2025

Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination
Kunal Jha, Wilka Carvalho, Yancheng Liang, Simon S. Du, Max Kleiman-Weiner*, Natasha Jaques*
International Conference on Machine Learning (ICML) 2025 (Spotlight)

Minimax Optimal Regret Bound for Reinforcement Learning with Trajectory Feedback
Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon S. Du, Ruosong Wang
International Conference on Machine Learning (ICML) 2025

Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
Yiping Wang, Xuehai He, Kuan Wang, Luyao Ma, Jianwei Yang, Shuohang Wang, Simon Shaolei Du, Yelong Shen
Conference on Computer Vision and Pattern Recognition (CVPR) 2025

The Crucial Role of Samplers in Online Direct Preference Optimization
Ruizhe Shi*, Runlong Zhou*, Simon S. Du
International Conference on Learning Representations (ICLR) 2025

Offline Multi-task Transfer RL with Representational Penalization
Avinandan Bose, Simon S. Du, Maryam Fazel
International Conference on Artificial Intelligence and Statistics (AISTATS) 2025

Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon S. Du
Journal of the ACM (JACM) 2025 (appeared in part in COLT 2024)

2024

CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
Yiping Wang*, Yifang Chen*, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2024 (Spotlight)

Learning to Cooperate with Humans using Generative Agents
Yancheng Liang, Daphne Chen, Abhishek Gupta, Simon S. Du*, Natasha Jaques*
Conference on Neural Information Processing Systems (NeurIPS) 2024

Understanding the Gains from Repeated Self-Distillation
Divyansh Pareek, Simon S. Du, Sewoong Oh
Conference on Neural Information Processing Systems (NeurIPS) 2024

Decoding-Time Language Model Alignment with Multiple Objectives
Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2024

Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models
Weihang Xu, Maryam Fazel, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2024

Transferable Reinforcement Learning via Generalized Occupancy Models
Chuning Zhu, Xinqi Wang, Tyler Han, Simon S. Du, Abhishek Gupta
Conference on Neural Information Processing Systems (NeurIPS) 2024

Learning Optimal Tax Design in Nonatomic Congestion Games
Qiwen Cui, Maryam Fazel, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2024

Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
Runlong Zhou, Simon S. Du, Beibin Li
Annual Meeting of the Association for Computational Linguistics (ACL) 2024

An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models
Gantavya Bhatt*, Yifang Chen*, Arnav M. Das*, Jifan Zhang*, Sang T. Truong, Stephen Mussmann, Yinglun Zhu, Jeffrey Bilmes, Simon S. Du, Kevin Jamieson, Jordan T. Ash, Robert D. Nowak
Annual Meeting of the Association for Computational Linguistics (ACL) 2024, Findings

Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon S. Du
Conference on Learning Theory (COLT) 2024

Optimal Multi-Distribution Learning
Zihan Zhang, Wenhao Zhan, Yuxin Chen, Simon S. Du, Jason D. Lee
Conference on Learning Theory (COLT) 2024

Refined Sample Complexity for Markov Games with Independent Linear Function Approximation
Yan Dai, Qiwen Cui, Simon S. Du
Conference on Learning Theory (COLT) 2024

Rethinking Transformers in Solving POMDPs
Chenhao Lu, Ruizhe Shi*, Yuyao Liu*, Kaizhe Hu, Simon S. Du, Huazhe Xu
International Conference on Machine Learning (ICML) 2024

How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
Nuoya Xiong, Lijun Ding, Simon S. Du
International Conference on Learning Representations (ICLR) 2024 (Spotlight)

Horizon-Free Regret for Linear Markov Decision Processes
Zihan Zhang, Jason D. Lee, Yuxin Chen, Simon S. Du
International Conference on Learning Representations (ICLR) 2024

Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning
Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon S. Du
International Conference on Learning Representations (ICLR) 2024
NeurIPS 2023 Foundation Models for Decision Making workshop (Oral)
[Website] [Video]

JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon S. Du
International Conference on Learning Representations (ICLR) 2024

A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning
Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du
International Conference on Learning Representations (ICLR) 2024

Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
Kaifeng Lyu*, Jikai Jin*, Zhiyuan Li, Simon Shaolei Du, Jason D. Lee, Wei Hu
International Conference on Learning Representations (ICLR) 2024

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
Ruizhe Shi*, Yuyao Liu*, Yanjie Ze, Simon S. Du, Huazhe Xu
International Conference on Learning Representations (ICLR) 2024
[Website]

LabelBench: A Comprehensive Framework for Benchmarking Label-Efficient Learning
Jifan Zhang*, Yifang Chen*, Gregory Canal, Stephen Mussmann, Yinglun Zhu, Simon S. Du, Kevin Jamieson, Robert D. Nowak
Journal of Data-centric Machine Learning Research (DMLR) 2024

2023

Understanding Curriculum Learning in Policy Optimization for Solving Combinatorial Optimization Problems
Runlong Zhou, Zelin He, Yuandong Tian, Yi Wu, Simon S. Du
Transactions on Machine Learning Research (TMLR) 2023

Active Representation Learning for General Task Space with Applications in Robotics
Yifang Chen, Yingbing Huang, Simon S. Du, Kevin Jamieson, Guanya Shi
Conference on Neural Information Processing Systems (NeurIPS) 2023

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian, Yiping Wang, Beidi Chen, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2023
Selected as an oral presentation at the High-dimensional Learning Dynamics workshop at ICML 2023

A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang*, Han Zhong*, Tianhao Wu*, Bin Liu, Liwei Wang, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2023

Extragradient-Based Algorithms for Stochastic Variational Inequalities with Separable Structure
Angela Yuan, Chris Junchi Li, Gauthier Gidel, Michael Jordan, Quanquan Gu, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2023

Integrating Traffic Science with Representation Learning for City-wide Network Congestion Prediction
Wenqing Zheng*, Hao Yang*, Jiarui Cai, Peihao Wang, Xuan Jiang, Simon S. Du, Yinhai Wang, Zhangyang Wang
Information Fusion 2023

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu, Simon S. Du
Conference on Learning Theory (COLT) 2023
Selected as an oral presentation at the High-dimensional Learning Dynamics workshop at ICML 2023

Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation
Qiwen Cui, Kaiqing Zhang, Simon S. Du
Conference on Learning Theory (COLT) 2023

On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
Haotian Ye*, Xiaoyu Chen*, Liwei Wang, Simon S. Du
International Conference on Machine Learning (ICML) 2023 (Oral)

Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou, Zihan Zhang, Simon S. Du
International Conference on Machine Learning (ICML) 2023

Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
Jikai Jin, Zhiyuan Li, Kaifeng Lyu, Simon S. Du, Jason D. Lee
International Conference on Machine Learning (ICML) 2023

Horizon-Free Reinforcement Learning for Latent Markov Decision Processes
Runlong Zhou, Ruosong Wang, Simon S. Du
International Conference on Machine Learning (ICML) 2023

Improved Active Multi-Task Representation Learning via Lasso
Yiping Wang, Yifang Chen, Kevin Jamieson, Simon S. Du
International Conference on Machine Learning (ICML) 2023

Offline congestion games: How feedback type affects data coverage requirement
Haozhe Jiang*, Qiwen Cui*, Zhihan Xiong, Maryam Fazel, Simon S. Du
International Conference on Learning Representations (ICLR) 2023

Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Shicong Cen, Yuejie Chi, Simon S. Du, Lin Xiao
International Conference on Learning Representations (ICLR) 2023

Variance-Aware Sparse Linear Bandits
Yan Dai, Ruosong Wang, Simon S. Du
International Conference on Learning Representations (ICLR) 2023

Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies
Rui Yuan, Simon S. Du, Robert M. Gower, Alessandro Lazaric, Lin Xiao
International Conference on Learning Representations (ICLR) 2023

Blessing of Class Diversity in Pre-training
Yulai Zhao, Jianshu Chen, Simon S. Du
International Conference on Artificial Intelligence and Statistics (AISTATS) 2023 (Notable Paper)

Beyond Information Gain: An Empirical Benchmark for Low-Switching-Cost Reinforcement Learning
Shusheng Xu, Yancheng Liang, Yunfei Li, Simon S. Du, Yi Wu
Transactions on Machine Learning Research (TMLR) 2023

2022

Learning in Congestion Games with Bandit Feedback
Qiwen Cui*, Zhihan Xiong*, Maryam Fazel, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2022

On Gap-dependent Bounds for Offline Reinforcement Learning
Xinqi Wang, Qiwen Cui, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2022

Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus
Qiwen Cui, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2022

Provable General Function Class Representation Learning in Multitask Bandits and MDPs
Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang
Conference on Neural Information Processing Systems (NeurIPS) 2022 (Spotlight)

When is Offline Two-Player Zero-Sum Markov Game Solvable?
Qiwen Cui, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2022

Near-Optimal Randomized Exploration for Tabular MDP
Zhihan Xiong*, Ruoqi Shen*, Qiwen Cui*, Maryam Fazel, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2022

Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang, Xiangyang Ji, Simon S. Du
Conference on Learning Theory (COLT) 2022

Active Multi-Task Representation Learning
Yifang Chen, Simon S. Du, Kevin Jamieson
International Conference on Machine Learning (ICML) 2022

Near-Optimal Algorithms for Autonomous Exploration and Multi-Goal Stochastic Shortest Path
Haoyuan Cai, Tengyu Ma, Simon S. Du
International Conference on Machine Learning (ICML) 2022

Denoised MDPs: Learning World Models Better Than the World Itself
Tongzhou Wang, Simon S. Du, Antonio Torralba, Phillip Isola, Amy Zhang, Yuandong Tian
International Conference on Machine Learning (ICML) 2022
[Project Page] [Code]

Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes
Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson
International Conference on Machine Learning (ICML) 2022

Nearly Optimal Policy Optimization with Stable at Any Time Guarantee
Tianhao Wu*, Yunchang Yang*, Han Zhong*, Liwei Wang, Simon S. Du, Jiantao Jiao
International Conference on Machine Learning (ICML) 2022

First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach
Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson
International Conference on Machine Learning (ICML) 2022 (Long talk)

Provable Adaptation across Multiway Domains via Representation Learning
Zhili Feng, Shaobo Han, Simon S. Du
International Conference on Learning Representations (ICLR) 2022

A Unified Framework for Conservative Exploration
Yunchang Yang*, Tianhao Wu*, Han Zhong*, Evrard Garcelon, Matteo Pirotta, Alessandro Lazaric, Liwei Wang, Simon S. Du
International Conference on Learning Representations (ICLR) 2022

Provably Efficient Policy Gradient Methods for Two-Player Zero-Sum Markov Games
Yulai Zhao, Yuandong Tian, Jason D. Lee, Simon S. Du
International Conference on Artificial Intelligence and Statistics (AISTATS) 2022

Gap-Dependent Bounds for Two-Player Markov Games
Zehao Dou, Zhuoran Yang, Zhaoran Wang, Simon S. Du
International Conference on Artificial Intelligence and Statistics (AISTATS) 2022

AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method
Xiaoxia Wu, Yuege Xie, Simon S. Du, Rachel Ward
AAAI Conference on Artificial Intelligence (AAAI) 2022

Understanding the Acceleration Phenomenon via High-Resolution Differential Equations
Bin Shi, Simon S. Du, Michael I. Jordan, Weijie J. Su
Mathematical Programming Series A, 2022

2021

Corruption Robust Active Learning
Yifang Chen, Simon S. Du, Kevin Jamieson
Conference on Neural Information Processing Systems (NeurIPS) 2021

Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
Tian Ye, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2021

Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang*, Jiaqi Yang*, Xiangyang Ji, Simon S. Du
Conference on Neural Information Processing Systems (NeurIPS) 2021

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech*, Runlong Zhou*, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric
Conference on Neural Information Processing Systems (NeurIPS) 2021 (Spotlight)
[Talk]

Nearly Horizon-Free Offline Reinforcement Learning
Tongzheng Ren, Jialian Li, Bo Dai, Simon S. Du, Sujay Sanghavi
Conference on Neural Information Processing Systems (NeurIPS) 2021

Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang, Xiangyang Ji, Simon S. Du
Conference on Learning Theory (COLT) 2021
[Talk]

Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap
Haike Xu, Tengyu Ma, Simon S. Du
Conference on Learning Theory (COLT) 2021

When is Particle Filtering Efficient for Planning in Partially Observed Linear Dynamical Systems?
Simon S. Du*, Wei Hu*, Zhiyuan Li*, Ruoqi Shen*, Zhao Song*, Jiajun Wu*
Conference on Uncertainty in Artificial Intelligence (UAI) 2021

Bilinear Classes: A Structural Framework for Provable Generalization in RL
Simon S. Du*, Sham M. Kakade*, Jason D. Lee*, Shachar Lovett*, Gaurav Mahajan*, Wen Sun*, Ruosong Wang*
International Conference on Machine Learning (ICML) 2021 (Long talk)
[Talk]

Nearly Minimax Optimal Reward-free Reinforcement Learning
Zihan Zhang, Simon S. Du, Xiangyang Ji
International Conference on Machine Learning (ICML) 2021 (Long talk)
[Talk]

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
Yifang Chen, Simon S. Du, Kevin Jamieson
International Conference on Machine Learning (ICML) 2021

On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP
Tianhao Wu*, Yunchang Yang*, Simon S. Du, Liwei Wang
International Conference on Machine Learning (ICML) 2021

Q-learning with Logarithmic Regret
Kunhe Yang, Lin F. Yang, Simon S. Du
International Conference on Artificial Intelligence and Statistics (AISTATS) 2021

Impact of Representation Learning in Linear Bandits
Jiaqi Yang, Wei Hu, Jason D. Lee, Simon S. Du
International Conference on Learning Representations (ICLR) 2021

Few-Shot Learning via Learning the Representation, Provably
Simon S. Du*, Wei Hu*, Sham M. Kakade*, Jason D. Lee*, Qi Lei*
International Conference on Learning Representations (ICLR) 2021

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
Keyulu Xu, Jingling Li, Mozhi Zhang, Simon S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka
International Conference on Learning Representations (ICLR) 2021 (Oral)

Optimism in Reinforcement Learning with Generalized Linear Function Approximation
Yining Wang, Ruosong Wang, Simon S. Du, Akshay Krishnamurthy
International Conference on Learning Representations (ICLR) 2021

Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization
Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon S. Du, Yu Wang, Yi Wu
International Conference on Learning Representations (ICLR) 2021

2020

Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning?
Ruosong Wang*, Simon S. Du*, Lin F. Yang*, Sham M. Kakade
Conference on Neural Information Processing Systems (NeurIPS) 2020

Planning with General Objective Functions: Going Beyond Total Rewards
Ruosong Wang*, Peilin Zhong*, Simon S. Du, Ruslan Salakhutdinov, Lin F. Yang
Conference on Neural Information Processing Systems (NeurIPS) 2020

On Reward-Free Reinforcement Learning with Linear Function Approximation
Ruosong Wang, Simon S. Du, Lin F. Yang, Ruslan Salakhutdinov
Conference on Neural Information Processing Systems (NeurIPS) 2020

Provably Efficient Exploration for RL with Unsupervised Learning
Fei Feng, Ruosong Wang, Wotao Yin, Simon S. Du, Lin F. Yang
Conference on Neural Information Processing Systems (NeurIPS) 2020 (Spotlight)
[Talk]

Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity
Simon S. Du*, Jason D. Lee*, Gaurav Mahajan*, Ruosong Wang*
Conference on Neural Information Processing Systems (NeurIPS) 2020

Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality
Yi Zhang*, Orestis Plevrakis*, Simon S. Du, Xingguo Li, Zhao Song, Sanjeev Arora
Conference on Neural Information Processing Systems (NeurIPS) 2020

Near-Linear Time Local Polynomial Nonparametric Estimation
Yining Wang, Yi Wu, Simon S. Du
INFORMS Journal on Computing 2020

Provable Representation Learning for Imitation Learning via Bi-level Optimization
Sanjeev Arora*, Simon S. Du*, Sham Kakade*, Yuping Luo*, Nikunj Saunshi*
International Conference on Machine Learning (ICML) 2020

Dual Sequential Monte Carlo: Tunneling Filtering and Planning in Continuous POMDPs
Yunbo Wang*, Bo Liu*, Jiajun Wu, Yuke Zhu, Simon S. Du, Li Fei-Fei, Joshua B. Tenenbaum
International Joint Conference on Artificial Intelligence (IJCAI) 2020

On Stationary-Point Hitting Time and Ergodicity of Stochastic Gradient Langevin Dynamics
Xi Chen*, Simon S. Du*, Xin T. Tong*
Journal of Machine Learning Research (JMLR) 2020

Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?
Simon S. Du*, Sham M. Kakade*, Ruosong Wang*, Lin F. Yang*
International Conference on Learning Representations (ICLR) 2020 (Spotlight)
Selected as Late-Breaking Paper in NeurIPS 2019 Deep Reinforcement Learning Workshop

Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks
Sanjeev Arora*, Simon S. Du*, Zhiyuan Li*, Ruslan Salakhutdinov*, Ruosong Wang*, Dingli Yu*
International Conference on Learning Representations (ICLR) 2020 (Spotlight)
[Code]

What Can Neural Networks Reason About?
Keyulu Xu, Jingling Li, Mozhi Zhang, Simon S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka
International Conference on Learning Representations (ICLR) 2020 (Spotlight)

2019

Gradient Descent for Non-convex Problems in Modern Machine Learning
Simon S. Du
PhD thesis, Machine Learning Department, Carnegie Mellon University
CMU SCS Dissertation Award Honorable Mention
Nominations for ACM Dissertation Award and AAAI SIGAI Dissertation Award

Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle
Simon S. Du*, Yuping Luo*, Ruosong Wang*, Hanrui Zhang*
Conference on Neural Information Processing Systems (NeurIPS) 2019

Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels
Simon S. Du*, Kangcheng Hou*, Barnabás Póczos*, Ruslan Salakhutdinov*, Ruosong Wang*, Keyulu Xu*
Conference on Neural Information Processing Systems (NeurIPS) 2019
[Code]

On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora*, Simon S. Du*, Wei Hu*, Zhiyuan Li*, Ruslan Salakhutdinov*, Ruosong Wang*
Conference on Neural Information Processing Systems (NeurIPS) 2019 (Spotlight)
[Code] [Blog]

Acceleration via Symplectic Discretization of High-Resolution Differential Equations
Bin Shi, Simon S. Du, Weijie Su, Michael I. Jordan
Conference on Neural Information Processing Systems (NeurIPS) 2019

Towards Understanding the Importance of Shortcut Connections in Residual Networks
Tianyi Liu*, Minshuo Chen*, Mo Zhou, Simon S. Du, Enlu Zhou, Tuo Zhao
Conference on Neural Information Processing Systems (NeurIPS) 2019

Provably efficient RL with Rich Observations via Latent State Decoding
Simon S. Du, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal, Miroslav Dudík, John Langford
International Conference on Machine Learning (ICML) 2019
[Code] [Blog]

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora*, Simon S. Du*, Wei Hu*, Zhiyuan Li*, Ruosong Wang*
International Conference on Machine Learning (ICML) 2019

Width Provably Matters in Optimization for Deep Linear Neural Networks
Simon S. Du*, Wei Hu*
International Conference on Machine Learning (ICML) 2019

Gradient Descent Finds Global Minima of Deep Neural Networks
Simon S. Du*, Jason D. Lee*, Haochuan Li*, Liwei Wang*, Xiyu Zhai*
International Conference on Machine Learning (ICML) 2019

Gradient Descent Provably Optimizes Over-parameterized Neural Networks
Simon S. Du*, Xiyu Zhai*, Barnabás Póczos, Aarti Singh
International Conference on Learning Representations (ICLR) 2019

Linear Convergence of the Primal-Dual Gradient Method for Convex-Concave Saddle Point Problems without Strong Convexity
Simon S. Du*, Wei Hu*
International Conference on Artificial Intelligence and Statistics (AISTATS) 2019

2018

Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced
Simon S. Du*, Wei Hu*, Jason D. Lee*
Conference on Neural Information Processing Systems (NeurIPS) 2018
ICML 2018 Workshop on Nonconvex Optimization Best Paper Award

How Many Samples are Needed to Learn a Convolutional Neural Network?
Simon S. Du*, Yining Wang*, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh
Conference on Neural Information Processing Systems (NeurIPS) 2018
[Forbes Article] NVIDIA Pioneer Award

Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
Simon S. Du, Jason D. Lee, Yuandong Tian, Barnabás Póczos, Aarti Singh
International Conference on Machine Learning (ICML) 2018 (Long Talk)

On the Power of Over-parametrization in Neural Networks with Quadratic Activation
Simon S. Du, Jason D. Lee
International Conference on Machine Learning (ICML) 2018

Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow
Xiao Zhang*, Simon S. Du*, Quanquan Gu
International Conference on Machine Learning (ICML) 2018

Discrete-Continuous Mixtures in Probabilistic Programming: Generalized Semantics and Inference Algorithms
Yi Wu, Siddharth Srivastava, Nick Hay, Simon S. Du, Stuart Russell
International Conference on Machine Learning (ICML) 2018

When is a Convolutional Filter Easy to Learn?
Simon S. Du, Jason D. Lee, Yuandong Tian
International Conference on Learning Representations (ICLR) 2018

Stochastic Zeroth-order Optimization in High Dimensions
Yining Wang, Simon S. Du, Sivaraman Balakrishnan, Aarti Singh
International Conference on Artificial Intelligence and Statistics (AISTATS) 2018 (Oral)

2017

Gradient Descent Can Take Exponential Time to Escape Saddle Points
Simon S. Du, Chi Jin, Jason D. Lee, Michael I. Jordan, Barnabás Póczos, Aarti Singh
Conference on Neural Information Processing Systems (NIPS) 2017 (Spotlight)

On the Power of Truncated SVD for General High-rank Matrix Estimation Problems
Simon S. Du, Yining Wang, Aarti Singh
Conference on Neural Information Processing Systems (NIPS) 2017

Hypothesis Transfer Learning via Transformation Functions
Simon S. Du, Jayanth Koushik, Aarti Singh, Barnabás Póczos
Conference on Neural Information Processing Systems (NIPS) 2017

High-throughput Robotic Phenotyping of Energy Sorghum Crops
Srinivasan Vijayarangan, Paloma Sodhi, Prathamesh Kini, James Bourne, Simon S. Du, Hanqi Sun, Barnabás Póczos, Dimitrios Apostolopoulos, David Wettergreen
Conference on Field and Service Robotics (FSR) 2017

Stochastic Variance Reduction Methods for Policy Evaluation
Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou
International Conference on Machine Learning (ICML) 2017

Computationally Efficient Robust Estimation of Sparse Functionals
Simon S. Du, Sivaraman Balakrishnan, Aarti Singh
Conference on Learning Theory (COLT) 2017
Merged with this paper

2016

Efficient Nonparametric Smoothness Estimation
Shashank Singh, Simon S. Du, Barnabás Póczos
Conference on Neural Information Processing Systems (NIPS) 2016

An Improved Gap-Dependency Analysis of the Noisy Power Method
Maria-Florina Balcan*, Simon S. Du*, Yining Wang*, Adams Wei Yu*
Conference on Learning Theory (COLT) 2016

2015

Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method
David G. Anderson*, Simon S. Du*, Michael W. Mahoney*, Christopher Melgaard*, Kunming Wu*, Ming Gu*
International Conference on Artificial Intelligence and Statistics (AISTATS) 2015

Preprints and Technical Reports

Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang, Qing Yang, Zhiyuan Zeng, Liliang Ren, Lucas Liu, Baolin Peng, Hao Cheng, Xuehai He, Kuan Wang, Jianfeng Gao, Weizhu Chen, Shuohang Wang*, Simon S. Du*, Yelong Shen*

Improving Human-AI Coordination through Adversarial Training and Generative Models
Paresh Chaudhary, Yancheng Liang, Daphne Chen, Simon S. Du, Natasha Jaques

SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters
Yiping Wang, Hanxian Huang, Yifang Chen, Jishen Zhao, Simon Shaolei Du, Yuandong Tian

Exploring How Generative MLLMs Perceive More Than CLIP with the Same Vision Encoder
Siting Li, Pang Wei Koh, Simon S. Du

Transformers are Efficient Compilers, Provably
Xiyu Zhai, Runlong Zhou, Liao Zhang, Simon Shaolei Du

Hybrid Preference Optimization for Alignment: Provably Faster Convergence Rates by Combining Offline Preferences with Online Exploration
Avinandan Bose, Zhihan Xiong, Aadirupa Saha, Simon Shaolei Du, Maryam Fazel

Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation
Yifang Chen, David Zhu, Simon Du, Kevin Jamieson, Yang Liu

Preference-Based Multi-Agent Reinforcement Learning: Data Coverage and Algorithmic Techniques
Natalia Zhang*, Xinqi Wang*, Qiwen Cui*, Runlong Zhou, Sham M. Kakade, Simon S. Du

Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning
Yifang Chen, Shuohang Wang, Ziyi Yang, Hiteshi Sharma, Nikos Karampatziakis, Donghan Yu, Kevin Jamieson, Simon Shaolei Du, Yelong Shen

TransFollower: Long-Sequence Car-Following Trajectory Prediction through Transformer
Meixin Zhu, Simon S. Du, Xuesong Wang, Hao Yang, Ziyuan Pu, Yinhai Wang

A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost
Minbo Gao*, Tianle Xie*, Simon S. Du, Lin F. Yang

Towards Demystifying Representation Learning with Non-contrastive Self-supervision
Xiang Wang, Xinlei Chen, Simon S. Du, Yuandong Tian

Enhanced Convolutional Neural Tangent Kernels
Zhiyuan Li*, Ruosong Wang*, Dingli Yu*, Simon S. Du, Wei Hu, Ruslan Salakhutdinov, Sanjeev Arora

Continuous Control with Contexts, Provably
Simon S. Du*, Ruosong Wang*, Mengdi Wang*, Lin F. Yang*

Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps
Simon S. Du*, Surbhi Goel*

Robust Nonparametric Regression under Huber's epsilon-contamination Model
Simon S. Du, Yining Wang, Sivaraman Balakrishnan, Pradeep Ravikumar, Aarti Singh