About Me

Hi! I am a Ph.D. student at the Gaoling School of Artificial Intelligence, Renmin University of China, where I am advised by Prof. Yankai Lin. Before that, I received my Master’s degree from the Center for Data Science at Peking University, where I was a member of the LANCO group and was advised by Prof. Xu Sun. I am interested in Machine Learning (ML) and Natural Language Processing (NLP). Specifically, I am working on (1) alignment and security problems of Large Language Models and (2) Large Language Model reasoning.

Education

Ph.D. student at the Gaoling School of Artificial Intelligence, Renmin University of China, Sept. 2023 - Now.
Master's student in Data Science (Statistics), Peking University, Sept. 2020 - July 2023.
Visiting student in Statistics, University of California, Berkeley, Jan. 2019 - May 2019.
Bachelor's degree in Mathematics and Applied Mathematics (Honors Program), Xi’an Jiaotong University, Sept. 2016 - July 2020.

Internship

Research Intern in Hunyuan Team, Tencent Inc., July 2025 - Now.
Research Intern in Microsoft Research Asia, Sept. 2024 - May 2025.
Research Intern in WeChat AI, Tencent Inc., Jan. 2021 - Aug. 2024.

Preprints

(# denotes Equal Contribution)

Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation
Wenkai Yang, Weijie Liu, Ruobing Xie, Kai Yang, Saiyong Yang, Yankai Lin
[arxiv, code]
DeepCritic: Deliberate Critique with Large Language Models
Wenkai Yang#, Jingwen Chen#, Yankai Lin, Ji-Rong Wen
[arxiv, code]

Selected Publications

Full List

(# denotes Equal Contribution)

LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
Wenkai Yang, Weijie Liu, Ruobing Xie, Yiju Guo, Lulu Wu, Saiyong Yang, Yankai Lin
ICLR 2026 [arxiv, code]
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
Wenkai Yang, Shuming Ma, Yankai Lin, Furu Wei
NeurIPS 2025 [arxiv, code]
Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL
Wei Yao#, Wenkai Yang#, Ziqiao Wang, Yankai Lin, Yong Liu
Findings of ACL 2025 [arxiv, code]
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
Wenkai Yang, Shiqi Shen, Guangyao Shen, Wei Yao, Yong Liu, Zhi Gong, Yankai Lin, Ji-Rong Wen
ICLR 2025 [arxiv, code]
Distilling Rule-based Knowledge into Large Language Models
Wenkai Yang, Yankai Lin, Jie Zhou, Ji-Rong Wen
COLING 2025 [arxiv, code]
Exploring Backdoor Vulnerabilities of Chat Models
Wenkai Yang#, Yunzhuo Hao#, Yankai Lin
COLING 2025 [arxiv, code]
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
Wenkai Yang#, Xiaohan Bi#, Yankai Lin, Sishuo Chen, Jie Zhou, Xu Sun
NeurIPS 2024 [url, arxiv, code]
Decentralized Decoupled Training for Federated Long-Tailed Learning
Wenkai Yang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
Transactions on Machine Learning Research [url, arxiv, code]
Towards Codable Text Watermarking for Large Language Models
Lean Wang#, Wenkai Yang#, Deli Chen#, Hao Zhou, Yankai Lin, Fandong Meng, Jie Zhou, Xu Sun
ICLR 2024 [url, arxiv, code]
When to Trust Aggregated Gradients: Addressing Negative Client Sampling in Federated Learning
Wenkai Yang, Yankai Lin, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun
Transactions on Machine Learning Research [url, arxiv, code]
RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models
Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun
EMNLP 2021 [url, arxiv, code]
Rethinking Stealthiness of Backdoor Attack against NLP Models
Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun
ACL 2021 [url, code]
Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models
Wenkai Yang, Lei Li, Zhiyuan Zhang, Xuancheng Ren, Xu Sun, Bin He
NAACL-HLT 2021 [url, arxiv, code]

Awards

Excellent Graduate of Beijing Ordinary Colleges and Universities, 2022-2023
Excellent Graduate of Peking University, 2022-2023
National Scholarship of China (The highest scholarship for graduate students), 2020-2021
Pacemaker to Merit Student (The highest honor for graduate students), 2020-2021
Xingye Bank Scholarship, 2021-2022
Merit Student of PKU, 2021-2022

Contact

Email: kevenyang98 (at) gmail (dot) com

Wenkai Yang