About Me
Hi! I am a Ph.D. student at the Gaoling School of Artificial Intelligence, Renmin University of China, where I am advised by Prof. Yankai Lin. Before that, I received my Master’s degree from the Center for Data Science at Peking University, where I was a member of the LANCO group and was advised by Prof. Xu Sun. I am interested in Machine Learning (ML) and Natural Language Processing (NLP). Specifically, I work on the alignment and security problems of Large Language Models.
Education
- Ph.D. student in Gaoling School of Artificial Intelligence, Renmin University of China, Sept. 2023 - Now.
- Master’s student in Data Science (Statistics), Peking University, Sept. 2020 - July 2023.
- Visiting student in Statistics, University of California, Berkeley, Jan. 2019 - May 2019.
- Bachelor’s degree in Mathematics and Applied Mathematics (Honors Program), Xi’an Jiaotong University, Sept. 2016 - July 2020.
Internship
- Research Intern at WeChat AI, Tencent Inc., Jan. 2021 - Now.
Preprints
(# denotes Equal Contribution)
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
Wenkai Yang, Shiqi Shen, Guangyao Shen, Zhi Gong, Yankai Lin
[arxiv, code]
Exploring Backdoor Vulnerabilities of Chat Models
Yunzhuo Hao#, Wenkai Yang#, Yankai Lin
[arxiv, code]
Enabling Large Language Models to Learn from Rules
Wenkai Yang, Yankai Lin, Jie Zhou, Jirong Wen
[arxiv]
Selected Publications
(# denotes Equal Contribution)
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
Wenkai Yang#, Xiaohan Bi#, Yankai Lin, Sishuo Chen, Jie Zhou, Xu Sun
NeurIPS 2024 [arxiv, code]
Decentralized Decoupled Training for Federated Long-Tailed Learning
Wenkai Yang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
Transactions on Machine Learning Research [url, arxiv, code]
Towards Codable Text Watermarking for Large Language Models
Lean Wang#, Wenkai Yang#, Deli Chen#, Hao Zhou, Yankai Lin, Fandong Meng, Jie Zhou, Xu Sun
ICLR 2024 [url, arxiv, code]
When to Trust Aggregated Gradients: Addressing Negative Client Sampling in Federated Learning
Wenkai Yang, Yankai Lin, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun
Transactions on Machine Learning Research [url, arxiv, code]
RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models
Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun
EMNLP 2021 [url, arxiv, code]
Rethinking Stealthiness of Backdoor Attack against NLP Models
Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun
ACL 2021 [url, code]
Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models
Wenkai Yang, Lei Li, Zhiyuan Zhang, Xuancheng Ren, Xu Sun, Bin He
NAACL-HLT 2021 [url, arxiv, code]
Awards
- Excellent Graduate of Beijing Ordinary Colleges and Universities, 2022-2023
- Excellent Graduate of Peking University, 2022-2023
- National Scholarship of China (The highest scholarship for graduate students), 2020-2021
- Pacemaker to Merit Student (The highest honor for graduate students), 2020-2021
- Xingye Bank Scholarship, 2021-2022
- Merit Student of PKU, 2021-2022
Contact
Email: kevenyang98 (at) gmail (dot) com