About me

I am a 4th-year Ph.D. student in the Conversational AI Group at the Department of Computer Science and Technology, Tsinghua University, under the supervision of Prof. Minlie Huang. Previously, I was an intern at Microsoft Research Asia, where I was mentored by Dr. Li Dong. My research interests focus on developing efficient methods for the entire life cycle of language models, including pre-training, downstream adaptation, and inference.

Education

  • 2021.9 - Present: Ph.D. Student, Department of Computer Science and Technology, Tsinghua University
  • 2017.9 - 2021.6: B.Eng., Department of Computer Science and Technology, Tsinghua University

Publications

Conference Papers

  • Yuxian Gu, Li Dong, Furu Wei, Minlie Huang. MiniLLM: Knowledge Distillation of Large Language Models. ICLR 2024. [pdf] [code] [huggingface]

  • Yuxian Gu, Li Dong, Furu Wei, Minlie Huang. Pre-Training to Learn in Context. ACL 2023 (Main, Long Paper, Oral). [pdf] [code]

  • Yuxian Gu, Pei Ke, Xiaoyan Zhu, Minlie Huang. Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization. EMNLP 2022 (Main, Long Paper, Oral). [pdf] [code]

  • Yuxian Gu*, Xu Han*, Zhiyuan Liu, Minlie Huang. PPT: Pre-Trained Prompt Tuning for Few-Shot Learning. ACL 2022 (Main, Long Paper). [pdf] [code]

  • Daixuan Cheng, Yuxian Gu, Shaohan Huang, Junyu Bi, Minlie Huang, Furu Wei. Instruction Pre-Training: Language Models are Supervised Multitask Learners. EMNLP 2024 (Main, Long Paper). [pdf] [code] [huggingface]

  • Qi Zhu, Yuxian Gu, Lingxiao Luo, Bing Li, Cheng Li, Wei Peng, Minlie Huang, Xiaoyan Zhu. When does Further Pre-Training MLM Help? An Empirical Study on Task-Oriented Dialog Pre-Training. EMNLP 2021 Workshop (Best Paper). [pdf] [code]

  • Yuxian Gu, Zhengyan Zhang, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun. Train No Evil: Selective Masking for Task-Guided Pre-Training. EMNLP 2020 (Short Paper). [pdf] [code]

  • Xin Lv, Yuxian Gu, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu. Adapting Meta Knowledge Graph Information for Multi-Hop Reasoning over Few-Shot Relations. EMNLP 2019 (Short Paper). [pdf] [code]

Journal Papers

  • Yuxian Gu*, Jiaxin Wen*, Hao Sun*, Yi Song, Pei Ke, Chujie Zheng, Zheng Zhang, Jianzhu Yao, Lei Liu, Xiaoyan Zhu, Minlie Huang. EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training. 2022. Machine Intelligence Research. [pdf] [code]

  • Zhengyan Zhang*, Yuxian Gu*, Xu Han*, Shengqi Chen*, Chaojun Xiao*, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun. CPM-2: Large-Scale Cost-Effective Pre-Trained Language Models. 2022. AI Open. [pdf] [pre-train code] [fine-tune code]

  • Xu Han*, Zhengyan Zhang*, Ning Ding*, Yuxian Gu*, Xiao Liu*, Yuqi Huo*, Jiezhong Qiu, Yuan Yao, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu. Pre-Trained Models: Past, Present and Future. 2022. AI Open. [pdf]

  • Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun. CPM: A Large-Scale Generative Chinese Pre-Trained Language Model. 2021. AI Open. [pdf] [pre-train code] [fine-tune code] [inference code]

Preprints

  • Yuxian Gu, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang. MiniPLM: Knowledge Distillation for Pre-Training Language Models. arXiv preprint 2024. [pdf] [code] [huggingface]

  • Yuxian Gu, Li Dong, Yaru Hao, Qingxiu Dong, Minlie Huang, Furu Wei. Data Selection via Optimal Control for Language Models. arXiv preprint 2024. [pdf] [code] [huggingface]

  • Yuxian Gu, Li Dong, Yaru Hao, Qingxiu Dong, Minlie Huang, Furu Wei. Towards Optimal Learning of Language Models. arXiv preprint 2024. [pdf] [code]

  • Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei. Direct Preference Knowledge Distillation for Large Language Models. arXiv preprint 2024. [pdf] [code]

  • Yaru Hao, Yutao Sun, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei. Structured Prompting: Scaling In-Context Learning to 1,000 Examples. [pdf] [code]

Services

  • Program Committee Member (Conference Reviewer): EMNLP 2022-2023, ACL 2023-2024, ARR 2023-2024, NeurIPS 2024

Teaching

I was a TA for the following undergraduate courses:

  • Artificial Neural Network (2020 Fall, 2021 Fall, 2022 Fall, 2023 Fall)
  • Object-Oriented Programming (2021 Spring, 2022 Spring, 2023 Spring, 2024 Spring)

Selected Honors and Awards

  • Excellent Graduate, Tsinghua University, 2021
  • Outstanding Graduate, Dept. CST, Tsinghua University, 2021
  • Outstanding Undergraduate Dissertation, Tsinghua University, 2021
  • Overall Scholarship, Dept. CST, Tsinghua University, 2020
  • Science and Technology Innovation Excellence Scholarship, Dept. CST, Tsinghua University, 2019
  • Silver Prize, Asia Student Supercomputer Challenge, Asia Supercomputer Community, 2019
  • Overall Scholarship, Dept. CST, Tsinghua University, 2018

Miscellaneous

I like to play Jianzi (Jianqiu), a competitive sport similar to volleyball. I am the captain of the Tsinghua Jianqiu Team and was previously the captain of the Beijing Youth Jianqiu Team. I once won the championship of the Jianqiu Competition in Beijing and represented Beijing in the National Youth Olympic Games.