Ziqi Wang (王梓齐)
I am a second-year master's student at the School of Artificial Intelligence and Data Science, University of Science and Technology of China, advised by Tong Xu and Enhong Chen.
I am currently a research intern at Baidu (ERNIE Star), working under the supervision of Jing Liu and Haifeng Wang and focusing on cutting-edge research in LLM reasoning.
I regularly share technical insights on Rednote—follow me for updates!
Previously, I interned at StepFun, where I was a core contributor to Step2-mini (under Houyi Li and Xiangyu Zhang) and was deeply engaged in Step3. There I gained comprehensive LLM expertise across the full development cycle: data engineering, pretraining (my main focus), long-context, post-training, and reasoning systems.
This role also gave me hands-on experience with large-scale distributed training across thousands of GPUs.
My research interests span the full LLM stack 😼 (pretraining, mid-training (agents), reasoning, and training dynamics).
I am currently looking for full-time opportunities on LLM foundation model teams at top tech companies, with a focus on LLM pretraining and reasoning.
Feel free to contact me anytime!
Email / Scholar / Weixin / Rednote
Projects / Research
I'm interested in LLM pre-training, reasoning, long-context, multimodal models, information extraction, and image generation models.
Step2-mini
Stepfun Team (Core contributor)
A lightweight LLM designed with innovative attention mechanisms for fast online responses.
Step3: Cost-Effective Multimodal Intelligence
Stepfun Team (Core contributor)
A cutting-edge multimodal reasoning model—built on a Mixture-of-Experts architecture with 321B total parameters and 38B active.
Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding
Stepfun Team
Model-system co-design of Step-3, specifically engineered for the test-time scaling paradigm with the primary optimization objective of minimizing decoding cost.
Granular Entity Mapper: Advancing Fine-grained Multimodal Named Entity Recognition and Grounding
Ziqi Wang, Chen Zhu, Zhi Zheng, Xinhang Li, Tong Xu, Yongyi He, Qi Liu, Ying Yu, Enhong Chen
Findings of the Association for Computational Linguistics: EMNLP 2024
A novel method for fine-grained text entity recognition with simultaneous localization of corresponding image entities.
Can Mixture-of-Experts Surpass Dense LLMs Under Strictly Equal Resource?
Houyi Li, Ka Man Lo, Ziqi Wang, Zili Wang, Wenzhen Zheng, Shuigeng Zhou, Xiangyu Zhang, Daxin Jiang
Are there sparsity regimes in which Mixture-of-Experts (MoE) models surpass dense models under the same computational budget?
Beyond the Known: An Unknown-Aware Large Language Model for Open-Set Text Classification
Xi Chen, Chuan Qin, Ziqi Wang, Shasha Hu, Chao Wang, Hengshu Zhu, Hui Xiong
A novel method for open-set text classification using LLM uncertainty estimation and mitigation.
Is Compression Really Linear with Code Intelligence?
Shijie Xuyang, Xianzhen Luo, Tianhao Cheng, Zheng Chu, Houyi Li, Ziqi Wang, Siming Huang, Qingfu Zhu, Qiufeng Wang, Xiangyu Zhang, Shuigeng Zhou, Wanxiang Che
Accurately measuring the code intelligence of base and chat models is critical.
CAT-GNN: Enhancing Credit Card Fraud Detection via Causal Temporal Graph Neural Networks
Yifan Duan, Guibin Zhang, Shilong Wang, Xiaojiang Peng, Ziqi Wang, Junyuan Mao, Hao Wu, Xinke Jiang, Kun Wang
A novel method for credit card fraud detection via causal temporal graph neural networks.
Baidu | ERNIE Team
Research Intern (ERNIE Star Top Talent Program)
April 2025 - Now
Supervised by Jing Liu, Haifeng Wang
Happy to join ERNIE in exploring the reasoning mechanisms of large language models!
Stepfun | Foundation Team
Pretraining Algorithm Intern
August 2024 - April 2025
Supervised by Houyi Li, Xiangyu Zhang
Hands-on, full-stack LLM training: data engineering, pretraining, post-training, long-context reasoning, and large-scale distributed training!
Iflytek | Core R&D Platform
Algorithm Intern
December 2023 - March 2024
Supervised by Haochen Jiang, Shan He
Some interesting experiments on conditional image generation for real-world online applications!
Boss Zhipin | CSL Lab
Research Intern
April 2023 - August 2023
Supervised by Chen Zhu, Hengshu Zhu
Beginning my research journey with a focus on the robustness of text-to-image diffusion models. Grateful for the mentorship that shaped my research insights!
Selected Awards
- First Prize Scholarship, University of Science and Technology of China 2023
- Outstanding Graduates of University of Science and Technology of China 2023
- Outstanding Graduation Thesis of University of Science and Technology of China 2023
- First Prize of the Chinese Mathematics Competitions (Anhui) 2022
- Scholarship of China National Petroleum Corporation 2022 (3%)
- Outstanding Student Scholarship (Silver), University of Science and Technology of China 2021
- Outstanding Student Scholarship (Bronze), University of Science and Technology of China 2020