Ziqi Wang (王梓齐)
I am a second-year master's student at the School of Artificial Intelligence and Data Science, University of Science and Technology of China, advised by Tong Xu and Enhong Chen.
I am currently a research intern at Baidu (ERNIE Star), working under the supervision of Jing Liu and Haifeng Wang and focusing on cutting-edge research in LLM reasoning.
I regularly share technical insights on Rednote—follow me for updates!
Previously, I interned at StepFun, where I was a core contributor to Step2-mini (under Houyi Li and Xiangyu Zhang) and was deeply engaged in Step3. There I gained comprehensive LLM expertise across the full development cycle: data engineering, pretraining (my main focus), long-context, post-training, and reasoning systems.
This role also gave me hands-on experience with large-scale distributed training across thousands of GPUs.
My research interests span the full LLM stack 😼 (pretraining, mid-training (agents), reasoning, and training dynamics).
I am currently looking for full-time opportunities on LLM foundation model teams at top tech companies, with a focus on LLM pretraining and reasoning.
Feel free to contact me anytime!
Email / Scholar / Weixin / Rednote
Projects / Research
I'm interested in LLM pre-training, reasoning, long-context, multimodal models, information extraction, and image generation models.
Step2-mini
Stepfun Team (Core contributor)
A lightweight LLM designed with innovative attention mechanisms for fast online responses.
Step3: Cost-Effective Multimodal Intelligence
Stepfun Team (Core contributor)
A cutting-edge multimodal reasoning model—built on a Mixture-of-Experts architecture with 321B total parameters and 38B active.
Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding
Stepfun Team
Model-system co-design of Step-3, specifically engineered for the test-time scaling paradigm with the primary optimization objective of minimizing decoding cost.
Granular Entity Mapper: Advancing Fine-grained Multimodal Named Entity Recognition and Grounding
Ziqi Wang, Chen Zhu, Zhi Zheng, Xinhang Li, Tong Xu, Yongyi He, Qi Liu, Ying Yu, Enhong Chen
Findings of the Association for Computational Linguistics: EMNLP 2024
A novel method for fine-grained text entity recognition with simultaneous localization of corresponding image entities.
Can Mixture-of-Experts Surpass Dense LLMs Under Strictly Equal Resource?
Houyi Li, Ka Man Lo, Ziqi Wang, Zili Wang, Wenzhen Zheng, Shuigeng Zhou, Xiangyu Zhang, Daxin Jiang
Are there sparsity regimes in which Mixture-of-Experts (MoE) models surpass dense models under the same computational budget?
Beyond the Known: An Unknown-Aware Large Language Model for Open-Set Text Classification
Xi Chen, Chuan Qin, Ziqi Wang, Shasha Hu, Chao Wang, Hengshu Zhu, Hui Xiong
A novel method for open-set text classification using LLM uncertainty estimation and mitigation.
Is Compression Really Linear with Code Intelligence?
Shijie Xuyang, Xianzhen Luo, Tianhao Cheng, Zheng Chu, Houyi Li, Ziqi Wang, Siming Huang, Qingfu Zhu, Qiufeng Wang, Xiangyu Zhang, Shuigeng Zhou, Wanxiang Che
Accurately measuring the code intelligence of base and chat models is critical.
CAT-GNN: Enhancing Credit Card Fraud Detection via Causal Temporal Graph Neural Networks
Yifan Duan, Guibin Zhang, Shilong Wang, Xiaojiang Peng, Ziqi Wang, Junyuan Mao, Hao Wu, Xinke Jiang, Kun Wang
A novel method for credit card fraud detection via causal temporal graph neural networks.
Baidu | ERNIE Team
Research Intern (ERNIE Star Top Talent Program)
April 2025 - Now
Supervised by Jing Liu, Haifeng Wang
Happy to join ERNIE in exploring the reasoning mechanisms of large language models!
Stepfun | Foundation Team
Pretraining Algorithm Intern
August 2024 - April 2025
Supervised by Houyi Li, Xiangyu Zhang
Hands-on, full-stack LLM training: data engineering, pretraining, post-training, long-context reasoning, and large-scale distributed training!
Iflytek | Core R&D Platform
Algorithm Intern
December 2023 - March 2024
Supervised by Haochen Jiang, Shan He
Some interesting experiments on conditional image generation for real-world online applications!
Boss Zhipin | CSL Lab
Research Intern
April 2023 - August 2023
Supervised by Chen Zhu, Hengshu Zhu
Beginning my research journey with a focus on the robustness of text-to-image diffusion models. Grateful for the mentorship that shaped my research insights!
Selected Awards
- First Prize Scholarship, University of Science and Technology of China 2023
- Outstanding Graduates of University of Science and Technology of China 2023
- Outstanding Graduation Thesis of University of Science and Technology of China 2023
- First Prize of the Chinese Mathematics Competitions (Anhui) 2022
- Scholarship of China National Petroleum Corporation 2022 (3%)
- Outstanding Student Scholarship (Silver), University of Science and Technology of China 2021
- Outstanding Student Scholarship (Bronze), University of Science and Technology of China 2020