Welcome to Zhiwei Xu’s Homepage!

I am currently an assistant professor at the School of Artificial Intelligence, Shandong University, and a member of the General Artificial Intelligence Laboratory (SDU GAIL). I received my Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences, advised by Prof. Guoliang Fan (范国梁). My research interests include Reinforcement Learning, Multi-Agent Systems, and Large Language Model (LLM) Agents.

I am looking for collaboration opportunities, and I work closely with Prof. Bin Zhang (张斌) from the Institute of Automation, Chinese Academy of Sciences. If you are interested in my experience or research, please feel free to contact me via WeChat or email (zhiwei_xu@sdu.edu.cn).

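I recruit master's students and research assistants every year. Recommended-admission (推免) students and third-year undergraduates who have a background in artificial intelligence, reinforcement learning, or large models, and who are interested in agents, multi-agent systems, or intelligent decision-making, are welcome to contact me by email (zhiwei_xu@sdu.edu.cn) or WeChat.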

🔥 News

2026.02: One paper accepted to ICAPS 2026! Congratulations to Yuanjun! 🎉
2026.01: One first-author paper accepted to ICLR 2026!
2025.12: One paper accepted to AAMAS 2026!
2025.11: One paper accepted to AAAI 2026! Congratulations to Jiwei! 🎉
2025.09: One paper accepted to NeurIPS 2025!
2025.08: One paper accepted to EMNLP 2025!
2025.06: 🏆 Awarded the China Excellent Doctoral Dissertation Award in Agents and Multi-Agent Systems (Runner-up)!
2025.05: One first-author paper accepted to ICML 2025!
2024.12: One paper accepted to AAMAS 2025!
2024.12: Two papers accepted to AAAI 2025!

📖 Experience

2024.07 - Present Assistant Professor in School of Artificial Intelligence, Shandong University
2019.09 - 2024.06 Ph.D. in Institute of Automation, Chinese Academy of Sciences. Supervisor: Prof. Guoliang Fan
2015.09 - 2019.06 B.E. in Wu Yuzhang Honors College, Sichuan University

📝 Publications

Conferences:

QSIM: Mitigating Overestimation in Multi-Agent Reinforcement Learning via Action Similarity Weighted Q-Learning
International Conference on Automated Planning and Scheduling (ICAPS), in Dublin, Ireland, 2026.
Yuanjun Li, Bin Zhang, Hao Chen, Zhouyang Jiang, Dapeng Li, and Zhiwei Xu
[Arxiv][Code]
Peak-Return Greedy Slicing: Subtrajectory Selection for Transformer-based Offline RL
International Conference on Learning Representations (ICLR), in Rio de Janeiro, Brazil, 2026.
Zhiwei Xu, Miduo Cui, Dapeng Li, Zhihao Liu, Haifeng Zhang, Hangyu Mao, Guoliang Fan, and Bin Zhang
[OpenReview][Code]
Quality-Diversity for Multi-Agent Reinforcement Learning
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), in Paphos, Cyprus, 2026.
Hao Chen, Pengyi Li, Bin Zhang, Hu Fu, Zhiwei Xu, Ce Zhang, Xinyue Lu, and Guoliang Fan
Graph of Verification: Structured Verification of LLM Reasoning with Directed Acyclic Graphs
The 40th Annual AAAI Conference on Artificial Intelligence (AAAI), in Singapore, 2026. (Poster)
Jiwei Fang, Bin Zhang, Changwei Wang, Jin Wan, and Zhiwei Xu
[Arxiv][Code]
Belief-Calibrated Multi-Agent Consensus Seeking for Complex NLP Tasks
Thirty-ninth Conference on Neural Information Processing Systems (NeurIPS), in Mexico City, 2025. (Poster)
Wentao Deng, Jiahuan Pei, Zhiwei Xu, Zhaochun Ren, Zhumin Chen, and Pengjie Ren
[Arxiv][Code]
Bridging the Capability Gap: Joint Alignment Tuning for Harmonizing LLM-based Multi-Agent Systems
The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), in Suzhou, China, 2025. (Findings)
Minghang Zhu, Zhengliang Shi, Zhiwei Xu, Shiguang Wu, Lingjie Wang, Pengjie Ren, Zhaochun Ren, and Zhumin Chen
[Arxiv][Code]
Reidentify: Context-Aware Identity Generation for Contextual Multi-Agent Reinforcement Learning
Forty-second International Conference on Machine Learning (ICML), in Vancouver, Canada, 2025.
Zhiwei Xu, Kun Hu, Xin Xin, Weiliang Meng, Yiwei Shi, Hangyu Mao, Bin Zhang, Dapeng Li, and Jiangjin Yin
[OpenReview]
Unveiling Decision Intention for Cooperative Multi-Agent Reinforcement Learning
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), in Detroit, Michigan, USA, 2025.
Zeren Zhang, Zhiwei Xu, Guangchong Zhou, Dapeng Li, Bin Zhang, and Guoliang Fan
Efficient Communication in Multi-Agent Reinforcement Learning with Implicit Consensus Generation
The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), in Philadelphia, Pennsylvania, USA, 2025.
Dapeng Li, Na Lou, Zhiwei Xu, Bin Zhang, and Guoliang Fan
Focus on Local: Finding Reliable Discriminative Regions for Visual Place Recognition
The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), in Philadelphia, Pennsylvania, USA, 2025.
Changwei Wang, Shunpeng Chen, Yukun Song, Rongtao Xu, Zherui Zhang, Jiguang Zhang, Haoran Yang, Yu Zhang, Kexue Fu, Shide Du, Zhiwei Xu, Longxiang Gao, Li Guo, and Shibiao Xu
Decentralized Extension for Centralized Multi-Agent Reinforcement Learning via Online Distillation
International Conference on Neural Information Processing (ICONIP), in Auckland, New Zealand, 2024.
Zeren Zhang, Bin Zhang, Guangchong Zhou, Dapeng Li, Zhiwei Xu, and Guoliang Fan
Stackelberg Decision Transformer for Asynchronous Action Coordination in Multi-Agent Systems
Forty-first International Conference on Machine Learning (ICML), in Vienna, Austria, 2024.
Bin Zhang, Hangyu Mao, Lijuan Li, Zhiwei Xu, Dapeng Li, Rui Zhao, and Guoliang Fan
[Arxiv]
PDiT: Interleaving Perception and Decision-making Transformers for Deep Reinforcement Learning
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), in Auckland, New Zealand, 2024. (Full Paper)
Hangyu Mao, Rui Zhao, Ziyue Li, Zhiwei Xu, Hao Chen, Yiqun Chen, Bin Zhang, Zhen Xiao, Junge Zhang, and Jiangjin Yin
[Arxiv][Code]
From Explicit Communication to Tacit Cooperation: A Novel Paradigm for Cooperative MARL
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), in Auckland, New Zealand, 2024. (Extended Abstract)
Dapeng Li, Zhiwei Xu, Bin Zhang, and Guoliang Fan
[Arxiv]
Adaptive Parameter Sharing for Multi-Agent Reinforcement Learning
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), in Seoul, Korea, 2024.
Dapeng Li, Na Lou, Bin Zhang, Zhiwei Xu, and Guoliang Fan
[Arxiv]
Dual Self-Awareness Value Decomposition Framework without Individual Global Max for Cooperative MARL
Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), in New Orleans, USA, 2023. (Poster)
Zhiwei Xu, Bin Zhang, Dapeng Li, Guangchong Zhou, Zeren Zhang, and Guoliang Fan
[Arxiv]
Mastering Complex Coordination through Attention-based Dynamic Graph
International Conference on Neural Information Processing (ICONIP), in Changsha, China, 2023.
Guangchong Zhou, Zhiwei Xu, Zeren Zhang, and Guoliang Fan
[Arxiv]
SORA: Improving Multi-agent Cooperation with a Soft Role Assignment Mechanism
International Conference on Neural Information Processing (ICONIP), in Changsha, China, 2023.
Guangchong Zhou, Zhiwei Xu, Zeren Zhang, and Guoliang Fan
Inducing Stackelberg Equilibrium through Spatio-Temporal Sequential Decision-Making in Multi-Agent Reinforcement Learning
32nd International Joint Conference on Artificial Intelligence (IJCAI), in Macao SAR, China, 2023.
Bin Zhang, Lijuan Li, Zhiwei Xu, Dapeng Li, and Guoliang Fan
[Arxiv]
SEA: A Spatially Explicit Architecture for Multi-Agent Reinforcement Learning
International Joint Conference on Neural Networks (IJCNN), in Queensland, Australia, 2023.
Dapeng Li, Zhiwei Xu, Bin Zhang, and Guoliang Fan
[Arxiv]
Hierarchical Multi-Agent Reinforcement Learning with Intrinsic Reward Rectification
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), in Rhodes Island, Greece, 2023. (Poster)
Zhihao Liu, Zhiwei Xu, and Guoliang Fan
Consensus Learning for Cooperative Multi-Agent Reinforcement Learning
Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), in Washington, DC, USA, 2023. (Oral)
Zhiwei Xu, Bin Zhang, Dapeng Li, Zeren Zhang, Guangchong Zhou, Hao Chen, and Guoliang Fan
[Arxiv][Code]
HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism
Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), in Washington, DC, USA, 2023. (Oral)
Zhiwei Xu, Yunpeng Bai, Bin Zhang, Dapeng Li, and Guoliang Fan
[Arxiv][Code]
Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), in New Orleans, USA, 2022. (Spotlight)
Zhiwei Xu, Dapeng Li, Bin Zhang, Yuan Zhan, Yunpeng Bai, and Guoliang Fan
[Arxiv]
Multi-Agent Hyper-Attention Policy Optimization
International Conference on Neural Information Processing (ICONIP), in New Delhi, India, 2022.
Bin Zhang*, Zhiwei Xu*, Yiqun Chen*, Dapeng Li, Yunpeng Bai, Guoliang Fan, and Lijuan Li
Efficient Policy Generation in Multi-Agent Systems via Hypergraph Neural Network
International Conference on Neural Information Processing (ICONIP), in New Delhi, India, 2022.
Bin Zhang, Yunpeng Bai, Zhiwei Xu, Dapeng Li, and Guoliang Fan
[Arxiv]
Learn Effective Representation for Deep Reinforcement Learning
IEEE International Conference on Multimedia and Expo (ICME), in Taipei, 2022. (Oral)
Yuan Zhan, Zhiwei Xu, and Guoliang Fan
SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), in Auckland, New Zealand, 2022. (Full Paper)
Zhiwei Xu, Yunpeng Bai, Dapeng Li, Bin Zhang, and Guoliang Fan
[Arxiv][Code]
Learning to Coordinate via Multiple Graph Neural Networks
International Conference on Neural Information Processing (ICONIP), in Bali, Indonesia, 2021.
Zhiwei Xu, Bin Zhang, Yunpeng Bai, Dapeng Li, and Guoliang Fan
[Arxiv][Code]
MMD-MIX: Value Function Factorisation with Maximum Mean Discrepancy for Cooperative Multi-Agent Reinforcement Learning
International Joint Conference on Neural Networks (IJCNN), in Shenzhen, China, 2021. (Poster)
Zhiwei Xu, Dapeng Li, Yunpeng Bai, and Guoliang Fan
[Arxiv]

Journals:

An Evolutionary Reinforcement Learning Framework for Joint Work Package Sizing and Scheduling with Uncertainties
European Journal of Operational Research, in press, 2026.
Nianmin Zhang, Xiao Li, and Zhiwei Xu

Pre-prints:

Beyond Local Views: Global State Inference with Diffusion Models for Cooperative Multi-Agent Reinforcement Learning
Zhiwei Xu, Hangyu Mao, Nianmin Zhang, Xin Xin, Pengjie Ren, Dapeng Li, Bin Zhang, Guoliang Fan, Zhumin Chen, Changwei Wang, and Jiangjin Yin
[Arxiv]
Verco: Learning Coordinated Verbal Communication for Multi-agent Reinforcement Learning
Dapeng Li, Hang Dong, Lu Wang, Bo Qiao, Si Qin, Qingwei Lin, Dongmei Zhang, Qi Zhang, Zhiwei Xu, Bin Zhang, and Guoliang Fan
[Arxiv]
Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach
Bin Zhang, Hangyu Mao, Jingqing Ruan, Ying Wen, Yang Li, Shao Zhang, Zhiwei Xu, Dapeng Li, Ziyue Li, Rui Zhao, Lijuan Li, and Guoliang Fan
[Arxiv]
TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents
Jingqing Ruan*, Yihong Chen*, Bin Zhang*, Zhiwei Xu*, Tianpeng Bao*, Guoqing Du*, Shiwei Shi*, Hangyu Mao*, Ziyue Li, Xingyu Zeng, and Rui Zhao
[Arxiv]
Style Miner: Find Significant and Stable Explanatory Factors in Time Series with Constrained Reinforcement Learning
Dapeng Li, Feiyang Pan, Jia He, Zhiwei Xu, Dandan Tu, and Guoliang Fan
[Arxiv]

💻 Services

Program Committee Member or Reviewer:

Neural Information Processing Systems (NeurIPS)
International Conference on Learning Representations (ICLR)
International Conference on Machine Learning (ICML)
AAAI Conference on Artificial Intelligence (AAAI)
International Joint Conference on Artificial Intelligence (IJCAI)
International Conference on Autonomous Agents and Multiagent Systems (AAMAS)

Academic Committee Membership:

Executive Committee Member, Multi-Agent Systems Technical Group, CCF-AI

🎖 Honors and Awards

2025 China Excellent Doctoral Dissertation Award in Agents and Multi-Agent Systems (Runner-up)
2023 National Scholarship for doctoral students, Ministry of Education
2022 Merit Student, University of Chinese Academy of Sciences
2019 Outstanding Undergraduate, Sichuan University
2016 National Scholarship for undergraduate students, Ministry of Education

Zhiwei Xu 徐志伟