Xiangyu Li

Ph.D. Candidate, AIR, Tsinghua University

I am a fourth-year Ph.D. candidate at AIR, THU, working on efficient edge AI systems. My current research focuses on model-system co-design for embodied AI, especially VLA training and deployment.

Selected Papers

OxyGen: Unified KV Cache Management for VLA Inference under Multi-Task Parallelism

Xiangyu Li, Huaizhi Tang, Xin Ding, Weijun Wang, Ting Cao, Yunxin Liu

ArXiv preprint, 2026

Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices

Xiangyu Li*, Chengyu Yin*, Weijun Wang, Jianyu Wei, Ting Cao, Yunxin Liu

ACM MobiSys 2026, Featured Paper (On-Device AI), Results Reproduced @AE

FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices (Citations: 52)

Xiangyu Li, Yuanchun Li, Yuanzhe Li, Ting Cao, Yunxin Liu

ACM MobiCom 2024, Results Replicated @AE

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security (Citations: 415)

Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu

ArXiv preprint, 2024, Survey & Position, “Efficiency” Section Lead

Latest Posts

Enhancing GPTQv2 Format Support in vLLM: Analysis and Implementation

A deep technical analysis of GPTQv2 format limitations in vLLM, and an implementation of CUDA kernel adaptations that enable efficient low-bit/asymmetric quantization inference.

Vision-Language-Action (VLA) Models: A Review of Recent Progress

Recent VLAs have evolved from discrete to continuous, and from single-system (System 1 only) to dual-system designs.

Reading Notes of Dario Amodei's Blog

Reading notes on Dario Amodei's blog.