Xiangyu Li

Ph.D. Student, Institute for AI Industry Research (AIR), Tsinghua University

I am a fourth-year Ph.D. candidate at AIR, Tsinghua University, working on on-device and physical AI. My research spans efficient inference systems and autonomous embodied agents.

Selected Publications

OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism

Xiangyu Li, Huaizhi Tang, Xin Ding, Weijun Wang, Ting Cao, Yunxin Liu

arXiv preprint, 2026

Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices

Xiangyu Li*, Chengyu Yin*, Weijun Wang, Jianyu Wei, Ting Cao, Yunxin Liu

arXiv preprint, 2025 · Conditionally Accepted to MobiSys 2026

FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices

Xiangyu Li, Yuanchun Li, Yuanzhe Li, Ting Cao, Yunxin Liu

MobiCom 2024 · Oral Presentation, Artifact Evaluated

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu

arXiv preprint, 2024 · Survey, "Efficiency" Section Lead

Latest Posts

Enhancing GPTQv2 Format Support in vLLM: Analysis and Implementation

Deep technical analysis of GPTQv2 format limitations in vLLM, and implementation of CUDA kernel adaptations to enable efficient low-bit/asymmetric quantization inference.

Vision-Language-Action (VLA) Models: A Review of Recent Progress

Recent VLAs have evolved from discrete to continuous, and from single-system (System 1 only) to dual-system designs.

Reading Notes of Dario Amodei's Blog

Reading Notes of Dario Amodei's Blog.