Tagged: Development

Enhancing GPTQv2 Format Support in vLLM: Analysis and Implementation

A deep technical analysis of GPTQv2 format limitations in vLLM and an implementation of CUDA kernel adaptations that enable efficient low-bit/asymmetric quantization inference.

Cheatsheet for Setting up Android Smartphones

Quickly setting up Android smartphones for development.

Cheatsheet for Setting up Termux on Android Smartphones

Quickly setting up Termux on Android smartphones for development.

Cheatsheet for Setting up Pi Devices

Quickly setting up new single-board computers such as the Raspberry Pi.