llcuda

0.2.1
CUDA-accelerated LLM inference for Python with automatic server management