dfloat11

0.5.0
38.76k

DFloat11: Fast and memory-efficient GPU inference for losslessly compressed LLMs and diffusion models