A collection of tricks to speed up LLMs. See our transformer-tricks papers on arXiv.
pip install transformer-tricks