trl

0.19.1
15.65M

Train transformer language models with reinforcement learning.