trl

0.28.0
30.42M

Train transformer language models with reinforcement learning.