Galvatron, an Efficient Transformer Training Framework for Multiple GPUs Using Automatic Parallelism
pip install hetu-galvatron