Compare commits

3 Commits

Author SHA1 Message Date
Dan Saunders 1defb8a955 Merge branch 'main' into destroy-pg 2025-03-31 14:36:43 -04:00
Dan Saunders 70b466aa67 ray bugfix 2025-03-31 18:35:41 +00:00
Dan Saunders ef6eb77cc8 destroy process group on Ctrl+C / training or eval run (#2457) 2025-03-31 12:36:47 -04:00
    * fix nccl pg destroy warning
    * update

@@ -509,6 +509,7 @@ def train(
     # Save the trained model and cleanup
     save_trained_model(cfg, trainer, model, safe_serialization)
     create_model_card(cfg, trainer)
-    cleanup_distributed()
+    if not cfg.use_ray:
+        cleanup_distributed()
     return model, tokenizer, trainer
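
The hunk shows only the call site, not the body of cleanup_distributed(). Going by the commit message ("destroy process group", "fix nccl pg destroy warning"), a minimal sketch of the guarded teardown might look like the following. The names cleanup_distributed_sketch and finish_run are hypothetical, and the assumption that the helper wraps torch.distributed.destroy_process_group() is not confirmed by the diff.

```python
from types import SimpleNamespace

try:
    import torch.distributed as dist
except ImportError:  # torch not installed: teardown becomes a no-op
    dist = None


def cleanup_distributed_sketch() -> bool:
    """Destroy the default process group if one exists.

    Hypothetical stand-in for cleanup_distributed(); returns True only
    when a process group was actually destroyed.
    """
    if dist is None or not dist.is_available() or not dist.is_initialized():
        return False
    dist.destroy_process_group()
    return True


def finish_run(cfg) -> bool:
    """Mirror the guard in the hunk: skip teardown when cfg.use_ray is set."""
    if cfg.use_ray:
        return False
    return cleanup_distributed_sketch()


if __name__ == "__main__":
    # In a plain single-process run no process group exists, so both calls
    # return without destroying anything.
    print(finish_run(SimpleNamespace(use_ray=True)))
    print(finish_run(SimpleNamespace(use_ray=False)))
```

Checking is_initialized() before destroying keeps the helper safe to call from single-process runs and from interrupt handlers, where no process group may exist.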