ORPO Trainer replacement (#1551)

* WIP use trl ORPOTrainer

* fixes to make orpo work with trl

* fix the chat template loading

* make sure to handle the special tokens and add_generation for assistant turn too
Wing Lian
2024-04-19 17:25:36 -04:00
committed by GitHub
parent 0e8f340945
commit 7d1d22f72f
10 changed files with 151 additions and 26 deletions

@@ -39,6 +39,6 @@ s3fs
gcsfs
# adlfs
-trl @ git+https://github.com/huggingface/trl.git@0ee349dcd43b0f4b3169449f16751c38ac4a609f
+trl==0.8.5
zstandard==0.22.0
fastcore