more dpo fixes for dataset loading and docs (#1185) [skip ci]

* more dpo fixes for dataset loading and docs

* preprocess dpo datasets
Wing Lian
2024-01-24 14:23:55 -05:00
committed by GitHub
parent d85d4942cf
commit 5bce45f800
4 changed files with 73 additions and 4 deletions


@@ -34,6 +34,16 @@ datasets:
rl: ipo
```
#### Using local dataset files
```yaml
datasets:
  - ds_type: json
    data_files:
      - orca_rlhf.jsonl
    split: train
    type: chatml.intel
```
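For reference, a local file such as `orca_rlhf.jsonl` can be produced with a few lines of Python. The sketch below assumes that `chatml.intel` reads the same fields as the Intel/orca_dpo_pairs dataset (`system`, `question`, `chosen`, `rejected`); that field list is an assumption and is not stated in this change. The helper `write_preference_jsonl` and the sample row are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Assumed field names, modelled on the Intel/orca_dpo_pairs layout.
REQUIRED_FIELDS = ("system", "question", "chosen", "rejected")


def write_preference_jsonl(path, records):
    """Write one JSON object per line, checking the assumed fields are present."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as handle:
        for index, record in enumerate(records):
            missing = [name for name in REQUIRED_FIELDS if name not in record]
            if missing:
                raise ValueError(f"record {index} is missing fields: {missing}")
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


# Hypothetical sample row, for illustration only.
sample_records = [
    {
        "system": "You are a helpful assistant.",
        "question": "What is 2 + 2?",
        "chosen": "2 + 2 equals 4.",
        "rejected": "2 + 2 equals 5.",
    }
]

# Written to a temporary directory here; point this at your own data path.
output_path = write_preference_jsonl(
    Path(tempfile.mkdtemp()) / "orca_rlhf.jsonl", sample_records
)
```

Each line of the resulting file is a standalone JSON object, which is the layout that `ds_type: json` with a `.jsonl` entry in `data_files` is expected to load.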
#### TRL auto-unwrap for PEFT
TRL supports auto-unwrapping PEFT models, so a separate reference model does not need to be loaded, which reduces VRAM usage. This is on by default. To turn it off, pass the following config.