misc sharegpt fixes (#723)

* support sharegpt conversations where the assistant talks first, improve masking of the assistant token, and allow remapping of roles from the dataset

* invalid role is actually not possible

* update tokenized fixture for corrected labels
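
As a rough illustration of the ideas named in the bullets above (role remapping, a conversation that opens with the assistant, and label masking), here is a minimal sketch. It is not the code from this commit: `ROLE_REMAP`, `toy_tokenize`, and `build_example` are hypothetical names, the tokenizer is a stand-in that emits one fake id per word, and the only outside fact assumed is that `-100` is the default ignored label value for PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss

# Hypothetical remap from dataset-specific role names to canonical ones.
ROLE_REMAP = {"user": "human", "assistant": "gpt", "bot": "gpt"}


def toy_tokenize(text):
    # Stand-in tokenizer (assumption): one fake token id per word,
    # equal to the word's length.
    return [len(word) for word in text.split()]


def build_example(conversations, role_remap=ROLE_REMAP):
    """Concatenate turns and mask every token not spoken by the assistant."""
    input_ids, labels = [], []
    for turn in conversations:
        # Unknown roles pass through unchanged; known ones are remapped.
        role = role_remap.get(turn["from"], turn["from"])
        ids = toy_tokenize(turn["value"])
        input_ids.extend(ids)
        if role == "gpt":
            labels.extend(ids)  # train on assistant tokens
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # mask everything else
    return {"input_ids": input_ids, "labels": labels}


# The conversation starts with the assistant, which the loop handles
# without assuming a human turn comes first.
example = build_example([
    {"from": "assistant", "value": "Hello there"},
    {"from": "user", "value": "Hi"},
    {"from": "bot", "value": "How can I help"},
])
```

Because the loop decides masking per turn from the remapped role, it makes no assumption about which speaker opens the conversation.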
Wing Lian
2023-10-13 11:04:39 -04:00
committed by GitHub
parent bfbdba8614
commit f30afe4544
4 changed files with 107 additions and 36 deletions
