* clean up dpo to be a little more extensible, add zephyr/nectar strategy
* fix eos slash
* support for eval split
* fix kwargs
* handle empty evals
* don't load peft model for dpo
* ensure dpo training args get bf16 for peft if applicable
* fix duplicate kwargs for bf16
* make sure to respect the configured lr scheduler
* support trainer callback to push config to wandb
* set dataloader preload args
* ensure that we are loading the lora when merging
* Update src/axolotl/utils/data.py
Co-authored-by: Agus <agustin.piqueres@gmail.com>
* support local datasets for dpo
Co-authored-by: Agus <agustin.piqueres@gmail.com>
* chore: lint
* dpo/kto/ipo smoke tests w lora, simplify dpo dataset type names
* add split to dpo tests
* fix rebase/merging error
* handle edge case w logging
* use accelerator for dpo datasets so it doesn't break the logger
* missing args
* validate checkpoint is an adapter for now
* log warning when dataset strategy is not loadable
---------
Co-authored-by: Agus <agustin.piqueres@gmail.com>