* Implementing a basic chat_template strategy for DPO datasets This mimics the sft chat_template strategy such that users can: * Specify the messages field * Specify the per message role and content fields * speicfy the chosen and rejected fields * Let the tokenizer construct the raw prompt * Ensure the chosen and rejected fields don't have any prefix tokens * Adding additional dpo chat template unittests * Rename test class
Llama-3
https://llama.meta.com/llama3/
- Full Fine Tune
- Single GPU @ 48GB VRAM
- LoRA
- Single GPU @ 11GB VRAM
- QLORA+FSDP
- Dual GPU @ 21GB VRAM