axolotl

Author	SHA1	Message	Date
Wing Lian	f243c2186d	RL/DPO (#935) * ipo-dpo trainer * fix missing abstract method * chatml template, grad checkpointing kwargs support * fix steps calc for RL and add dataloader kwargs * wip to fix dpo and start ppo * more fixes * refactor to generalize map fn * fix dataset loop and handle argilla pref dataset * set training args * load reference model on separate gpu if more than one device * no auto upload to hub for dpo, don't add lora adapters to ref model for dpo * fixes for rl training * support for ipo from yaml * set dpo training args from the config, add tests * chore: lint * set sequence_len for model in test * add RLHF docs	2024-01-04 18:22:55 -05:00
NanoCode012	9f7e8a971d	feat(doc): add dummyoptim faq fix (#802)	2023-10-29 23:06:06 +09:00
Wing Lian	a21935f07a	add to docs (#703)	2023-10-19 21:32:30 -04:00
Wing Lian	2aa1f71464	fix pytorch 2.1.0 build, add multipack docs (#722)	2023-10-13 08:57:28 -04:00
Maxime	c1382e79b6	Create multi-node.md (#613) * Create multi-node.md * Update multi-node.md * Update multi-node.md	2023-09-20 22:02:16 -04:00
The Objective Dad	5e2d8a42d9	Adding NCCL Timeout Guide (#536) * fixes NCCL_P2P_LEVEL=NVL #429 * adding more insights into various values of NCCL_P2P_LEVEL	2023-09-08 11:57:47 -04:00