axolotl

Author	SHA1	Message	Date
salman	54dd7abfc1	Process reward models (#2241) * adding model_cfg to set num_labels * using a num_labels field instead * linting * WIP stepwise prompt tokenizer * this should work? * trainer working? * pushing to runpod * fixing saving * updating conf * updating config, adding docs * adding stepwise supervision docpage * updating tests * adding test for dataset * fixing tests * linting * addressing some comments * adding additional cfg fields support * updating tests, fixing cfg * fixing tests * updating loss * Update test_process_reward_model_smollm2.py * updating loss values and seed * dumb pre-commit	2025-01-29 00:08:33 -05:00
Sunny Liu	1c14c4a15c	Add hub model id config options to all example yml files (#2196) [skip ci] * added hub model_id in example yml * add hub model id to example yml	2024-12-17 11:24:30 -05:00
NanoCode012	8c3a727f9d	feat: update yml chat_template to specify dataset field (#2001) [skip ci] * feat: update yml chat_template to specify dataset field * feat: replace sharegpt references with chat_template	2024-10-29 10:26:03 -04:00
Wing Lian	68b1369de9	Reward model (#1879)	2024-10-13 15:11:13 -04:00
Wing Lian	5370cedf0c	support for gemma2 w sample packing (#1718)	2024-06-29 01:38:55 -04:00