Commit Graph

3 Commits

Author SHA1 Message Date
Wing Lian
1447beb132 make sure to validate the config before normalizing so defaults get set (#2554)
* make sure to validate the config before normalizing so defaults get set

* validation not needed for particular test

* remove duplicate validations

* set qlora correctly
2025-04-24 13:01:43 -04:00
Wing Lian
cf17649ef3 Misc fixes 20250130 (#2301)
* misc fixes for garbage collection and L40S w NCCL P2P

* patch bnb fix for triton check

* chore: lint

* change up import

* try patching differently

* remove patch for bnb fix for now

* more verbose checks and tweak train loss threshold
2025-01-31 08:58:04 -05:00
salman
54dd7abfc1 Process reward models (#2241)
* adding model_cfg to set num_labels

* using a num_labels field instead

* linting

* WIP stepwise prompt tokenizer

* this should work?

* trainer working?

* pushing to runpod

* fixing saving

* updating conf

* updating config, adding docs

* adding stepwise supervision docpage

* updating tests

* adding test for dataset

* fixing tests

* linting

* addressing some comments

* adding additional cfg fields support

* updating tests, fixing cfg

* fixing tests

* updating loss

* Update test_process_reward_model_smollm2.py

* updating loss values and seed

* dumb pre-commit
2025-01-29 00:08:33 -05:00