11 Commits

Author SHA1 Message Date
Wing Lian
5f79b8242f new evals_per_epoch and saves_per_epoch to make things cleaner (#944)
* new evals_per_epoch and saves_per_epoch to make things cleaner

* update per PR feedback
2023-12-12 15:35:23 -05:00
NanoCode012
a1da39cd48 Feat(wandb): Refactor to be more flexible (#767)
* Feat: Update to handle wandb env better

* chore: rename wandb_run_id to wandb_name

* feat: add new recommendation and update config

* fix: indent and pop disabled env if project passed

* feat: test env set for wandb and recommendation

* feat: update to use wandb_name and allow id

* chore: add info to readme
2023-12-04 22:17:25 +09:00
Wing Lian
f544ab2bed don't compile deepspeed or bitsandbytes from source (#837)
2023-11-08 19:49:55 -05:00
Wing Lian
8b79ff0e94 fix eval_steps to be a sane default (#797)
* fix eval_steps to be a sane default

* update docs for fractional eval_steps
2023-10-27 22:36:30 -04:00
Wing Lian
9b43e7ea15 disable eval table w sample packing in examples (#778)
2023-10-23 09:18:44 -04:00
Wing Lian
2d8def68dc simplify by removing duplicate base_model_config (#772)
2023-10-23 01:42:38 -04:00
mhenrichsen
f91db198f3 fix unneeded space (#699)
2023-10-07 14:19:25 -04:00
mhenrichsen
83a950bb87 lint
2023-10-07 11:04:35 +02:00
mhenrichsen
4c8ddf2c6f new lr, sample pack
2023-10-06 22:58:13 +02:00
Wing Lian
e50a64e85e prepared dataset caching, other misc fixes (#665)
* prepared dataset caching, other misc fixes

* also don't load from disk cache unless explicit
2023-10-02 21:07:24 -04:00
NanoCode012
eb41f76f92 Feat: Add example for Mistral (#644)
* Feat: Add example for Mistral

* chore: turn off flash

* chore: add is_mistral_derived_model

* chore: update following PR
2023-09-28 20:15:00 +09:00