Wing Lian
5f79b8242f
new evals_per_epoch and saves_per_epoch to make things cleaner ( #944 )
...
* new evals_per_epoch and saves_per_epoch to make things cleaner
* update per PR feedback
2023-12-12 15:35:23 -05:00
NanoCode012
a1da39cd48
Feat(wandb): Refactor to be more flexible ( #767 )
...
* Feat: Update to handle wandb env better
* chore: rename wandb_run_id to wandb_name
* feat: add new recommendation and update config
* fix: indent and pop disabled env if project passed
* feat: test env set for wandb and recommendation
* feat: update to use wandb_name and allow id
* chore: add info to readme
2023-12-04 22:17:25 +09:00
Wing Lian
f544ab2bed
don't compile deepspeed or bitsandbytes from source ( #837 )
2023-11-08 19:49:55 -05:00
Wing Lian
8b79ff0e94
fix eval_steps to be a sane default ( #797 )
...
* fix eval_steps to be a sane default
* update docs for fractional eval_steps
2023-10-27 22:36:30 -04:00
Wing Lian
2d8def68dc
simplify by removing duplicate base_model_config ( #772 )
2023-10-23 01:42:38 -04:00
Wing Lian
e50a64e85e
prepared dataset caching, other misc fixes ( #665 )
...
* prepared dataset caching, other misc fixes
* also don't load from disk cache unless explicit
2023-10-02 21:07:24 -04:00
mhenrichsen
4fecbfe5e1
default model changed
2023-09-24 18:52:53 +02:00
Wing Lian
343714972b
recommend padding when using sample packing ( #531 )
2023-09-06 17:00:21 -04:00
Charles O. Goddard
fe4d6baf92
Add example Llama 2 ReLoRA config ( #471 )
...
* Add example Llama 2 ReLoRA config
* Use adamw_bnb_8bit in example relora config
2023-08-27 10:08:34 +09:00