Wing Lian
2d8def68dc
simplify by removing duplicate base_model_config ( #772 )
2023-10-23 01:42:38 -04:00
Wing Lian
e50a64e85e
prepared dataset caching, other misc fixes ( #665 )
...
* prepared dataset caching, other misc fixes
* also don't load from disk cache unless explicit
2023-10-02 21:07:24 -04:00
Wing Lian
62eaee7649
make phi training work with Loras ( #588 )
...
* valdiation for phi loras
* fix model config class check
* update readme for phi traiing
2023-09-15 20:51:55 -04:00
Wing Lian
12a2dbbc2c
Support Sample packing for phi arch ( #586 )
...
* phi sequence packing
* sample packing fixes
* fix linting
* fix inference and phi e2e tests
* update phi example now that sample packing works
* wandb import keeps getting moved around
2023-09-15 15:46:54 -04:00
Wing Lian
228420972e
Phi examples ( #569 )
...
* add phi full ft example
* Add readme to point out that deepspeed should be used
* zero1 is better than zero2 for phi
2023-09-14 11:17:47 -04:00