Wing Lian
343714972b
recommend padding when using sample packing ( #531 )
2023-09-06 17:00:21 -04:00
Wing Lian
3355706e22
Add support for GPTQ using native transformers/peft ( #468 )
...
* auto gptq support
* more tweaks and add yml
* remove old gptq docker
* don't need explicit peft install for tests
* fix setup.py to use extra index url
install torch for tests
fix cuda version for autogptq index
set torch in requirements so that it installs properly
move gptq install around to work with github cicd
* gptq doesn't play well with sample packing
* address pr feedback
* remove torch install for now
* set quantization_config from model config
* Fix the implementation for getting quant config from model config
2023-09-05 12:43:22 -04:00
Charles O. Goddard
fe4d6baf92
Add example Llama 2 ReLoRA config ( #471 )
...
* Add example Llama 2 ReLoRA config
* Use adamw_bnb_8bit in example relora config
2023-08-27 10:08:34 +09:00
Wing Lian
1687be6a35
don't use mask expansion for inference ( #392 )
2023-08-14 20:52:54 -04:00
mhenrichsen
fdffef5940
new llama-2 default settings ( #370 )
...
* new default settings
* fix whitespace
* rm max packed sequence length
---------
Co-authored-by: Mads Henrichsen <mads@BrbartiendeMads.lan >
2023-08-14 17:39:09 +09:00
Morgan McGuire
7019509daa
Add wandb_entity to wandb options, update example configs, update README ( #361 )
...
* Update wandb_entity and add wandb descriptions
* add wandb to config section
* remove trailing whitespace for pre-commit hook
* remove trailing whitespace for pre-commit hook
---------
Co-authored-by: Morgan McGuire <morganmcguire@Morgans-MacBook-Pro.local >
Co-authored-by: Wing Lian <wing.lian@gmail.com >
2023-08-12 12:17:11 -04:00
Aman Karmani
36fefcf94b
set group_by_length to false in examples
2023-08-06 23:59:09 -07:00
mhenrichsen
dc71d8872a
feat/llama-2 examples ( #319 )
...
* qlora llama-2
* qlora llama-2
* linting
* readme
* lora added
* linting
* change group_by_length
* 13b fitting on 24gb
* grouped lengths true
* add pad token
* change out dir
---------
Co-authored-by: Mads Henrichsen <mads@Brbar-tilhrende-Mads.local >
2023-08-03 19:22:48 +09:00