mhenrichsen
4fecbfe5e1
default model changed
2023-09-24 18:52:53 +02:00
Wing Lian
faecff9798
support to disable exllama for gptq ( #604 )
...
* support to disable exllama for gptq
* update property instead of item
* fix config key
2023-09-19 17:51:08 -04:00
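The log above does not name the config key the PR introduced; a hypothetical sketch of how disabling exllama might look in an axolotl YAML config, assuming a key of `gptq_disable_exllama` (the exact name may differ from the merged implementation):

```yaml
# Hypothetical sketch for #604; the key name is an assumption, not confirmed by the log.
gptq: true                   # load the model with GPTQ quantization
gptq_disable_exllama: true   # fall back to the non-exllama GPTQ kernels
```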
Wing Lian
674c57692d
more sane defaults for openllama 3b used for quickstarts ( #602 )
...
* more sane defaults for openllama 3b used for quickstarts
* don't use bf16 for quickstart to simplify gpu compatibility
* use the updated openlm-research/open_llama_3b_v2 models
2023-09-19 09:15:10 -04:00
Wing Lian
6b9b229356
btlm and falcon monkey patches for flash attn ( #566 )
2023-09-17 13:49:18 -04:00
Wing Lian
62eaee7649
make phi training work with Loras ( #588 )
...
* validation for phi loras
* fix model config class check
* update readme for phi training
2023-09-15 20:51:55 -04:00
Wing Lian
12a2dbbc2c
Support Sample packing for phi arch ( #586 )
...
* phi sequence packing
* sample packing fixes
* fix linting
* fix inference and phi e2e tests
* update phi example now that sample packing works
* wandb import keeps getting moved around
2023-09-15 15:46:54 -04:00
Doan Minh Phuong
1aa400721e
Fix Codellama examples ( #582 )
...
* Fix seq_len
* Update lora.yml
* Update qlora.yml
* Update lora.yml
* Update lora.yml
* Update qlora.yml
2023-09-15 04:19:13 -04:00
Wing Lian
228420972e
Phi examples ( #569 )
...
* add phi full ft example
* Add readme to point out that deepspeed should be used
* zero1 is better than zero2 for phi
2023-09-14 11:17:47 -04:00
Glavin Wiechert
5b67ea98a6
Add training callback to send predictions to WandB table ( #521 )
...
* WIP Add training callback to send predictions to WandB table
* WIP improve wandb table reporting callback
* WIP improve wandb table reporting callback (cont)
* Add VSCode launching for debugging
* Add tiny llama example
* WIP attempt to improve post-eval prediction generation for table
* WIP attempt to improve post-eval prediction generation for table - part 2
* WIP batch generation
* WIP attempt to handle sample_packing using position_ids for wandb prediction table
* WIP add code for debugging
* Fix sample_packing support for wandb prediction table
* Clean up code for PR review
* Add eval_table_size, eval_table_max_new_tokens configs & clean up code
* Clean up PR, delete VSCode config, add tiny-llama example
* Add eval_table_size, eval_table_max_new_tokens documentation. Fix linting/formatting
2023-09-13 09:51:08 -04:00
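The `eval_table_size` and `eval_table_max_new_tokens` keys named in the bullets above are config options; a minimal sketch of how they might sit in an axolotl YAML config (values and the project name are illustrative):

```yaml
# Sketch of the WandB prediction-table options from #521; values are illustrative.
wandb_project: my-project        # assumed project name
eval_table_size: 5               # number of eval predictions to log to the table
eval_table_max_new_tokens: 128   # max tokens generated per logged prediction
```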
Wing Lian
343714972b
recommend padding when using sample packing ( #531 )
2023-09-06 17:00:21 -04:00
Wing Lian
3355706e22
Add support for GPTQ using native transformers/peft ( #468 )
...
* auto gptq support
* more tweaks and add yml
* remove old gptq docker
* don't need explicit peft install for tests
* fix setup.py to use extra index url
  install torch for tests
  fix cuda version for autogptq index
  set torch in requirements so that it installs properly
  move gptq install around to work with github cicd
* gptq doesn't play well with sample packing
* address pr feedback
* remove torch install for now
* set quantization_config from model config
* Fix the implementation for getting quant config from model config
2023-09-05 12:43:22 -04:00
Birch-san
8e197f6fb4
pad_to_worst_case_seq_len boolean, for testing memory limits ( #498 )
...
* pad_to_worst_case_seq_len boolean, for testing memory limits
* remove collator_pad_to_longest option since it does nothing
  see docs: https://huggingface.co/docs/transformers/main_classes/data_collator#transformers.DataCollatorWithPadding.padding
  True and "longest" mean the same thing
* rename to `pad_to_sequence_len`, and ensure 64 alignment
---------
Co-authored-by: Aman Karmani <aman@tmm1.net>
2023-08-28 18:47:16 -04:00
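Per the bullets above, `pad_to_sequence_len` replaces the earlier `pad_to_worst_case_seq_len` flag; a hedged YAML sketch (the sequence length value is illustrative):

```yaml
# Sketch based on #498: pad every batch to the configured sequence length
# to probe worst-case memory use; lengths are aligned to multiples of 64.
sequence_len: 2048          # illustrative value
pad_to_sequence_len: true
```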
mhenrichsen
35130711d6
Feat(cfg): Add code-llama configs for all sizes ( #479 )
...
* configs for all sizes
* update tokenizer type
---------
Co-authored-by: mhenrichsen <some_email@hey.com>
2023-08-27 10:20:17 +09:00
Charles O. Goddard
fe4d6baf92
Add example Llama 2 ReLoRA config ( #471 )
...
* Add example Llama 2 ReLoRA config
* Use adamw_bnb_8bit in example relora config
2023-08-27 10:08:34 +09:00
Wing Lian
cb9797ef5a
improve llama pad token handling ( #475 )
...
* improve llama pad token handling
* tweak logic to not clobber
2023-08-24 13:20:35 -04:00
Wing Lian
1687be6a35
don't use mask expansion for inference ( #392 )
2023-08-14 20:52:54 -04:00
mhenrichsen
fdffef5940
new llama-2 default settings ( #370 )
...
* new default settings
* fix whitespace
* rm max packed sequence length
---------
Co-authored-by: Mads Henrichsen <mads@BrbartiendeMads.lan>
2023-08-14 17:39:09 +09:00
Morgan McGuire
7019509daa
Add wandb_entity to wandb options, update example configs, update README ( #361 )
...
* Update wandb_entity and add wandb descriptions
* add wandb to config section
* remove trailing whitespace for pre-commit hook
* remove trailing whitespace for pre-commit hook
---------
Co-authored-by: Morgan McGuire <morganmcguire@Morgans-MacBook-Pro.local>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-08-12 12:17:11 -04:00
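The `wandb_entity` option added above sits alongside the existing `wandb_project` key (which also appears later in this log); a short sketch with illustrative values:

```yaml
# Sketch of the wandb options touched in #361; key names are from the log,
# values are illustrative.
wandb_project: axolotl-runs   # assumed project name
wandb_entity: my-team         # team/user namespace the runs are logged under
```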
Aman Karmani
36fefcf94b
set group_by_length to false in examples
2023-08-06 23:59:09 -07:00
mhenrichsen
dc71d8872a
feat/llama-2 examples ( #319 )
...
* qlora llama-2
* qlora llama-2
* linting
* readme
* lora added
* linting
* change group_by_length
* 13b fitting on 24gb
* grouped lengths true
* add pad token
* change out dir
---------
Co-authored-by: Mads Henrichsen <mads@Bærbar-tilhørende-Mads.local>
2023-08-03 19:22:48 +09:00
Ethan Smith
38811434e6
Add XGen info to README and example config
2023-07-21 00:44:50 -07:00
Steffen Röcker
945c4191a3
Use AutoTokenizer for redpajama example
2023-06-14 20:09:26 +02:00
Wing Lian
16bb6276a5
Merge pull request #92 from OpenAccess-AI-Collective/flash-optimum
...
add support for optimum bettertransformers
2023-06-14 07:50:15 -04:00
Wing Lian
fd2c9814c9
Merge branch 'main' into flash-optimum
2023-06-12 13:12:15 -04:00
Wing Lian
2ba4ae8f46
tweak config to work
2023-06-12 10:07:18 -04:00
Wing Lian
94f310c7a6
Merge pull request #193 from OpenAccess-AI-Collective/config-fixes-20230612
...
config fixes
2023-06-12 08:24:52 -04:00
NanoCode012
52cde69288
Fix config path after config moved
2023-06-12 17:06:15 +09:00
Wing Lian
9a58e99e81
config fixes
2023-06-12 01:52:58 -04:00
Wing Lian
6b3f509d9e
forgot to add this file
2023-06-11 11:50:12 -04:00
Wing Lian
d0d7eaa4f3
update openllama and clean up paths
2023-06-11 11:03:31 -04:00
Wing Lian
effbbf6dd1
more pruning
2023-06-11 10:38:24 -04:00
Wing Lian
c530e4b9c8
more config pruning and migrating
2023-06-11 10:09:05 -04:00
Wing Lian
77762a5d6b
get rid of some configs, formalize pythia lora config
2023-06-11 09:41:41 -04:00
Wing Lian
0c6f928601
address PR feedback
2023-06-10 14:23:56 -04:00
Wing Lian
1db46a9c72
linting fix
2023-06-10 14:23:56 -04:00
Wing Lian
39619028a3
use pythia-12b, neox-20b is flaky
2023-06-10 14:22:30 -04:00
NanoCode012
c8242de725
Merge pull request #132 from utensil/falcon-7b-qlora
...
Axolotl supports falcon + qlora
2023-06-09 01:14:03 +09:00
Utensil
79a8f52181
Trim trailing whitespace
2023-06-08 23:48:57 +08:00
Utensil
a52f4816b0
Default wandb_project to empty as suggested
...
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-06-08 23:04:19 +08:00
Utensil
c9c050316f
Default micro_batch_size to 1 for a safer start
2023-06-03 17:26:33 +08:00
Utensil
ca11ae9689
Add comments/alternatives for falcon-qlora configs
2023-06-03 15:04:02 +08:00
Utensil
fb3d40f197
falcon + qlora + xformer mbs 40 gas 2 on A6000
2023-06-01 18:29:20 +08:00
Utensil
72bf8aafb6
Create config-7b-qlora.yml
2023-06-01 00:00:37 +08:00
Wing Lian
c2a0792680
swap batch size for gradient accumulation steps to decouple from num gpu
2023-05-31 09:38:12 -04:00
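The decoupling described above works because the effective batch size is the product of the per-GPU micro batch, the accumulation steps, and the GPU count; a hedged YAML sketch (values are illustrative):

```yaml
# Sketch of the decoupling described above:
#   effective_batch = micro_batch_size * gradient_accumulation_steps * num_gpus
# so fixing the first two makes configs portable across GPU counts.
micro_batch_size: 1              # per-GPU batch (the safer default from a later commit)
gradient_accumulation_steps: 4   # illustrative value
```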
Wing Lian
4df9da74e3
Merge pull request #105 from viktoriussuwandi/viktoriussuwandi-patch
...
Viktoriussuwandi patch
2023-05-30 15:05:23 -04:00
Wing Lian
2531ea24c1
Merge pull request #106 from fearnworks/qlora-openllama-3b-example
...
Qlora openllama 3b example
2023-05-30 15:05:05 -04:00
NanoCode012
392dfd9b07
Lint and format
2023-05-31 02:53:22 +09:00
jphillips
6cee881d64
Update examples/qlora-openllama-3b/README.md
...
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-05-30 09:33:33 -05:00
jphillips
ac85c0ed36
Add Readme, Clean up comments
2023-05-29 14:35:58 -05:00
jphillips
370d057096
Add qlora-openllama-3b example
2023-05-29 09:07:46 -05:00