Wing Lian
40a6362c92
support for mamba ( #915 )
...
* support for mamba
* more mamba fixes
* use fork for mamba kwargs fix
* grad checkpointing doesn't work
* fix extras for mamba
* mamba loss fix
* use fp32 and remove verbose logging
* mamba fixes
* fix collator for mamba
* set model_type on training_args
* don't save safetensors for mamba
* update mamba config to disable safetensor checkpoints, install for tests
* no evals for mamba tests
* handle save_pretrained
* handle unused safetensors arg
2023-12-09 12:10:41 -05:00
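A minimal sketch of a mamba config reflecting the bullets above; the model name is illustrative and `model_type` is assumed to follow mamba's own modeling class, so check the PR's example config for the exact keys:

```yaml
base_model: state-spaces/mamba-2.8b   # illustrative mamba checkpoint
model_type: MambaLMHeadModel          # assumed; mamba uses its own modeling class
gradient_checkpointing: false         # per the PR, grad checkpointing doesn't work
save_safetensors: false               # mamba checkpoints skip safetensors
bf16: false                           # the PR settles on fp32
fp16: false
```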
NanoCode012
a1da39cd48
Feat(wandb): Refactor to be more flexible ( #767 )
...
* Feat: Update to handle wandb env better
* chore: rename wandb_run_id to wandb_name
* feat: add new recommendation and update config
* fix: indent and pop disabled env if project passed
* feat: test env set for wandb and recommendation
* feat: update to use wandb_name and allow id
* chore: add info to readme
2023-12-04 22:17:25 +09:00
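The refactor settles on `wandb_name` (formerly `wandb_run_id`) while still allowing an explicit run id. A sketch of the resulting keys, values illustrative:

```yaml
wandb_project: my-project   # setting a project enables wandb logging
wandb_entity: my-team       # optional user or team
wandb_name: llama-run-1     # renamed from wandb_run_id
wandb_run_id:               # optional explicit run id, still supported
```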
kallewoof
58ec8b1113
feature: loss watchdog for terminating training runs that are failing ( #899 )
...
Co-authored-by: Karl-Johan Alm <kalle@gmail.com>
2023-12-04 07:54:34 -05:00
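The watchdog aborts runs whose loss has blown up instead of letting them burn compute. A sketch assuming threshold/patience style keys; the names follow axolotl's convention but should be checked against the PR:

```yaml
loss_watchdog_threshold: 5.0   # assumed key: abort once loss exceeds this
loss_watchdog_patience: 3      # assumed key: ...for this many consecutive steps
```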
NanoCode012
a48dbf6561
fix: remove FA for qwen examples ( #900 )
...
* fix: remove FA for qwen lora
* fix: remove FA for qlora
2023-11-27 21:23:54 +09:00
NanoCode012
1115c501b8
Feat: Add Qwen ( #894 )
...
* Feat: Add Qwen
* feat: add qwen lora example
* feat: update matrix
* fix: add trust_remote_code
* fix: disable gradient checkpointing
* chore: add warning about gradient checkpointing
* fix: config
* fix: turn off sample packing for this example and reduce seq len
* chore: add comment on seq len
2023-11-26 00:05:01 +09:00
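Pulling the bullets above together, a sketch of the Qwen LoRA example's shape (model name and lengths illustrative):

```yaml
base_model: Qwen/Qwen-7B        # illustrative
trust_remote_code: true         # Qwen ships custom modeling code
adapter: lora
sequence_len: 2048              # reduced, since packing is off
sample_packing: false           # turned off for this example
gradient_checkpointing: false   # disabled per the warning above
```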
Wing Lian
9bf854e59c
Phi update 202311 ( #876 )
...
* add phi modeling from hf
* update for packing and use new modeling class for phi
* update e2e tests for phi to use new model name
* update example phi to also use new phi model name
* use AutoModelForCausalLM for phi lora since sample packing isn't supported
2023-11-17 12:47:17 -05:00
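Per the last bullet, the phi LoRA example loads through `AutoModelForCausalLM` because the new packed phi modeling class doesn't support sample packing with LoRA. A sketch, model name illustrative:

```yaml
base_model: microsoft/phi-1_5      # illustrative
model_type: AutoModelForCausalLM   # lora path: no sample packing support
trust_remote_code: true
adapter: lora
sample_packing: false
```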
Wing Lian
14706504e3
various bugfixes ( #856 )
...
* various bugfixes
  - use latest tinyllama release
  - check if val_set_size is empty first
  - update sdp and xformers llama patches for updated upstream transformers
  - fix system prompt when no input
  - calculate total and total supervised tokens even when not sample packing
* add fix for when eval size is estimated to be too small
* should be len 1 for dataset length
* add catchall kwargs
2023-11-15 12:23:18 -05:00
Wing Lian
f544ab2bed
don't compile deepspeed or bitsandbytes from source ( #837 )
2023-11-08 19:49:55 -05:00
Wing Lian
8b79ff0e94
fix eval_steps to be a sane default ( #797 )
...
* fix eval_steps to be a sane default
* update docs for fractional eval_steps
2023-10-27 22:36:30 -04:00
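Fractional `eval_steps` leans on the transformers convention that a float below 1 is interpreted as a ratio of total training steps:

```yaml
eval_steps: 0.05   # evaluate every 5% of total steps
# eval_steps: 50   # an integer is still an absolute step interval
```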
Wing Lian
9b43e7ea15
disable eval table w sample packing in examples ( #778 )
2023-10-23 09:18:44 -04:00
Wing Lian
2d8def68dc
simplify by removing duplicate base_model_config ( #772 )
2023-10-23 01:42:38 -04:00
Casper
15d3a654bf
Implement fused modules ( #747 )
...
* MLP: Memory saving
* Remove RMSNorm restrictions
* Map packed weights to original
* FusedAttention module
* Simplify code
* Move fused modules
* Fix critical typo
* Split inplace
* Add FFT config
* Add validation of fused arguments
* Add fused arguments to config
* Update docs
* Fix validation logic
* Add fused modules to flash attn
* Only fuse during training
* Remove timing
* Formatting
* Formatting
* Formatting
* chore: lint
* chore: lint
* add e2e tests for fused llama
* no lora for tests
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-10-21 16:08:25 -04:00
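A sketch of the fused-module knobs this PR introduces, assuming the `flash_attn_fuse_*` key names; fusing only happens during training, per the bullets:

```yaml
flash_attention: true
flash_attn_fuse_qkv: true   # assumed key: fuse q/k/v projections into one matmul
flash_attn_fuse_mlp: true   # assumed key: fuse the MLP projections
```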
atgctg
ace70b33c6
Fix: lowercase True values in config ( #713 )
...
* Fix: lowercase `True` values in config
* Fix: lowercase `True` values in config
2023-10-10 21:32:20 +09:00
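The fix normalizes Python-style booleans in the example configs to YAML's lowercase form:

```yaml
# before: load_in_8bit: True   (Python-style, normalized away by this fix)
load_in_8bit: true
```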
lukemarsden
295b2662e1
Get qlora mistral-7b fine tuning working on a single 4090 ( #708 )
2023-10-10 15:14:23 +09:00
mhenrichsen
f91db198f3
fix unneeded space ( #699 )
2023-10-07 14:19:25 -04:00
mhenrichsen
83a950bb87
lint
2023-10-07 11:04:35 +02:00
mhenrichsen
4c8ddf2c6f
new lr, sample pack
2023-10-06 22:58:13 +02:00
NanoCode012
669f1d052c
Fix: Higher vram usage for mistral and sample_packing ( #691 )
...
* Fix: Higher vram usage for mistral and sample_packing
* chore: update comment
* chore: lint
2023-10-06 12:33:43 -04:00
Abhishek Mishra
d4a88e4eca
Adding qlora config for Mistral ( #675 )
...
* Adding qlora config for Mistral
Contains a fix for the Mistral FA issue: `ValueError: You are attempting to perform batched generation with padding_side='right' this may lead to unexpected behaviour for Flash Attention version of Mistral. Make sure to call tokenizer.padding_side = 'left' before tokenizing the input.`
The fix for now is to set `sample_packing: true` and `pad_to_sequence_len: true`.
* Renamed to qlora.yml
2023-10-06 21:05:56 +09:00
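The workaround above, expressed as config (model name illustrative):

```yaml
base_model: mistralai/Mistral-7B-v0.1
adapter: qlora
load_in_4bit: true
sample_packing: true        # sidesteps the padding_side='right' ValueError
pad_to_sequence_len: true   # required alongside sample_packing here
```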
Wing Lian
e50a64e85e
prepared dataset caching, other misc fixes ( #665 )
...
* prepared dataset caching, other misc fixes
* also don't load from disk cache unless explicit
2023-10-02 21:07:24 -04:00
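A sketch of the caching knob involved, assuming `dataset_prepared_path` is the key in play; the second bullet makes loading from that cache explicit rather than automatic:

```yaml
dataset_prepared_path: last_run_prepared   # write tokenized data here for reuse
# leave unset to skip the disk cache entirely
```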
Adarsh Shirawalmath
b88f51512a
Update mistral/README.md ( #647 )
2023-09-28 10:24:56 -04:00
NanoCode012
eb41f76f92
Feat: Add example for Mistral ( #644 )
...
* Feat: Add example for Mistral
* chore: turn off flash
* chore: add is_mistral_derived_model
* chore: update following PR
2023-09-28 20:15:00 +09:00
Wing Lian
d887ad86c3
eval_table isn't quite stable enough to be in default llama configs ( #637 )
2023-09-26 10:13:20 -04:00
NanoCode012
19a600a8b8
Feat: Add support for upstream FA2 ( #626 )
...
* Feat: Add support for upstream FA2
* chore: add is_falcon_derived_model: true to examples
* chore: add config to readme for documentation
* feat: add extra model types
* fix: remove old falcon flash patch
* chore: pin transformers and accelerate
2023-09-26 09:53:28 -04:00
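A sketch of the resulting config surface, using the `is_falcon_derived_model` flag named in the bullets:

```yaml
flash_attention: true           # now routed through upstream FA2
is_falcon_derived_model: true   # mark falcon-family checkpoints explicitly
```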
mhenrichsen
4fecbfe5e1
default model changed
2023-09-24 18:52:53 +02:00
Wing Lian
faecff9798
support to disable exllama for gptq ( #604 )
...
* support to disable exllama for gptq
* update property instead of item
* fix config key
2023-09-19 17:51:08 -04:00
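A sketch of the toggle, with the key name guessed from the PR title; verify against the config docs:

```yaml
gptq: true
gptq_disable_exllama: true   # assumed key: fall back from the exllama kernels
```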
Wing Lian
674c57692d
more sane defaults for openllama 3b used for quickstarts ( #602 )
...
* more sane defaults for openllama 3b used for quickstarts
* don't use bf16 for quickstart to simplify gpu compatibility
* use the updated openlm-research/open_llama_3b_v2 models
2023-09-19 09:15:10 -04:00
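The quickstart defaults sketched out; per the bullets, bf16 is avoided so older GPUs can run it:

```yaml
base_model: openlm-research/open_llama_3b_v2
bf16: false   # skipped for broader GPU compatibility
fp16: true
```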
Wing Lian
6b9b229356
btlm and falcon monkey patches for flash attn ( #566 )
2023-09-17 13:49:18 -04:00
Wing Lian
62eaee7649
make phi training work with Loras ( #588 )
...
* validation for phi loras
* fix model config class check
* update readme for phi training
2023-09-15 20:51:55 -04:00
Wing Lian
12a2dbbc2c
Support Sample packing for phi arch ( #586 )
...
* phi sequence packing
* sample packing fixes
* fix linting
* fix inference and phi e2e tests
* update phi example now that sample packing works
* wandb import keeps getting moved around
2023-09-15 15:46:54 -04:00
Doan Minh Phuong
1aa400721e
Fix Codellama examples ( #582 )
...
* Fix seq_len
* Update lora.yml
* Update qlora.yml
* Update lora.yml
* Update lora.yml
* Update qlora.yml
2023-09-15 04:19:13 -04:00
Wing Lian
228420972e
Phi examples ( #569 )
...
* add phi full ft example
* Add readme to point out that deepspeed should be used
* zero1 is better than zero2 for phi
2023-09-14 11:17:47 -04:00
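A sketch of the deepspeed hookup the readme recommends; the json path here is illustrative:

```yaml
deepspeed: deepspeed/zero1.json   # path illustrative; zero1 beat zero2 for phi
```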
Glavin Wiechert
5b67ea98a6
Add training callback to send predictions to WandB table ( #521 )
...
* WIP Add training callback to send predictions to WandB table
* WIP improve wandb table reporting callback
* WIP improve wandb table reporting callback (cont)
* Add VSCode launching for debugging
* Add tiny llama example
* WIP attempt to improve post-eval prediction generation for table
* WIP attempt to improve post-eval prediction generation for table - part 2
* WIP batch generation
* WIP attempt to handle sample_packing using position_ids for wandb prediction table
* WIP add code for debugging
* Fix sample_packing support for wandb prediction table
* Clean up code for PR review
* Add eval_table_size, eval_table_max_new_tokens configs & clean up code
* Clean up PR, delete VSCode config, add tiny-llama example
* Add eval_table_size, eval_table_max_new_tokens documentation. Fix linting/formatting
2023-09-13 09:51:08 -04:00
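The two keys called out in the bullets, with illustrative values:

```yaml
eval_table_size: 5               # prediction rows logged to the wandb table per eval
eval_table_max_new_tokens: 128   # generation budget for those rows
```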
Wing Lian
343714972b
recommend padding when using sample packing ( #531 )
2023-09-06 17:00:21 -04:00
Wing Lian
3355706e22
Add support for GPTQ using native transformers/peft ( #468 )
...
* auto gptq support
* more tweaks and add yml
* remove old gptq docker
* don't need explicit peft install for tests
* fix setup.py to use extra index url
  - install torch for tests
  - fix cuda version for autogptq index
  - set torch in requirements so that it installs properly
  - move gptq install around to work with github cicd
* gptq doesn't play well with sample packing
* address pr feedback
* remove torch install for now
* set quantization_config from model config
* Fix the implementation for getting quant config from model config
2023-09-05 12:43:22 -04:00
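A sketch of a GPTQ run under this change; the checkpoint name is illustrative, and the quantization config is picked up from the model's own config per the last bullets:

```yaml
base_model: TheBloke/Llama-2-7B-GPTQ   # illustrative quantized checkpoint
gptq: true
sample_packing: false   # per the PR, gptq doesn't play well with packing
```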
Birch-san
8e197f6fb4
pad_to_worst_case_seq_len boolean, for testing memory limits ( #498 )
...
* pad_to_worst_case_seq_len boolean, for testing memory limits
* remove collator_pad_to_longest option since it does nothing
  - see docs: https://huggingface.co/docs/transformers/main_classes/data_collator#transformers.DataCollatorWithPadding.padding
  - `True` and `"longest"` mean the same thing
* rename to `pad_to_sequence_len`, and ensure 64 alignment
---------
Co-authored-by: Aman Karmani <aman@tmm1.net>
2023-08-28 18:47:16 -04:00
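The renamed option in context; keeping `sequence_len` a multiple of 64 preserves the alignment the PR enforces:

```yaml
sequence_len: 2048          # a multiple of 64
pad_to_sequence_len: true   # pad every batch to full length, to probe memory limits
```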
mhenrichsen
35130711d6
Feat(cfg): Add code-llama configs for all sizes ( #479 )
...
* configs for all sizes
* update tokenizer type
---------
Co-authored-by: mhenrichsen <some_email@hey.com>
2023-08-27 10:20:17 +09:00
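A sketch of the per-size configs' common keys (the 7b variant shown; the other sizes swap the base model):

```yaml
base_model: codellama/CodeLlama-7b-hf   # illustrative; larger variants exist
tokenizer_type: CodeLlamaTokenizer      # the updated tokenizer type
```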
Charles O. Goddard
fe4d6baf92
Add example Llama 2 ReLoRA config ( #471 )
...
* Add example Llama 2 ReLoRA config
* Use adamw_bnb_8bit in example relora config
2023-08-27 10:08:34 +09:00
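A sketch of the ReLoRA-specific keys, names assumed from axolotl's relora support; values illustrative:

```yaml
adapter: lora
relora_steps: 150           # assumed key: merge-and-reset interval
relora_warmup_steps: 10     # assumed key: lr warmup after each reset
optimizer: adamw_bnb_8bit   # per the second bullet
```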
Wing Lian
cb9797ef5a
improve llama pad token handling ( #475 )
...
* improve llama pad token handling
* tweak logic to not clobber
2023-08-24 13:20:35 -04:00
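The "tweak logic to not clobber" bullet suggests the pad token is only set when the tokenizer lacks one; a sketch, token value illustrative:

```yaml
special_tokens:
  pad_token: "<pad>"   # illustrative; applied only if not already present
```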
Wing Lian
1687be6a35
don't use mask expansion for inference ( #392 )
2023-08-14 20:52:54 -04:00
mhenrichsen
fdffef5940
new llama-2 default settings ( #370 )
...
* new default settings
* fix whitespace
* rm max packed sequence length
---------
Co-authored-by: Mads Henrichsen <mads@BrbartiendeMads.lan>
2023-08-14 17:39:09 +09:00
Morgan McGuire
7019509daa
Add wandb_entity to wandb options, update example configs, update README ( #361 )
...
* Update wandb_entity and add wandb descriptions
* add wandb to config section
* remove trailing whitespace for pre-commit hook
* remove trailing whitespace for pre-commit hook
---------
Co-authored-by: Morgan McGuire <morganmcguire@Morgans-MacBook-Pro.local>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-08-12 12:17:11 -04:00
Aman Karmani
36fefcf94b
set group_by_length to false in examples
2023-08-06 23:59:09 -07:00
mhenrichsen
dc71d8872a
feat/llama-2 examples ( #319 )
...
* qlora llama-2
* qlora llama-2
* linting
* readme
* lora added
* linting
* change group_by_length
* 13b fitting on 24gb
* grouped lengths true
* add pad token
* change out dir
---------
Co-authored-by: Mads Henrichsen <mads@Bærbar-tilhørende-Mads.local>
2023-08-03 19:22:48 +09:00
Ethan Smith
38811434e6
Add XGen info to README and example config
2023-07-21 00:44:50 -07:00
Steffen Röcker
945c4191a3
Use AutoTokenizer for redpajama example
2023-06-14 20:09:26 +02:00
Wing Lian
16bb6276a5
Merge pull request #92 from OpenAccess-AI-Collective/flash-optimum
...
add support for optimum bettertransformers
2023-06-14 07:50:15 -04:00
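A sketch of the toggle, assuming the `flash_optimum` key name:

```yaml
flash_optimum: true   # assumed key: route through optimum's BetterTransformer
```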
Wing Lian
fd2c9814c9
Merge branch 'main' into flash-optimum
2023-06-12 13:12:15 -04:00
Wing Lian
2ba4ae8f46
tweak config to work
2023-06-12 10:07:18 -04:00
Wing Lian
94f310c7a6
Merge pull request #193 from OpenAccess-AI-Collective/config-fixes-20230612
...
config fixes
2023-06-12 08:24:52 -04:00