Commit Graph

292 Commits

Author SHA1 Message Date
Wing Lian
2bb0b78975 Attention mask and position id fixes for packing (#285)
* fix attention mask with packing

* set position ids and use block diagonal attn mask

* fix expand mask for multiple batch items, make sure we pad position_ids

* don't move masks to cpu

* use multi pack dataloader w random sampler

* add position_ids back

* more fixes for dataloader integration

* est total tokens, fix field loop

* more fixes, position_ids seems broken

* more fixes for sample packing

* use distributed sampler, avoid accelerate prepare

* use accelerator prepare for dataloader

* fix for position_ids w packing

* Update src/axolotl/utils/dataloader.py

* validation for sample packing and doc

* more fixes for 4k and optimizations

* optimized expand mask fn

* better handling of variance in multipack dataloader length and trainer hanging when it runs out of data

* fix rounding of len of batches to int

* better handling so that all devices have the same dataloader len

* fix step calc for packing

* pass sample packing efficiency to training args

* add a test for the mask expansion for sequence packing

* only process eval dataset for packing if not None

* don't split batches when packing

* weighted CE losses

* weighted CEL fixes

* limit packing to sequences of max seq len

* seq_len_multiple for packing

* make sure the chunk size is an int

* sample_packing_seq_len_multiplier config

* use cumulative seq len with var len flash attn v2 w packing

* properly calculate max len

* fix flash-attn, xformers, packing, support chatml

* fix chatml system prompt for openorca, legacy tokenizer opts

* add chatml

* add unit tests for cum seq lens, add ability to build cu_seq_lens from positional ids, fix prompt test

* fix test and pylint checks

* more packing and dataset optimizations and fixes

* filter w multiple cpus

* more fixes and optimizations

* fixes and go back to distributed sampler since batch sampler won't work

* fix counts by accounting for num devices

* fix steps calculation

* previous accelerate is still most performant

* add numba to requirements.

* use custom distributed checks

* fix sampler to prevent overfit w new epochs

* let's not cleanup the cached datasets

* calculate cum seq lens with pos_ids instead of mask, simplify packing params, fix distributed barrier (see the sketch after this commit)

* speed optimizations and set accelerate fsdp env vars

* optimize dataset concatenation?

* more optimizations for dataset handling

* fix import for annotation

* manual pre-commit fixes

* another sum optimization and bug fix for calc steps

* fix packing estimations

* fix formatting

* pylint problems

* add back flash attention branch for handling unpacked sequences separately

* Address PR feedback

* add optional sample packing config params to readme
2023-08-12 15:14:56 -04:00
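Several bullets in this commit describe deriving flash-attn's variable-length inputs (cu_seq_lens) from the packed position ids. Below is a minimal sketch of that idea for a single packed row, assuming a new sub-sequence starts wherever the position id resets to 0; the helper name is hypothetical and this is not the project's actual implementation.

```python
import torch

def cu_seqlens_from_position_ids(position_ids: torch.Tensor):
    """Build cumulative sequence lengths for flash-attn's varlen kernels
    from one packed row of position_ids. Hypothetical helper for illustration."""
    # a packed sub-sequence starts wherever the position id resets to 0
    starts = (position_ids == 0).nonzero(as_tuple=True)[0]
    ends = torch.cat([starts[1:], torch.tensor([position_ids.numel()])])
    seqlens = ends - starts
    cu_seqlens = torch.zeros(len(seqlens) + 1, dtype=torch.int32)
    cu_seqlens[1:] = seqlens.cumsum(0)
    return cu_seqlens, int(seqlens.max())

# two sequences of length 3 and 2 packed into one row
pos_ids = torch.tensor([0, 1, 2, 0, 1])
print(cu_seqlens_from_position_ids(pos_ids))  # (tensor([0, 3, 5], dtype=torch.int32), 3)
```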
Morgan McGuire
7019509daa Add wandb_entity to wandb options, update example configs, update README (#361)
* Update wandb_entity and add wandb descriptions

* add wandb to config section

* remove trailing whitespace for pre-commit hook

* remove trailing whitespace for pre-commit hook

---------

Co-authored-by: Morgan McGuire <morganmcguire@Morgans-MacBook-Pro.local>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-08-12 12:17:11 -04:00
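For context on what the wandb_entity option controls, a rough sketch of mapping wandb_* config keys onto the environment variables the Weights & Biases integration reads; the config dict and the exact mapping used by the trainer are assumptions.

```python
import os

# hypothetical wandb_* config keys, mirroring the options added in this PR
cfg = {"wandb_project": "axolotl-runs", "wandb_entity": "my-team"}

if cfg.get("wandb_project"):
    os.environ["WANDB_PROJECT"] = cfg["wandb_project"]
if cfg.get("wandb_entity"):
    os.environ["WANDB_ENTITY"] = cfg["wandb_entity"]  # team or user that owns the runs
```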
NanoCode012
96bd6ae1c4 Fix(model loading): Warn when model revision is passed to gptq (#364)
* fix(model loading): warn when model revision is passed to gptq

* chore: improve message
2023-08-13 01:16:59 +09:00
NanoCode012
e37d9358e6 Fix(message): Improve error message for bad format (#365) 2023-08-13 01:16:18 +09:00
NanoCode012
b5212068ac Feat: Add rope scaling (#343)
* Feat: Add rope scaling

* fix: move rope config
2023-08-13 00:50:15 +09:00
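As background on the RoPE scaling feature, a hedged sketch of setting it through the transformers config; the model id is a placeholder, and "linear" and "dynamic" are the scaling types transformers supports for Llama.

```python
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
# rope_scaling stretches the rotary position embeddings to extend context;
# factor=2.0 roughly doubles the usable sequence length
config.rope_scaling = {"type": "linear", "factor": 2.0}

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", config=config
)
```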
Aman Gupta Karmani
11ddccb80f Merge pull request #356 from tmm1/load_model-args
simplify `load_model` signature
2023-08-09 18:24:34 -07:00
Aman Karmani
718102271f simplify load_model signature 2023-08-09 22:36:02 +00:00
Aman Karmani
e303d64728 log GPU memory usage 2023-08-09 18:26:28 +00:00
Wing Lian
176b888a63 ensure enable_input_require_grads is called on model before getting the peft model (#345) 2023-08-06 18:13:10 -04:00
Jan Philipp Harries
3392270544 experimental llama 2 chat support (#296)
* experimental llama 2 chat support

* few small fixes

* llama2_chat

* small fix to follow original implementation

* small fixes and added fixtures/tests

* fix: mixed-up inference and finetuning conversations

* args - small fix

* small fix

* small adjustment and warning

* fix with pre-commit

---------

Co-authored-by: Jan Philipp Harries <jpdus@users.noreply.github.com>
2023-08-06 17:40:52 -04:00
ssmi153
10405b9995 Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) (#339)
* Fix XFormers attention for Llama-2 70B (GQA)

Updated the XFormers monkeypatch to handle GQA as used in Llama-2 70B. All the updated code is taken directly from the Transformers library (commit 07360b6c9c, modeling_llama.py).

* Catch configs without pretraining_tp

* Whitespace bug fix

Command had accidentally been moved out of if-else block.

* pre-commit formatting fixes

Thanks to @winglian
2023-08-06 11:09:04 -04:00
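The core of supporting GQA in an attention monkeypatch is repeating each key/value head so the count matches the query heads. A sketch mirroring the `repeat_kv` helper in Transformers' Llama modeling code, which this commit says it borrows from:

```python
import torch

def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Repeat each key/value head n_rep times so grouped-query models such as
    Llama-2 70B expose one KV head per query head."""
    batch, num_kv_heads, slen, head_dim = hidden_states.shape
    if n_rep == 1:
        return hidden_states
    # insert a repeat dimension, expand it, then fold it into the head dimension
    hidden_states = hidden_states[:, :, None, :, :].expand(
        batch, num_kv_heads, n_rep, slen, head_dim
    )
    return hidden_states.reshape(batch, num_kv_heads * n_rep, slen, head_dim)
```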
Jan Philipp Harries
c93655c0a3 Added Orca Mini prompt strategy (#263)
* added Orca Mini prompt strategy

* maybe this fixed precommit errors?

* pre-commits passing

---------

Co-authored-by: Jan Philipp Harries <jpdus@users.noreply.github.com>
2023-08-06 03:16:41 +09:00
Wing Lian
fe285430bc optimize the iteration when tokenizing large datasets (#332) 2023-08-04 12:12:05 -04:00
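A sketch of the kind of batched, multi-process `datasets.map` call that speeds up tokenizing large datasets; the data file, tokenizer, and column name are placeholders, not the project's actual code.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
ds = load_dataset("json", data_files="train.jsonl", split="train")

ds = ds.map(
    lambda batch: tokenizer(batch["text"]),
    batched=True,                   # tokenize many rows per call instead of one at a time
    num_proc=8,                     # spread the work across CPU cores
    remove_columns=ds.column_names, # keep only the tokenized fields
)
```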
Aman Karmani
2eda9e02a9 fix typo 2023-08-03 21:04:12 +00:00
Aman Karmani
78b9efb7f4 scope flash-attn+qlora fix correctly, scope to llama, add comment 2023-08-03 19:19:39 +00:00
Aman Karmani
312a9fad07 move flash-attn monkey patch alongside the others 2023-08-03 17:20:49 +00:00
Aman Karmani
248bf90f89 ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype 2023-08-02 20:15:03 +00:00
Wing Lian
77085ea24e qlora w flash attention fixes (#333) 2023-08-01 23:26:16 -04:00
Wing Lian
db2a3586f3 add peft install back since it doesn't get installed by setup.py (#331) 2023-07-31 16:31:53 -04:00
Wing Lian
3d4984b9a5 update prompts for open orca to match the paper (#317)
fix the test for the updated system tokenizer
2023-07-22 13:49:11 -04:00
Wing Lian
40a53ff181 Merge pull request #307 from OpenAccess-AI-Collective/xgen-user-sharegpt-tokens
better handling since xgen tokenizer breaks with convert_tokens_to_ids
2023-07-22 04:10:38 -04:00
Wing Lian
3ffb018a4c Merge pull request #313 from OpenAccess-AI-Collective/tokenizer-llama2-embeddings
don't resize embeddings to multiples of 32x by default
2023-07-22 04:09:59 -04:00
Wing Lian
1066751358 don't resize embeddings to multiples of 32x by default 2023-07-22 01:52:38 -04:00
Wing Lian
2a428e8014 better handling since xgen tokenizer breaks with convert_tokens_to_ids 2023-07-21 09:24:11 -04:00
Wing Lian
9b790d359b flash attention 2 2023-07-21 08:17:46 -04:00
Wing Lian
a032c9f452 fix sdp attention to use the flash/mem-efficient context manager 2023-07-20 01:05:48 -04:00
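The context manager referred to here is PyTorch's SDPA kernel selector. A minimal sketch using the PyTorch 2.0-era API (newer releases expose `torch.nn.attention.sdpa_kernel` instead); tensor shapes are illustrative only.

```python
import torch
import torch.nn.functional as F

q = k = v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# prefer the flash / memory-efficient kernels and disable the math fallback
with torch.backends.cuda.sdp_kernel(
    enable_flash=True, enable_math=False, enable_mem_efficient=True
):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```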
NanoCode012
45ac7c4f88 feat: use multi-core 2023-07-19 10:16:54 +09:00
Wing Lian
ebaec3c406 fix axolotl training args dataclass annotation 2023-07-17 04:57:02 -04:00
Wing Lian
d75adb9835 misc fixes 2023-07-17 03:00:27 -04:00
Wing Lian
6f16c4569d Merge pull request #276 from theobjectivedad/logging_enhancement
Logging update: added PID and formatting
2023-07-16 17:04:52 -04:00
theobjectivedad
b1f4f7a34d Fixed pre-commit problems, fixed small bug in logging_config to handle LOG_LEVEL env var 2023-07-15 12:29:35 +00:00
The Objective Dad
83237b8445 Merge branch 'OpenAccess-AI-Collective:main' into logging_enhancement 2023-07-15 06:16:04 -05:00
Charles Goddard
88089e8b32 Add ability to pass 'name' argument to load_dataset 2023-07-14 16:46:39 -07:00
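For reference, the `name` argument selects a dataset configuration when a dataset defines several; a small sketch with a placeholder dataset and config.

```python
from datasets import load_dataset

# "name" picks the dataset configuration (subset) to load
ds = load_dataset("glue", name="mrpc", split="train")
```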
NanoCode012
168a7a09cc Merge pull request #274 from OpenAccess-AI-Collective/NanoCode012-patch-2
Feat: Set push to hub as private by default
2023-07-14 23:15:47 +09:00
theobjectivedad
9234b75cb4 Update log message format; IMO this is easier to read. 2023-07-14 07:36:21 -05:00
theobjectivedad
553a86b52c Adding logging enhancement 2023-07-14 07:26:19 -05:00
NanoCode012
5491278a79 Feat: Add save_safetensors 2023-07-14 13:21:47 +09:00
NanoCode012
1514739f0f Set push to hub as private by default 2023-07-14 13:17:49 +09:00
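The two commits above correspond to standard `TrainingArguments` flags; a hedged sketch of that mapping (how the trainer wires them up internally is an assumption).

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./out",
    save_safetensors=True,   # write safetensors checkpoints instead of pytorch .bin
    push_to_hub=True,
    hub_private_repo=True,   # repos created on the Hub are private by default
)
```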
Wing Lian
69a235061b support for loading a model by git revision 2023-07-13 22:58:25 -04:00
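Loading by git revision pins the weights to a specific branch, tag, or commit on the Hugging Face Hub; a small sketch with placeholder model id and revision.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    revision="main",  # branch name, tag, or full commit SHA
)
```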
Wing Lian
c4cf567b55 Merge branch 'main' into quadratic-warmup 2023-07-10 12:42:12 -04:00
Wing Lian
c49729d2bc better configuration for quadratic warmup 2023-07-10 11:52:59 -04:00
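As a rough illustration of what a quadratic warmup schedule does, a sketch built on PyTorch's LambdaLR; the schedule shape is an assumption, not the project's implementation.

```python
import torch

def quadratic_warmup(warmup_steps: int):
    # LR ramps with the square of progress during warmup, then holds steady
    def lr_lambda(step: int) -> float:
        if step < warmup_steps:
            return (step / max(1, warmup_steps)) ** 2
        return 1.0
    return lr_lambda

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=2e-5)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, quadratic_warmup(100))
```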
Wing Lian
19cf0bda99 params are adam_*, not adamw_* 2023-07-08 12:13:39 -04:00
Wing Lian
d69da99c2c skip explicit model type too if using trust_remote_code 2023-07-07 21:33:11 -04:00
Wing Lian
66afb76a15 don't use llama if trust_remote_code is set since that needs to use AutoModel path 2023-07-07 21:31:02 -04:00
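The point of the two commits above is that models needing `trust_remote_code` must go through the Auto classes so their custom code is resolved, rather than the Llama-specific path. A hedged sketch of that branching; the wrapper function is a simplification, not the project's `load_model`.

```python
from transformers import AutoModelForCausalLM, LlamaForCausalLM

def load_base_model(base_model: str, trust_remote_code: bool = False):
    # custom architectures need the AutoModel path so remote code is loaded;
    # otherwise the explicit Llama class can be used
    if trust_remote_code:
        return AutoModelForCausalLM.from_pretrained(
            base_model, trust_remote_code=True
        )
    return LlamaForCausalLM.from_pretrained(base_model)
```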
Wing Lian
b9b7d4ce92 Merge pull request #221 from utensil/local_dataset
[WIP] Support loading data files from a local directory
2023-07-03 09:10:13 -04:00
NanoCode012
e79c8e617e Fix future deprecation push_to_hub_model_id 2023-07-03 12:44:29 +09:00
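The deprecation replaces `push_to_hub_model_id` with `hub_model_id`, which takes the full repo name; a small sketch with a placeholder repo.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./out",
    push_to_hub=True,
    hub_model_id="my-org/my-model",  # replaces the deprecated push_to_hub_model_id
)
```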
Wing Lian
1e5014acec Merge pull request #255 from OpenAccess-AI-Collective/open-orca-prompts
open orca support
2023-07-01 01:11:23 -04:00
Wing Lian
4066c78631 Merge pull request #246 from OpenAccess-AI-Collective/sys-prompts-instruct
add option for instruct w sys prompts
2023-07-01 00:27:29 -04:00
Wing Lian
78a1e1fa12 open orca support 2023-07-01 00:19:41 -04:00
NanoCode012
77bdb7d144 Fix typing list 2023-06-29 14:29:55 +09:00