Wing Lian
7771498eae
add gaussian dropout support
2023-09-25 14:50:39 -04:00
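Gaussian dropout, as referenced in the commit above, typically means replacing Bernoulli masking with multiplicative noise drawn from N(1, p/(1-p)). A minimal stdlib sketch of the idea (function name and signature are illustrative assumptions, not axolotl's actual implementation):

```python
import random

def gaussian_dropout(x, p=0.1, training=True, rng=None):
    """Multiplicative Gaussian noise with mean 1 and variance p/(1-p).

    Identity at inference time (training=False), mirroring standard dropout.
    """
    if not training or p <= 0:
        return list(x)
    rng = rng or random.Random()
    # Variance chosen to match Bernoulli dropout with keep prob (1 - p)
    sigma = (p / (1 - p)) ** 0.5
    return [v * rng.gauss(1.0, sigma) for v in x]
```

Because the noise has mean 1, activations are unbiased in expectation and no rescaling is needed at eval time.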
Fernando Tarin Morales
5e5296a77c
Added quotes to the pip install -e command to fix an incompatibility with shells that do glob expansion like zsh (#632)
2023-09-25 11:50:14 -04:00
mhenrichsen
f3d939016a
Merge pull request #629 from OpenAccess-AI-Collective/chore/-change-default-model
default model changed
2023-09-25 09:32:01 +02:00
NanoCode012
cfbce020e9
Fix: Fail bf16 check when running on cpu during merge (#631)
2023-09-25 13:48:18 +09:00
mhenrichsen
4fecbfe5e1
default model changed
2023-09-24 18:52:53 +02:00
NanoCode012
67b9888630
Feat(doc): Add eval_sample_packing to doc (#625)
2023-09-23 13:11:27 +09:00
Maxime
923eb91304
tweak: improve base builder for smaller layers (#500)
2023-09-22 16:17:50 -04:00
Wing Lian
a363604dcf
better handling and logging of empty sharegpt turns (#603)
2023-09-22 16:13:42 -04:00
Wing Lian
501958bb6f
create a model card with axolotl badge (#624)
2023-09-22 16:13:26 -04:00
Wing Lian
c25ba7939b
update README with deepspeed info (#605)
2023-09-22 00:15:52 -04:00
NanoCode012
d5f8589021
chore(callback): Remove old peft saving code (#510)
2023-09-22 12:31:33 +09:00
Wing Lian
03e59077a0
misc fixes to add gptq tests (#621)
* misc fixes to add gptq tests
* set bf16 needed for fa2
2023-09-21 21:52:12 -04:00
Wing Lian
97d3776ce6
split completion text to sequence_len (#616)
2023-09-21 21:51:25 -04:00
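Splitting completion text to sequence_len, as in the commit above, amounts to chunking the tokenized text into non-overlapping windows of at most sequence_len tokens. A rough sketch with hypothetical names (not the actual axolotl code):

```python
def split_to_sequence_len(token_ids, sequence_len):
    """Chunk a token list into non-overlapping windows of at most sequence_len.

    The final chunk may be shorter than sequence_len.
    """
    return [token_ids[i:i + sequence_len]
            for i in range(0, len(token_ids), sequence_len)]
```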
Wing Lian
2844eb22b6
run eval on the first step to get a baseline (#617)
* run eval on the first step to get a baseline
* wandb keeps getting moved around by pre-commit ...
2023-09-21 21:51:09 -04:00
Wing Lian
e85d2eb06b
let MAX_JOBS use the default since we're not resource constrained on our self-hosted runners (#427)
2023-09-21 20:36:30 -04:00
Wing Lian
196ff1181e
skip the gpu memory checks if the device is set to 'auto' (#609)
* skip the gpu memory checks if the device is set to 'auto'
* skip gpu mem logging if cpu too
* don't worry about log_gpu_memory_usage since it calls another annotated fn
* rename decorator internal
2023-09-21 15:20:31 -04:00
Wing Lian
92512c390b
ignore wandb to resolve isort headaches (#619)
2023-09-21 11:50:09 -04:00
Maxime
2fe95cdcc1
fix distributed devices (#612)
* fix distributed devices
* Update distributed.py
* Update distributed.py
2023-09-21 09:11:34 -04:00
Maxime
c1382e79b6
Create multi-node.md (#613)
* Create multi-node.md
* Update multi-node.md
* Update multi-node.md
2023-09-20 22:02:16 -04:00
Maxime
5d931cc042
Only run tests when a change to python files is made (#614)
* Update tests.yml
* Update .github/workflows/tests.yml
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-09-20 22:02:04 -04:00
Javier
ec0958f4f8
Update requirements.txt (#610)
2023-09-20 08:40:49 -04:00
Wing Lian
faecff9798
support to disable exllama for gptq (#604)
* support to disable exllama for gptq
* update property instead of item
* fix config key
2023-09-19 17:51:08 -04:00
bofeng huang
aa656e04bd
Delete duplicate lines (#606)
2023-09-19 16:40:05 -04:00
Wing Lian
b53e77775b
update dockerfile to not build evoformer since it fails the build (#607)
2023-09-19 16:28:29 -04:00
Wing Lian
674c57692d
more sane defaults for openllama 3b used for quickstarts (#602)
* more sane defaults for openllama 3b used for quickstarts
* don't use bf16 for quickstart to simplify gpu compatibility
* use the updated openlm-research/open_llama_3b_v2 models
2023-09-19 09:15:10 -04:00
Wing Lian
1eebbd09c3
improve handling for empty text on the tokenization step (#502)
2023-09-19 08:09:56 -04:00
Wing Lian
62a774140b
Fix for check with cfg and merge_lora (#600)
2023-09-18 21:14:32 -04:00
Wing Lian
31b9e0c6e8
minor tweaks to simplify (#597)
2023-09-18 11:45:44 -04:00
Wing Lian
6b9b229356
btlm and falcon monkey patches for flash attn (#566)
2023-09-17 13:49:18 -04:00
Wing Lian
131afdbd89
add bf16 check (#587)
2023-09-17 13:49:03 -04:00
NanoCode012
00dce35fb2
Feat(data): Allow loading local csv and text (#594)
* Feat(data): Allow loading local csv and text
* chore: update readme for loading data
2023-09-17 11:32:27 -04:00
Wing Lian
b15b19eb8d
gather/broadcast the max value of the packing efficiency automatically (#463)
2023-09-17 11:08:18 -04:00
Wing Lian
ab534d75ba
don't add position_ids for evals (#591)
2023-09-16 16:11:57 -04:00
Wing Lian
21ec195c9f
optionally configure sample packing for evals (#589)
2023-09-16 00:09:48 -04:00
Wing Lian
62eaee7649
make phi training work with Loras (#588)
* validation for phi loras
* fix model config class check
* update readme for phi training
2023-09-15 20:51:55 -04:00
Jan Philipp Harries
be75668400
set fsdp state dict (#584)
Co-authored-by: Jan Philipp Harries <jphme@users.noreply.github.com>
2023-09-15 17:47:36 -04:00
Wing Lian
aeec7c4688
pop block_cls since it's not an actual kwarg
2023-09-15 15:54:06 -04:00
Wing Lian
360788296a
don't resize embeddings if it's already large enough (#577)
* don't resize embeddings if it's already large enough
* make sure to tie weights, even if we aren't resizing
2023-09-15 15:47:09 -04:00
Wing Lian
12a2dbbc2c
Support Sample packing for phi arch (#586)
* phi sequence packing
* sample packing fixes
* fix linting
* fix inference and phi e2e tests
* update phi example now that sample packing works
* wandb import keeps getting moved around
2023-09-15 15:46:54 -04:00
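Sample packing, which several commits above touch on, concatenates multiple short examples into one sequence_len-sized row so compute isn't wasted on padding. A toy first-fit-decreasing sketch of the packing step (hypothetical helper, not the repo's actual multipack sampler):

```python
def pack_lengths(lengths, max_len):
    """Greedy first-fit-decreasing bin packing of sequence lengths.

    Groups lengths into bins whose totals never exceed max_len; each bin
    corresponds to one packed training row.
    """
    bins = []
    for length in sorted(lengths, reverse=True):
        for b in bins:
            if sum(b) + length <= max_len:
                b.append(length)
                break
        else:
            # No existing bin has room; open a new packed row
            bins.append([length])
    return bins
```

Packing efficiency (the quantity the gather/broadcast commit above refers to) would then be the filled fraction, sum of all lengths divided by len(bins) * max_len.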
NanoCode012
3a2edc85c3
Feat(doc): Add features to doc (#583)
2023-09-16 01:14:15 +09:00
Wing Lian
f7a22632d7
support custom field for completion from yml (#580)
* support custom field for completion from yml
* remove legacy completion check and add doc
* update README docs
2023-09-15 07:48:21 -04:00
Doan Minh Phuong
1aa400721e
Fix Codellama examples (#582)
* Fix seq_len
* Update lora.yml
* Update qlora.yml
* Update lora.yml
* Update lora.yml
* Update qlora.yml
2023-09-15 04:19:13 -04:00
Wing Lian
8dcd40ac78
prevent cli functions from getting fired on import (#581)
2023-09-15 04:03:32 -04:00
Wing Lian
a5a625f47e
update support matrix with btlm and phi (#579)
2023-09-15 02:46:15 -04:00
Wing Lian
861cecac2a
refactor scripts/finetune.py into new cli modules (#550)
* refactor scripts/finetune.py into new cli modules
* continue to support scripts/finetune.py
* update readme with updated cli commands
* Update scripts/finetune.py
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-09-15 01:43:52 -04:00
Wing Lian
1078d3eae7
E2e passing tests (#576)
* run e2e tests after all other checks have passed
* tweak tests so they get run on PRs or push to main
* change dependent action for checking
* one test workflow to rule them all
* no need for custom action, just use needs
* whoops, python version should be a string
* e2e tests can run on any available gpu
2023-09-15 01:03:49 -04:00
Wing Lian
24146733db
E2e device cuda (#575)
* use torch.cuda.current_device() instead of local_rank
* ignore NVML errors for gpu stats
* llama lora packing e2e tests
2023-09-14 22:49:27 -04:00
Wing Lian
9218ebecd2
e2e testing (#574)
2023-09-14 21:56:11 -04:00
Wing Lian
228420972e
Phi examples (#569)
* add phi full ft example
* Add readme to point out that deepspeed should be used
* zero1 is better than zero2 for phi
2023-09-14 11:17:47 -04:00
Wing Lian
c6d870b91d
mypy wandb ignore (#572)
* mypy wandb ignore
* fix isort for wandb
2023-09-14 11:17:30 -04:00