Wing Lian
5f1d98e8fc
add e2e tests for Unsloth qlora and test the builds ( #2093 )
...
* see if unsloth installs cleanly in ci
* check unsloth install on regular tests, not sdist
* fix ampere check exception for ci
* use cached_property instead
* add an e2e test for unsloth qlora
* reduce seq len and mbsz to prevent oom in ci
* add checks for fp16 and sdp_attention
* pin unsloth to a specific release
* add unsloth to docker image too
* fix flash attn xentropy patch
* fix loss, add check for loss when using fa_xentropy
* fix special tokens for test
* typo
* test fa xentropy with and without gradient accum
* pr feedback changes
2024-11-29 20:38:49 -05:00
Wing Lian
1cf7075d18
support separate lr for embeddings, similar to loraplus ( #1910 ) [skip ci]
...
* support separate lr for embeddings, similar to loraplus
* add test case for train w lr embedding scale
* use kwarg for optimizer
* make sure to handle the optimizer creation
* make sure to handle for embedding_lr too
* use smollm for e2e, check for embeddings lr first before wdecay
2024-11-29 20:38:20 -05:00
NanoCode012
f4cabc2351
fix: ds3 and fsdp lmbench eval ( #2102 ) [skip ci]
...
* fix: ds3 and fsdp lmbench eval
* chore: update comment
* fix: test signature
2024-11-29 20:37:49 -05:00
Wing Lian
6e0fb4a6b2
add finetome dataset to fixtures, check eval_loss in test ( #2106 ) [skip ci]
...
* add finetome dataset to fixtures, check eval_loss in test
* add qwen 0.5b to pytest session fixture
2024-11-29 20:37:32 -05:00
Wing Lian
724b660d56
move shared pytest conftest to top level tests ( #2099 ) [skip ci]
...
* move shared pytest conftest to top level tests
* add __init__ so mypy doesn't choke on multiple conftests
2024-11-22 15:05:42 -05:00
Aman Karmani
51c9e1a035
.gitignore improvements ( #349 ) [skip ci]
2024-11-22 11:08:54 -05:00
Sunny Liu
45c0825587
updated colab notebook ( #2074 )
...
* updated colab notebook
* update pip installation
* cleared cell output
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* modified notebook
* cleared cell output
* cleared unnecessary logs
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
2024-11-22 10:09:10 -05:00
Wing Lian
94fc223f6c
actions/create-release is unmaintained, and doesn't create proper release notes ( #2098 ) [skip ci]
2024-11-21 14:32:41 -05:00
Sunny Liu
151abb7a67
fix None-type not iterable error when deepspeed is left blank w/ use_… ( #2087 )
...
* fix None-type not iterable error when deepspeed is left blank w/ use_reentrant: false and qlora
* added unit test[skip e2e]
* corrected test case[skip e2e]
* assert warning message [skip e2e]
* corrected test cases [skip e2e]
* lint
2024-11-21 13:36:51 -05:00
Sunny Liu
bf416bdfd0
bump_liger_0.4.2 ( #2096 )
2024-11-21 13:24:52 -05:00
Mengqing Cao
838b74d05b
Add Ascend NPU support ( #1758 )
2024-11-20 21:28:41 -05:00
Wing Lian
2e99bb303e
fix inference when no chat_template is set, fix unsloth dora check ( #2092 )
...
* fix inference when no chat_template is set, fix unsloth dora check
* remove old unsloth version check
* update docs on installing unsloth
2024-11-20 14:07:54 -05:00
Chirag Jain
68a26f1005
Fix duplication of plugin callbacks ( #2090 )
2024-11-20 14:06:08 -05:00
Wing Lian
db51a9e4cb
use pep440 instead of semver ( #2088 ) [skip ci]
2024-11-19 15:02:10 -05:00
Wing Lian
8961364bc9
release 0.5.2 ( #2086 )
2024-11-19 12:44:42 -05:00
Wing Lian
e9c3a2aec0
add missing dunder-init for monkeypatches and add tests for install from sdist ( #2085 )
...
* add missing dunder-init for monkeypatches and add tests for install from sdist
* fix gha name
* reduce matrix for sdist test
v0.5.2
2024-11-19 12:43:30 -05:00
Wing Lian
02ca3f93b0
set manifest and fix for source dist ( #2084 )
v0.5.1.post1
2024-11-19 11:31:56 -05:00
Wing Lian
5f6f9186e4
make sure action has permission to create release ( #2083 ) [skip ci]
2024-11-19 10:43:02 -05:00
Wing Lian
6679e20f47
release version 0.5.1 ( #2082 )
2024-11-19 10:35:59 -05:00
Wing Lian
ec59d4cb83
remove deprecated extra metadata kwarg from pydantic Field ( #2081 ) [skip ci]
2024-11-19 10:30:10 -05:00
Wing Lian
a77c8a71cf
fix brackets on docker ci builds, add option to skip e2e builds [skip e2e] ( #2080 ) [skip ci]
2024-11-19 10:29:31 -05:00
Wing Lian
775311f98f
add optimizer step to prevent warning in tests ( #1502 ) [skip ci]
...
* add optimizer step to prevent warning in tests
* add optimizer step to warmup as well
2024-11-19 10:19:03 -05:00
NanoCode012
f007c38e49
Feat: Drop long samples and shuffle rl samples ( #2040 ) [skip ci]
...
* feat: LOG warn if samples are dropped due to seq length
* feat: add drop long samples for RL
* feat: add ipo
* fix: remove num_proc for map as subprocesses are prone to die
* feat: shuffle rl dataset
* fix: support preprocess for kto
* chore: use set instead of list
* feat: add simpo
2024-11-19 10:18:24 -05:00
Wing Lian
d9b71edf84
bump transformers for fsdp-grad-accum fix, remove patch ( #2079 )
2024-11-19 02:23:09 -05:00
Wing Lian
c07bd2fa65
Readme updates v2 ( #2078 )
...
* update readme logos
* use full logo
* Fix svgs
* add srcset
* resize svgs to match
* Rename file
* align badges center
2024-11-18 14:58:03 -05:00
Wing Lian
ed079d434a
static assets, readme, and badges update v1 ( #2077 )
2024-11-18 13:59:32 -05:00
Wing Lian
8403c67156
don't build bdist ( #2076 ) [skip ci]
2024-11-18 12:36:03 -05:00
Wing Lian
9871fa060b
optim e2e tests to run a bit faster ( #2069 ) [skip ci]
...
* optim e2e tests to run a bit faster
* run prequant w/o lora_modules_to_save
* use smollm2
2024-11-18 12:35:31 -05:00
Wing Lian
70cf79ef52
upgrade autoawq==0.2.7.post2 for transformers fix ( #2070 )
...
* point to upstream autoawq for transformers fix
* use autoawq 0.2.7 release
* test wheel for awq
* try different format for wheel def
* autoawq re-release
* Add intel_extension_for_pytorch dep
* ipex gte version
* forcefully remove intel-extension-for-pytorch
* add -y option to pip uninstall for ipex
* use post2 release for autoawq and remove uninstall of ipex
2024-11-18 11:53:37 -05:00
Wing Lian
c06b8f0243
increase worker count to 8 for basic pytests ( #2075 ) [skip ci]
2024-11-18 11:52:35 -05:00
Chirag Jain
0c8b1d824a
Update get_unpad_data patching for multipack ( #2013 )
...
* Update `get_unpad_data` patching for multipack
* Update src/axolotl/utils/models.py
* Add test case
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Wing Lian <wing@axolotl.ai>
2024-11-15 20:35:50 -05:00
NanoCode012
fd70eec577
fix: loading locally downloaded dataset ( #2056 ) [skip ci]
2024-11-15 20:35:26 -05:00
Wing Lian
d42f202046
Fsdp grad accum monkeypatch ( #2064 )
2024-11-15 19:11:04 -05:00
Wing Lian
0dabde1962
support for schedule free and e2e ci smoke test ( #2066 ) [skip ci]
...
* support for schedule free and e2e ci smoke test
* set default lr scheduler to constant in test
* ignore duplicate code
* fix quotes for config/dict
2024-11-15 19:10:14 -05:00
Wing Lian
15f1462ccd
support passing trust_remote_code to dataset loading ( #2050 ) [skip ci]
...
* support passing trust_remote_code to dataset loading
* add doc for trust_remote_code in dataset config
2024-11-15 19:09:48 -05:00
Wing Lian
521e62daf1
remove the bos token from dpo outputs ( #1733 ) [skip ci]
...
* remove the bos token from dpo outputs
* don't forget to fix prompt_input_ids too
* use processing_class instead of tokenizer
* fix for processing class
2024-11-15 19:09:20 -05:00
Wing Lian
c16ec398d7
update to be deprecated evaluation_strategy ( #1682 ) [skip ci]
...
* update to be deprecated evaluation_strategy and c4 dataset
* chore: lint
* remap eval strategy to new config and add tests
2024-11-15 19:09:00 -05:00
Wing Lian
2f20cb7ebf
upgrade datasets==3.1.0 and add upstream check ( #2067 ) [skip ci]
2024-11-15 19:08:38 -05:00
Wing Lian
71d4030b79
gradient accumulation tests, embeddings w pad_token fix, smaller models ( #2059 )
...
* add more test cases for gradient accumulation and fix zero3
* swap out for smaller model
* fix missing return
* fix missing pad_token in config
* support concurrency for multigpu testing
* cast empty deepspeed to empty string for zero3 check
* fix temp_dir as fixture so parametrize works properly
* fix test file for multigpu evals
* don't use default
* don't use default for fsdp_state_dict_type
* don't use llama tokenizer w smollm
* also automatically cancel multigpu for concurrency
2024-11-14 12:59:00 -05:00
Wing Lian
f3a5d119af
fix env var extraction ( #2043 ) [skip ci]
2024-11-14 12:58:06 -05:00
Wing Lian
ba219b51a5
fix duplicate base build ( #2061 ) [skip ci]
2024-11-14 10:31:19 -05:00
Wing Lian
5be8e13d35
make sure to add tags for versioned tag on cloud docker images ( #2060 )
2024-11-14 10:24:49 -05:00
Wing Lian
2d7830fda6
upgrade to flash-attn 2.7.0 ( #2048 )
2024-11-14 06:59:25 -05:00
Wing Lian
5e98cdddac
Grokfast support ( #1917 )
2024-11-13 17:10:36 -05:00
Sunny Liu
1d7aee0ad2
ADOPT optimizer integration ( #2032 ) [skip ci]
...
* adopt integration
* stuff
* doc and test for ADOPT
* rearrangement
* fixed formatting
* hacking pre-commit
* chore: lint
* update module doc for adopt optimizer
* remove un-necessary example yaml for adopt optimizer
* skip test adopt if torch<2.5.1
* formatting
* use version.parse
* specifies required torch version for adopt_adamw
---------
Co-authored-by: sunny <sunnyliu19981005@gmail.com>
Co-authored-by: Wing Lian <wing@axolotl.ai>
2024-11-13 17:10:17 -05:00
Wing Lian
659ee5d723
don't cancel the tests on main automatically for concurrency ( #2055 ) [skip ci]
2024-11-13 17:07:41 -05:00
Sunny Liu
342935cff3
Update unsloth for torch.cuda.amp deprecation ( #2042 )
...
* update deprecated unsloth torch cuda amp decorator
* WIP fix torch.cuda.amp deprecation
* lint
* laxing torch version requirement
* remove use of partial
* lint
---------
Co-authored-by: sunny <sunnyliu19981005@gmail.com>
2024-11-13 15:17:34 -05:00
Wing Lian
c5eb9ea2c2
fix push to main and tag semver build for docker ci ( #2054 )
2024-11-13 14:04:28 -05:00
Wing Lian
f2145a3ccb
add default torch version if not installed, and support for xformers new wheels ( #2049 )
2024-11-13 13:16:47 -05:00
Wing Lian
010d0e7ff3
retry flaky test_packing_stream_dataset test that times out on read ( #2052 ) [skip ci]
2024-11-13 13:16:16 -05:00