bursteratom
4698eed43f
set pixtral chat template
2024-12-04 12:11:21 -05:00
bursteratom
f84c3b37e7
lint
2024-12-04 11:59:45 -05:00
bursteratom
c39971c659
stuff
2024-11-27 10:52:36 -05:00
bursteratom
33a178c788
val config pixtral chat template
2024-11-27 10:36:23 -05:00
bursteratom
db15605e7e
pixtral chat template
2024-11-27 10:34:19 -05:00
bursteratom
9e112bc8b5
lint
2024-11-27 10:33:35 -05:00
bursteratom
e038410778
lint
2024-11-27 10:24:37 -05:00
bursteratom
f4385c3cf4
add special tokens
2024-11-27 10:18:45 -05:00
bursteratom
d58c772df6
pixtral flash-attn false
2024-11-27 10:16:17 -05:00
bursteratom
69265a53b5
stuff
2024-11-27 09:53:41 -05:00
Wing Lian
724b660d56
move shared pytest conftest to top level tests ( #2099 ) [skip ci]
...
* move shared pytest conftest to top level tests
* add __init__ so mypy doesn't choke on multiple conftests
2024-11-22 15:05:42 -05:00
Aman Karmani
51c9e1a035
.gitignore improvements ( #349 ) [skip ci]
2024-11-22 11:08:54 -05:00
Sunny Liu
45c0825587
updated colab notebook ( #2074 )
...
* updated colab notebook
* update pip installation
* cleared cell output
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* modified notebook
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update examples/colab-notebooks/colab-axolotl-example.ipynb
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* cleared cell output
* cleared unnecessary logs
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
2024-11-22 10:09:10 -05:00
Wing Lian
94fc223f6c
actions/create-release is unmaintained, and doesn't create proper release notes ( #2098 ) [skip ci]
2024-11-21 14:32:41 -05:00
Sunny Liu
151abb7a67
fix None-type not iterable error when deepspeed is left blank w/ use_… ( #2087 )
...
* fix None-type not iterable error when deepspeed is left blank w/ use_reentrant: false and qlora
* added unit test [skip e2e]
* corrected test case [skip e2e]
* assert warning message [skip e2e]
* assert warning message [skip e2e]
* corrected test cases [skip e2e]
* lint
2024-11-21 13:36:51 -05:00
Sunny Liu
bf416bdfd0
bump_liger_0.4.2 ( #2096 )
2024-11-21 13:24:52 -05:00
Mengqing Cao
838b74d05b
Add Ascend NPU support ( #1758 )
2024-11-20 21:28:41 -05:00
Wing Lian
2e99bb303e
fix inference when no chat_template is set, fix unsloth dora check ( #2092 )
...
* fix inference when no chat_template is set, fix unsloth dora check
* remove old unsloth version check
* update docs on installing unsloth
2024-11-20 14:07:54 -05:00
Chirag Jain
68a26f1005
Fix duplication of plugin callbacks ( #2090 )
2024-11-20 14:06:08 -05:00
Wing Lian
db51a9e4cb
use pep440 instead of semver ( #2088 ) [skip ci]
2024-11-19 15:02:10 -05:00
Wing Lian
8961364bc9
release 0.5.2 ( #2086 )
2024-11-19 12:44:42 -05:00
Wing Lian
e9c3a2aec0
add missing dunder-init for monkeypatches and add tests for install from sdist ( #2085 )
...
* add missing dunder-init for monkeypatches and add tests for install from sdist
* fix gha name
* reduce matrix for sdist test
v0.5.2
2024-11-19 12:43:30 -05:00
Wing Lian
02ca3f93b0
set manifest and fix for source dist ( #2084 )
v0.5.1.post1
2024-11-19 11:31:56 -05:00
Wing Lian
5f6f9186e4
make sure action has permission to create release ( #2083 ) [skip ci]
2024-11-19 10:43:02 -05:00
Wing Lian
6679e20f47
release version 0.5.1 ( #2082 )
2024-11-19 10:35:59 -05:00
Wing Lian
ec59d4cb83
remove deprecated extra metadata kwarg from pydantic Field ( #2081 ) [skip ci]
2024-11-19 10:30:10 -05:00
Wing Lian
a77c8a71cf
fix brackets on docker ci builds, add option to skip e2e builds [skip e2e] ( #2080 ) [skip ci]
2024-11-19 10:29:31 -05:00
Wing Lian
775311f98f
add optimizer step to prevent warning in tests ( #1502 ) [skip ci]
...
* add optimizer step to prevent warning in tests
* add optimizer step to warmup as well
2024-11-19 10:19:03 -05:00
NanoCode012
f007c38e49
Feat: Drop long samples and shuffle rl samples ( #2040 ) [skip ci]
...
* feat: LOG warn if samples are dropped due to seq length
* feat: add drop long samples for RL
* feat: add ipo
* fix: remove num_proc for map as subprocesses are prone to die
* feat: shuffle rl dataset
* fix: support preprocess for kto
* chore: use set instead of list
* feat: add simpo
2024-11-19 10:18:24 -05:00
Wing Lian
d9b71edf84
bump transformers for fsdp-grad-accum fix, remove patch ( #2079 )
2024-11-19 02:23:09 -05:00
Wing Lian
c07bd2fa65
Readme updates v2 ( #2078 )
...
* update readme logos
* use full logo
* Fix svgs
* add srcset
* resize svgs to match
* Rename file
* align badges center
2024-11-18 14:58:03 -05:00
Wing Lian
ed079d434a
static assets, readme, and badges update v1 ( #2077 )
2024-11-18 13:59:32 -05:00
Wing Lian
8403c67156
don't build bdist ( #2076 ) [skip ci]
2024-11-18 12:36:03 -05:00
Wing Lian
9871fa060b
optim e2e tests to run a bit faster ( #2069 ) [skip ci]
...
* optim e2e tests to run a bit faster
* run prequant w/o lora_modules_to_save
* use smollm2
2024-11-18 12:35:31 -05:00
Wing Lian
70cf79ef52
upgrade autoawq==0.2.7.post2 for transformers fix ( #2070 )
...
* point to upstream autoawq for transformers fix
* use autoawq 0.2.7 release
* test wheel for awq
* try different format for wheel def
* autoawq re-release
* Add intel_extension_for_pytorch dep
* ipex gte version
* forcefully remove intel-extension-for-pytorch
* add -y option to pip uninstall for ipex
* use post2 release for autoawq and remove uninstall of ipex
2024-11-18 11:53:37 -05:00
Wing Lian
c06b8f0243
increase worker count to 8 for basic pytests ( #2075 ) [skip ci]
2024-11-18 11:52:35 -05:00
Chirag Jain
0c8b1d824a
Update get_unpad_data patching for multipack ( #2013 )
...
* Update `get_unpad_data` patching for multipack
* Update src/axolotl/utils/models.py
* Update src/axolotl/utils/models.py
* Add test case
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Wing Lian <wing@axolotl.ai>
2024-11-15 20:35:50 -05:00
NanoCode012
fd70eec577
fix: loading locally downloaded dataset ( #2056 ) [skip ci]
2024-11-15 20:35:26 -05:00
Wing Lian
d42f202046
Fsdp grad accum monkeypatch ( #2064 )
2024-11-15 19:11:04 -05:00
Wing Lian
0dabde1962
support for schedule free and e2e ci smoke test ( #2066 ) [skip ci]
...
* support for schedule free and e2e ci smoke test
* set default lr scheduler to constant in test
* ignore duplicate code
* fix quotes for config/dict
2024-11-15 19:10:14 -05:00
Wing Lian
15f1462ccd
support passing trust_remote_code to dataset loading ( #2050 ) [skip ci]
...
* support passing trust_remote_code to dataset loading
* add doc for trust_remote_code in dataset config
2024-11-15 19:09:48 -05:00
Wing Lian
521e62daf1
remove the bos token from dpo outputs ( #1733 ) [skip ci]
...
* remove the bos token from dpo outputs
* don't forget to fix prompt_input_ids too
* use processing_class instead of tokenizer
* fix for processing class
2024-11-15 19:09:20 -05:00
Wing Lian
c16ec398d7
update to be deprecated evaluation_strategy ( #1682 ) [skip ci]
...
* update to be deprecated evaluation_strategy and c4 dataset
* chore: lint
* remap eval strategy to new config and add tests
2024-11-15 19:09:00 -05:00
Wing Lian
2f20cb7ebf
upgrade datasets==3.1.0 and add upstream check ( #2067 ) [skip ci]
2024-11-15 19:08:38 -05:00
Wing Lian
71d4030b79
gradient accumulation tests, embeddings w pad_token fix, smaller models ( #2059 )
...
* add more test cases for gradient accumulation and fix zero3
* swap out for smaller model
* fix missing return
* fix missing pad_token in config
* support concurrency for multigpu testing
* cast empty deepspeed to empty string for zero3 check
* fix temp_dir as fixture so parametrize works properly
* fix test file for multigpu evals
* don't use default
* don't use default for fsdp_state_dict_type
* don't use llama tokenizer w smollm
* also automatically cancel multigpu for concurrency
2024-11-14 12:59:00 -05:00
Wing Lian
f3a5d119af
fix env var extraction ( #2043 ) [skip ci]
2024-11-14 12:58:06 -05:00
Wing Lian
ba219b51a5
fix duplicate base build ( #2061 ) [skip ci]
2024-11-14 10:31:19 -05:00
Wing Lian
5be8e13d35
make sure to add tags for versioned tag on cloud docker images ( #2060 )
2024-11-14 10:24:49 -05:00
Wing Lian
2d7830fda6
upgrade to flash-attn 2.7.0 ( #2048 )
2024-11-14 06:59:25 -05:00
Wing Lian
5e98cdddac
Grokfast support ( #1917 )
2024-11-13 17:10:36 -05:00