Wing Lian
e8d3da0081
upgrade pytorch from 2.4.0 => 2.4.1 ( #1950 )
...
* upgrade pytorch from 2.4.0 => 2.4.1
* update xformers for updated pytorch version
* handle xformers version case for torch==2.3.1
2024-10-09 11:53:56 -04:00
Wing Lian
4ca0a47cfb
add 2.4.1 to base models ( #1953 )
2024-10-09 08:43:11 -04:00
Wing Lian
3853ab7ae9
bump accelerate to 0.34.2 ( #1901 )
...
* bump accelerate
* add fixture to predownload the test model
* change fixture
2024-09-07 14:39:31 -04:00
Wing Lian
93b769a979
lint fix and update gha regex ( #1899 )
2024-09-05 09:58:21 -04:00
Wing Lian
3c6b9eda2e
run pytests with varied pytorch versions too ( #1883 )
2024-08-31 22:49:35 -04:00
Wing Lian
e8ff5d5738
don't mess with bnb since it needs compiled wheels ( #1859 )
2024-08-23 12:18:47 -04:00
Wing Lian
b33dc07a77
rename nightly test and add badge ( #1853 )
2024-08-22 13:13:33 -04:00
Wing Lian
dcbff16983
run nightly ci builds against upstream main ( #1851 )
...
* run nightly ci builds against upstream main
* add test badges
* run the multigpu tests against nightly main builds too
2024-08-22 13:10:54 -04:00
Wing Lian
54392ac8a6
Attempt to run multigpu in PR CI for now to ensure it works ( #1815 ) [skip ci]
...
* Attempt to run multigpu in PR CI for now to ensure it works
* fix yaml file
* forgot to include multigpu tests
* fix call to cicd.multigpu
* dump dictdefault to dict for yaml conversion
* use to_dict instead of casting
* 16bit-lora w flash attention, 8bit lora seems problematic
* add llama fsdp test
* more tests
* Add test for qlora + fsdp with prequant
* limit accelerate to 2 processes and disable broken qlora+fsdp+bnb test
* move multigpu tests to biweekly
2024-08-09 11:50:13 -04:00
Wing Lian
70978467a0
skip no commit to main on ci ( #1814 )
2024-08-06 15:25:54 -04:00
Wing Lian
dbf8fb549e
publish axolotl images without extras in the tag name ( #1798 )
2024-07-30 13:36:19 -04:00
Wing Lian
9a63884597
update test and main/nightly builds ( #1797 )
...
* update test and main/nightly builds
* don't install mamba-ssm on 2.4.0 since it has no wheels yet
2024-07-30 12:37:40 -04:00
Wing Lian
c5587b45ac
use 12.4.1 instead of 12.4 [skip-ci] ( #1796 )
2024-07-30 08:50:23 -04:00
Wing Lian
d4f6a6b103
fix dockerfile and base builder ( #1795 ) [skip-ci]
2024-07-30 08:34:37 -04:00
Wing Lian
d8d1788ffc
move to supporting mostly 12.1 w 2.3.1 and add new 12.4 with 2.4.0 ( #1793 )
2024-07-30 08:06:11 -04:00
Wing Lian
e1725aef2b
update modal package and don't cache pip install ( #1757 )
...
* update modal package and cleanup pip cache
* more verbosity on the test
2024-07-16 14:45:38 -04:00
Wing Lian
1e57b4c562
update to pytorch 2.3.1 ( #1746 ) [skip ci]
2024-07-13 13:28:17 -04:00
Wing Lian
137d84d1b4
add torch 2.3.1 base image ( #1745 )
2024-07-13 09:41:51 -04:00
Wing Lian
a159724e44
bump trl and accelerate for latest releases ( #1730 )
...
* bump trl and accelerate for latest releases
* ensure that the CI runs on new gh org
* drop kto_pair support since removed upstream
2024-07-10 11:15:44 -04:00
Wing Lian
ef223519c9
update deps ( #1663 ) [skip ci]
...
* update deps and tweak logic so axolotl is pip installable
* use vcs url format
* using dependency_links isn't supported per docs
2024-05-28 11:23:34 -04:00
Wing Lian
60113437e4
cloud image w/o tmux ( #1628 )
2024-05-15 22:27:40 -04:00
Wing Lian
3319780300
update torch 2.2.1 -> 2.2.2 ( #1622 )
2024-05-15 09:45:27 -04:00
Wing Lian
70185763f6
add torch 2.3.0 to builds ( #1593 )
2024-05-05 18:45:45 -04:00
Wing Lian
c10563c444
fix broken linting ( #1541 )
...
* chore: lint
* include examples in yaml check
* mistral decided to gate their models...
* more mistral models that were gated
2024-04-19 01:03:04 -04:00
Wing Lian
4a92a3b9ee
Nightlies fix v4 ( #1458 ) [skip ci]
...
* another attempt at github actions
* try again
2024-03-29 11:04:34 -04:00
Wing Lian
46a73e3d1a
fix yaml parsing for workflow ( #1457 ) [skip ci]
2024-03-29 10:21:08 -04:00
Wing Lian
da3415bb5a
fix how nightly tag is generated ( #1456 ) [skip ci]
2024-03-29 09:29:17 -04:00
Wing Lian
8cb127abeb
configure nightly docker builds ( #1454 ) [skip ci]
...
* configure nightly docker builds
* also test update pytorch in modal ci
2024-03-29 08:25:45 -04:00
Wing Lian
05b398a072
fix some of the edge cases for Jamba ( #1452 )
...
* fix some of the edge cases for Jamba
* update requirements for jamba
2024-03-29 02:38:02 -04:00
Wing Lian
da265dd796
fix for accelerate env var for auto bf16, add new base image and expand torch_cuda_arch_list support ( #1413 )
2024-03-26 16:46:19 -04:00
Hamel Husain
4e69aa48ab
Update docs.yml
2024-03-21 22:36:57 -07:00
Hamel Husain
629450cecd
Bootstrap Hosted Axolotl Docs w/Quarto ( #1429 )
...
* precommit
* mv styles.css
* fix links
2024-03-21 22:28:36 -07:00
Wing Lian
7803f0934f
fixes for dpo and orpo template loading ( #1424 )
2024-03-20 11:36:24 -04:00
Wing Lian
00018629e7
run tests again on Modal ( #1289 ) [skip ci]
...
* run tests again on Modal
* make sure to run the full suite of tests on modal
* run cicd steps via shell script
* run tests in different runs
* increase timeout
* split tests into steps on modal
* increase workflow timeout
* retry doing this with only a single script
* fix yml launch for modal ci
* reorder tests to run on modal
* skip dpo tests on modal
* run on L4s, A10G takes too long
* increase CPU and RAM for modal test
* run modal tests on A100s
* skip phi test on modal
* env not arg in modal dockerfile
* upgrade pydantic and fastapi for modal tests
* cleanup stray character
* use A10s instead of A100 for modal
2024-02-29 14:26:26 -05:00
Wing Lian
6d4bbb877f
deprecate py 3.9 support, set min pytorch version ( #1343 ) [skip ci]
2024-02-28 12:58:05 -05:00
Wing Lian
5894f0e57e
make mlflow optional ( #1317 )
...
* make mlflow optional
* fix xformers
  don't patch swiglu if xformers not working
  fix the check for xformers swiglu
* fix install of xformers with extra index url for docker builds
* fix docker build arg quoting
2024-02-26 11:41:33 -05:00
NanoCode012
a359579371
deprecate: pytorch 2.0.1 image ( #1315 ) [skip ci]
...
* deprecate: pytorch 2.0.1 image
* deprecate from main image
* Update main.yml
* Update tests.yml
2024-02-22 11:39:47 +09:00
Wing Lian
ea00dd0852
don't use load and push together ( #1284 )
2024-02-09 14:54:31 -05:00
Wing Lian
aaf54dc730
run the docker image builds and push on gh action gpu runners ( #1218 )
2024-02-09 10:32:54 -05:00
Wing Lian
8da1633124
Revert "run PR e2e docker CI tests in Modal" ( #1220 ) [skip ci]
2024-01-26 16:50:44 -05:00
Wing Lian
36d053f6f0
run PR e2e docker CI tests in Modal ( #1217 ) [skip ci]
...
* wip modal for ci
* handle falcon layernorms better
* update
* rebuild the template each time with the pseudo-ARGS
* fix ref
* update tests to use modal
* cleanup ci script
* make sure to install jinja2 also
* kickoff the gh action on gh hosted runners and specify num gpus
2024-01-26 16:13:27 -05:00
Wing Lian
1b180034c7
ensure the tests use the same version of torch as the latest base docker images ( #1215 ) [skip ci]
2024-01-26 10:38:30 -05:00
Wing Lian
74c72ca5eb
drop py39 docker images, add py311, upgrade pytorch to 2.1.2 ( #1205 )
...
* drop py39 docker images, add py311, upgrade pytorch to 2.1.2
* also allow the main build to be manually triggered
* fix workflow_dispatch in yaml
2024-01-26 00:38:49 -05:00
Wing Lian
badda3783b
make sure to register the base chatml template even if no system message is provided ( #1207 )
2024-01-25 10:38:08 -05:00
Wing Lian
0f77b8d798
add commit message option to skip docker image builds in ci ( #1168 ) [skip ci]
2024-01-22 19:55:36 -05:00
Wing Lian
ece0211996
Agnostic cloud gpu docker image and Jupyter lab ( #1097 )
2024-01-15 22:37:54 -05:00
Hamel Husain
2dc431078c
Add link on README to Docker Debugging ( #1107 )
...
* add docker debug
* Update docs/debugging.md
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* explain editable install
* explain editable install
* upload new video
* add link to README
* Update README.md
* Update README.md
* chore: lint
* make sure to lint markdown too
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-01-12 08:51:35 -05:00
Mark Saroufim
44ba616da2
Fix broken pypi.yml ( #1099 ) [skip ci]
2024-01-11 12:35:31 -05:00
Wing Lian
6c19e9302a
add python 3.11 to the matrix for unit tests ( #1085 ) [skip ci]
2024-01-10 13:02:01 -05:00
Wing Lian
9032e610b1
use tags again for test image, only run docker e2e after pre-commit checks ( #1081 )
2024-01-10 09:04:56 -05:00