Wing Lian
aae4337f40
add 12.8.1 cuda to the base matrix (#2426)
* add 12.8.1 cuda to the base matrix
* use nightly
* bump deepspeed and set no binary
* deepspeed binary fixes hopefully
* install deepspeed by itself
* multiline fix
* make sure ninja is installed
* try with reversion of packaging/setuptools/wheel install
* use license instead of license-file
* try rolling back packaging and setuptools versions
* comment out license for validation for now
* make sure packaging version is consistent
* more parity across tests and docker images for packaging/setuptools
2025-03-21 10:17:25 -04:00
NanoCode012
fd8cb32547
chore: remove redundant py310 from tests (#2316)
2025-02-07 21:34:16 -05:00
Wing Lian
a971eb4ce6
Torch 2.6 support for base docker image (#2312)
2025-02-05 09:24:02 -05:00
salman
c071a530f7
removing 2.3.1 (#2294)
2025-01-28 23:23:44 -05:00
Wing Lian
a4f4a56d77
build causal_conv1d and mamba-ssm into the base image (#2113)
* build causal_conv1d and mamba-ssm into the base image
* also build base images on changes to Dockerfile-base and base workflow yaml
2024-12-02 18:27:46 -05:00
Wing Lian
ba219b51a5
fix duplicate base build (#2061) [skip ci]
2024-11-14 10:31:19 -05:00
Wing Lian
f68fb71005
update actions version for node16 deprecation (#2037) [skip ci]
* update actions version for node16 deprecation
* update pre-commit/action to use 3.0.1 for actions/cache@v4 dep
* update docker/setup-buildx-action too to v3
2024-11-11 15:09:11 -05:00
Wing Lian
9bc3ee6c75
add axolotlai docker hub org to publish list (#2031)
* add axolotlai docker hub org to publish list
* fix to use latest actions docker metadata version
* fix list in yaml for expected format for action
* missed a change
2024-11-11 09:48:19 -05:00
Wing Lian
3591bcfaf9
add torch 2.5.1 for base image (#2010)
2024-10-31 13:27:49 -04:00
Wing Lian
67f744dc8c
add pytorch 2.5.0 base images (#1979)
* add pytorch 2.5.0 base images
* make sure num examples for debug is zero and fix comparison
2024-10-18 03:36:51 -04:00
Wing Lian
e8d3da0081
upgrade pytorch from 2.4.0 => 2.4.1 (#1950)
* upgrade pytorch from 2.4.0 => 2.4.1
* update xformers for updated pytorch version
* handle xformers version case for torch==2.3.1
2024-10-09 11:53:56 -04:00
Wing Lian
4ca0a47cfb
add 2.4.1 to base models (#1953)
2024-10-09 08:43:11 -04:00
Wing Lian
c5587b45ac
use 12.4.1 instead of 12.4 [skip-ci] (#1796)
2024-07-30 08:50:23 -04:00
Wing Lian
d4f6a6b103
fix dockerfile and base builder (#1795) [skip-ci]
2024-07-30 08:34:37 -04:00
Wing Lian
d8d1788ffc
move to supporting mostly 12.1 w 2.3.1 and add new 12.4 with 2.4.0 (#1793)
2024-07-30 08:06:11 -04:00
Wing Lian
137d84d1b4
add torch 2.3.1 base image (#1745)
2024-07-13 09:41:51 -04:00
Wing Lian
a159724e44
bump trl and accelerate for latest releases (#1730)
* bump trl and accelerate for latest releases
* ensure that the CI runs on new gh org
* drop kto_pair support since removed upstream
2024-07-10 11:15:44 -04:00
Wing Lian
3319780300
update torch 2.2.1 -> 2.2.2 (#1622)
2024-05-15 09:45:27 -04:00
Wing Lian
70185763f6
add torch 2.3.0 to builds (#1593)
2024-05-05 18:45:45 -04:00
Wing Lian
da265dd796
fix for accelerate env var for auto bf16, add new base image and expand torch_cuda_arch_list support (#1413)
2024-03-26 16:46:19 -04:00
NanoCode012
a359579371
deprecate: pytorch 2.0.1 image (#1315) [skip ci]
* deprecate: pytorch 2.0.1 image
* deprecate from main image
* Update main.yml
* Update tests.yml
2024-02-22 11:39:47 +09:00
Wing Lian
aaf54dc730
run the docker image builds and push on gh action gpu runners (#1218)
2024-02-09 10:32:54 -05:00
Wing Lian
74c72ca5eb
drop py39 docker images, add py311, upgrade pytorch to 2.1.2 (#1205)
* drop py39 docker images, add py311, upgrade pytorch to 2.1.2
* also allow the main build to be manually triggered
* fix workflow_dispatch in yaml
2024-01-26 00:38:49 -05:00
Wing Lian
37820f6540
support for cuda 12.1 (#989)
2023-12-22 11:08:22 -05:00
Wing Lian
161bcb6517
Dockerfile torch fix (#987)
* add torch to requirements.txt at build time to force version to stick
* fix xformers check
* better handling of xformers based on installed torch version
* fix for ci w/o torch
2023-12-21 09:38:20 -05:00
Wing Lian
7f2618b5f4
add docker images for pytorch 2.10 (#697)
2023-10-07 12:23:31 -04:00
Wing Lian
2c37bf6c21
Prune cuda117 (#327)
* drop cuda117/torch 1.13.1 from support, pin flash attention to v2.0.1, rm torchvision/torchaudio install
* gptq base build not needed. add sm 9.0 support
2023-07-26 16:27:49 -04:00
Wing Lian
c5df969262
don't use the gha cache w docker
2023-07-22 08:46:21 -04:00
Wing Lian
c58034d48c
use pytorch 2.0.1
2023-07-20 00:47:13 -04:00
Wing Lian
a10da1caff
11.7.0 nvidia/cuda docker images are deprecated, move to 11.7.1
2023-07-01 00:29:07 -04:00
Wing Lian
d35278aaf1
don't fail fast
2023-06-15 16:01:27 -04:00
Wing Lian
e0011fdf55
Fix base builder, missing tags
2023-05-31 09:52:03 -04:00
Wing Lian
c5b0af1a7e
define python version (3.10) explicitly as string in yaml
2023-05-30 22:23:35 -04:00
Wing Lian
c43c5c84ff
py310, fix cuda arg in deepspeed
2023-05-30 18:02:34 -04:00
Wing Lian
48612f8376
cleanup from pr feedback
2023-05-30 09:56:30 -04:00
Wing Lian
6ef96f569b
default to qlora support, make gptq specific image
2023-05-29 20:34:41 -04:00
Wing Lian
e43bcc6c4f
move CUDA_VERSION_BNB arg inside of stage build scope
2023-05-29 13:30:15 -04:00
Wing Lian
21f17cca69
bnb fixes
2023-05-29 00:06:35 -04:00
Wing Lian
312b8d51d6
update docker to compile latest bnb to properly support qlora
2023-05-27 12:36:53 -04:00
Wing Lian
f950a881e1
cuda, pytorch matrix for base builds
2023-05-22 12:12:08 -04:00
Wing Lian
990bec63e6
docker layer caching, build w axolotl from base build
2023-05-07 17:16:05 -04:00
Wing Lian
2734e3f1a2
build base separately
fix arg order for image
fix dockerfile var escaping
move args around
2023-05-07 10:56:12 -04:00