Commit Graph

93 Commits

Author SHA1 Message Date
Wing Lian
ece0211996 Agnostic cloud gpu docker image and Jupyter lab (#1097) 2024-01-15 22:37:54 -05:00
Wing Lian
0abf4d6504 update PR template so we can capture twitter or discord handles (#1121) [skip ci]
* update PR template so we can capture twitter or discord handles [skip ci]

* ensure that the PR template is in the correct place
2024-01-14 16:19:01 -05:00
Hamel Husain
2dc431078c Add link on README to Docker Debugging (#1107)
* add docker debug

* Update docs/debugging.md

Co-authored-by: Wing Lian <wing.lian@gmail.com>

* explain editable install

* explain editable install

* upload new video

* add link to README

* Update README.md

* Update README.md

* chore: lint

* make sure to lint markdown too

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-01-12 08:51:35 -05:00
Mark Saroufim
44ba616da2 Fix broken pypi.yml (#1099) [skip ci] 2024-01-11 12:35:31 -05:00
Wing Lian
6c19e9302a add python 3.11 to the matrix for unit tests (#1085) [skip ci] 2024-01-10 13:02:01 -05:00
Wing Lian
9032e610b1 use tags again for test image, only run docker e2e after pre-commit checks (#1081) 2024-01-10 09:04:56 -05:00
Wing Lian
ec02b7cc4e Update FUNDING.yml [skip ci] 2024-01-09 22:15:27 -05:00
Wing Lian
3b4c646f87 Update FUNDING.yml with bitcoin (#1079) [skip ci] 2024-01-09 21:56:52 -05:00
Wing Lian
788649fe95 attempt to also run e2e tests that needs gpus (#1070)
* attempt to also run e2e tests that needs gpus

* fix stray quote

* checkout specific github ref

* dockerfile for tests with proper checkout

ensure wandb is disabled for docker pytests
clear wandb env after testing
clear wandb env after testing
make sure to provide a default val for pop
try skipping wandb validation tests
explicitly disable wandb in the e2e tests
explicitly report_to None to see if that fixes the docker e2e tests
split gpu from non-gpu unit tests
skip bf16 check in test for now
build docker w/o cache since it uses branch name ref
revert some changes now that caching is fixed
skip bf16 check if on gpu w support

* pytest skip for auto-gptq requirements

* skip mamba tests for now, split multipack and non packed lora llama tests

* split tests that use monkeypatches

* fix relative import for prev commit

* move other tests using monkeypatches to the correct run
2024-01-09 21:23:23 -05:00
Wing Lian
7f381750d9 Update FUNDING.yml for Kofi link (#1067) 2024-01-08 19:26:51 -05:00
Hamel Husain
9ca358b671 Simplify Docker Unit Test CI (#1055) [skip ci]
* Update tests-docker.yml

* Update tests-docker.yml

* run ci tests on ci yaml updates

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-01-06 08:20:33 -05:00
JinK
553c80f79a streaming multipack for pretraining dataset (#959)
* [Feat] streaming multipack

* WIP make continued pretraining work w multipack

* fix up hardcoding, lint

* fix dict check

* update test for updated pretraining multipack code

* fix hardcoded data collator fix for multipack pretraining

* fix the collator to be the max length for multipack pretraining

* don't bother with latest tag for test

* cleanup docker build/test

---------

Co-authored-by: jinwonkim93@github.com <jinwonkim>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-01-05 22:13:21 -05:00
Hamel Husain
eb4c99431b Update tests-docker.yml (#1052) [skip ci] 2024-01-05 14:26:18 -05:00
Wing Lian
bcc78d8fa3 bump transformers and update attention class map name (#1023)
* bump transformers and update attention class map name

* also run the tests in docker

* add mixtral e2e smoke test

* fix base name for docker image in test

* mixtral lora doesn't seem to work, at least check qlora

* add testcase for mixtral w sample packing

* check monkeypatch for flash attn multipack

* also run the e2e tests in docker

* use all gpus to run tests in docker ci

* use privileged mode too for docker w gpus

* rename the docker e2e actions for gh ci

* set privileged mode for docker and update mixtral model self attn check

* use fp16/bf16 for mixtral w fa2

* skip e2e tests on docker w gpus for now

* tests to validate mistral and mixtral patches

* fix rel import
2024-01-03 12:11:04 -08:00
Wing Lian
37820f6540 support for cuda 12.1 (#989) 2023-12-22 11:08:22 -05:00
Hamel Husain
2e61dc3180 Add tests to Docker (#993) 2023-12-22 06:37:20 -08:00
Hamel Husain
62ba1609b6 bump actions versions 2023-12-21 08:54:08 -08:00
Wing Lian
161bcb6517 Dockerfile torch fix (#987)
* add torch to requirements.txt at build time to force version to stick

* fix xformers check

* better handling of xformers based on installed torch version

* fix for ci w/o torch
2023-12-21 09:38:20 -05:00
Wing Lian
40a6362c92 support for mamba (#915)
* support for mamba

* more mamba fixes

* use fork for mamba kwargs fix

* grad checkpointing doesn't work

* fix extras for mamba

* mamba loss fix

* use fp32 and remove verbose logging

* mamba fixes

* fix collator for mamba

* set model_type on training_args

* don't save safetensors for mamba

* update mamba config to disable safetensor checkpoints, install for tests

* no evals for mamba tests

* handle save_pretrained

* handle unused safetensors arg
2023-12-09 12:10:41 -05:00
Wing Lian
0de1457189 try #2: pin hf transformers and accelerate to latest release, don't reinstall pytorch (#867)
* isolate torch from the requirements.txt

* fix typo for removed line ending

* pin transformers and accelerate to latest releases

* try w auto-gptq==0.5.1

* update README to remove manual peft install

* pin xformers to 0.0.22

* bump flash-attn to 2.3.3

* pin flash attn to exact version
2023-11-16 10:42:36 -05:00
Wing Lian
70157ccb8f add a latest tag for regular axolotl image, cleanup extraneous print statement (#746) 2023-10-19 12:28:29 -04:00
Wing Lian
2aa1f71464 fix pytorch 2.1.0 build, add multipack docs (#722) 2023-10-13 08:57:28 -04:00
Wing Lian
7f2618b5f4 add docker images for pytorch 2.1.0 (#697) 2023-10-07 12:23:31 -04:00
NanoCode012
90e0d673f7 Feat: Add config yaml to section for reprod in bug-report.yaml (#667)
* Update bug-report.yaml

* Update bug-report.yaml

* Update bug-report.yaml
2023-10-03 23:38:42 +09:00
Wing Lian
f4868d733c make sure we also run CI tests when requirements.txt changes (#663) 2023-10-02 08:43:40 -04:00
Wing Lian
5b0bc48fbc add mistral e2e tests (#649)
* mistral e2e tests

* make sure to enable flash attention for the e2e tests

* use latest transformers full sha

* uninstall first
2023-09-29 00:22:40 -04:00
Wing Lian
b6ab8aad62 Mistral flash attn packing (#646)
* add mistral monkeypatch

* add arg for decoder attention mask

* fix lint for duplicate code

* make sure to update transformers too

* tweak install for e2e

* move mistral patch to conditional
2023-09-27 18:41:00 -04:00
Maxime
5d931cc042 Only run tests when a change to python files is made (#614)
* Update tests.yml

* Update .github/workflows/tests.yml

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-09-20 22:02:04 -04:00
Wing Lian
62a774140b Fix for check with cfg and merge_lora (#600) 2023-09-18 21:14:32 -04:00
Wing Lian
1078d3eae7 E2e passing tests (#576)
* run e2e tests after all other checks have passed

* tweak tests so they get run on PRs or push to main

* change dependent action for checking

* one test workflow to rule them all

* no need for custom action, just use needs

* whoops, python version should be a string

* e2e tests can run on any available gpu
2023-09-15 01:03:49 -04:00
Wing Lian
24146733db E2e device cuda (#575)
* use torch.cuda.current_device() instead of local_rank

* ignore NVML errors for gpu stats

* llama lora packing e2e tests
2023-09-14 22:49:27 -04:00
Wing Lian
9218ebecd2 e2e testing (#574) 2023-09-14 21:56:11 -04:00
Wing Lian
772cd870d4 fix the sed command to replace the version w the tag
2023-09-11 13:44:19 -04:00
Wing Lian
bcbc9597e9 replace tags, build dist for pypi publish (#553)
* replace tags, build dist for pypi publish

* missing trailing comma
2023-09-11 13:25:41 -04:00
Wing Lian
20ed4c1f9e pypi on tag push (#552) 2023-09-11 10:33:42 -04:00
Wing Lian
c5dedb17ad remove with section, doesn't seem to work (#551) 2023-09-11 10:27:17 -04:00
Wing Lian
b56503d423 publish to pypi workflow on tagged release (#549) 2023-09-11 09:44:47 -04:00
Wing Lian
34c0a86a11 update readme to point to direct link to runpod template, cleanup install instructions (#532)
* update readme to point to direct link to runpod template, cleanup install instructions

* default install flash-attn and auto-gptq now too

* update readme w flash-attn extra

* fix version in setup
2023-09-08 11:58:54 -04:00
Wing Lian
3355706e22 Add support for GPTQ using native transformers/peft (#468)
* auto gptq support

* more tweaks and add yml

* remove old gptq docker

* don't need explicit peft install for tests

* fix setup.py to use extra index url

install torch for tests
fix cuda version for autogptq index
set torch in requirements so that it installs properly
move gptq install around to work with github cicd

* gptq doesn't play well with sample packing

* address pr feedback

* remove torch install for now

* set quantization_config from model config

* Fix the implementation for getting quant config from model config
2023-09-05 12:43:22 -04:00
Wing Lian
96deb6bd67 recast loralayer, norm, lmhead + embed token weights per original qlora (#393)
* recast loralayer, norm, lmhead + embed token weights per original qlora

* try again for the fix

* refactor torch dtype picking

* linter fixes

* missing import for LoraLayer

* fix install for tests now that peft is involved
2023-08-21 18:41:12 -04:00
mhenrichsen
cf6654769a flash attn pip install (#426)
* flash attn pip

* add packaging

* add packaging to apt get

* install flash attn in dockerfile

* remove unused whls

* add wheel

* clean up pr

fix packaging requirement for ci
upgrade pip for ci
skip build isolation for requirements to get flash-attn working
install flash-attn separately

* install wheel for ci

* no flash-attn for basic cicd

* install flash-attn as pip extras

---------

Co-authored-by: Ubuntu <mgh@mgh-vm.wsyvwcia0jxedeyrchqg425tpb.ax.internal.cloudapp.net>
Co-authored-by: mhenrichsen <some_email@hey.com>
Co-authored-by: Mads Henrichsen <mads@BrbartiendeMads.lan>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-08-18 19:00:27 -04:00
Wing Lian
d3d6fd6ae6 just resort to tags and use main-latest (#424) 2023-08-16 00:39:57 -04:00
NanoCode012
b7449a997f Fix(template): Inform to place stack trace to Issue (#417)
* Fix(template): Inform to place stack trace to Issue

* Update following suggestions

Co-authored-by: Wing Lian <wing.lian@gmail.com>

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-08-16 11:55:48 +09:00
Wing Lian
5f80b3560b use inputs for image rather than outputs for docker metadata (#420) 2023-08-15 18:26:59 -04:00
Wing Lian
7af816699e tag with latest as well for axolotl-runpod (#418)
* tag with latest as well for axolotl-runpod

* no dev branch for now
2023-08-15 15:30:41 -04:00
NanoCode012
7ad37cb6d7 Fix(template): Remove iPhone/android from Issue template (#407) 2023-08-15 22:32:51 +09:00
lightningRalf
31db0ecce4 add templates, CoC and contributing guide (#126)
* add templates, CoC and contributing guide

* Update .github/SECURITY.md

correct responsible person

Co-authored-by: Wing Lian <wing.lian@gmail.com>

* Update bug-report.yaml

axolotl version switch with axolotl branch-commit

* update CONTRIBUTING doc

* update reporting link

* linter fixes

* chore: fix linter

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-08-15 07:41:05 -04:00
Wing Lian
2dafa730ef Create FUNDING.yml 2023-08-13 00:30:34 -04:00
Wing Lian
918f1b0dfb revert previous change and build ax images w docker on gpu (#371) 2023-08-12 20:23:00 -04:00
Wing Lian
c3fde36ada attempt to run non-base docker builds on regular cpu hosts (#369) 2023-08-12 19:07:38 -04:00