Commit Graph

49 Commits

Author SHA1 Message Date
Wing Lian
955cca41fc don't explicitly set cpu pytorch version (#1986)
use a constraint file
use min version of xformers
don't install autoawq with pytorch 2.5.0
debugging for errors
upgrade pip first
fix action yml
add back try/except
retry w/o constraint
use --no-build-isolation
show torch version
install setuptools and wheel
add back try/except
2024-10-21 19:50:50 -04:00
Wing Lian
e12a2130e9 first pass at pytorch 2.5.0 support (#1982)
* first pass at pytorch 2.5.0 support

* attempt to install causal_conv1d with mamba

* gracefully handle missing xformers

* fix import

* fix incorrect version, add 2.5.0

* increase tests timeout
2024-10-21 11:00:45 -04:00
Wing Lian
d20b48a61e only install torchao for torch versions >= 2.4.0 (#1963) 2024-10-12 20:53:48 -04:00
Wing Lian
e8d3da0081 upgrade pytorch from 2.4.0 => 2.4.1 (#1950)
* upgrade pytorch from 2.4.0 => 2.4.1

* update xformers for updated pytorch version

* handle xformers version case for torch==2.3.1
2024-10-09 11:53:56 -04:00
Wing Lian
dcbff16983 run nightly ci builds against upstream main (#1851)
* run nightly ci builds against upstream main

* add test badges

* run the multigpu tests against nightly main builds too
2024-08-22 13:10:54 -04:00
Wing Lian
22680913f3 Bump deepspeed 20240727 (#1790)
* pin deepspeed to 0.14.4 otherwise it doesn't play nice with trl

* Add test to import to try to trigger import dependencies
2024-07-27 10:24:11 -04:00
Wing Lian
e6b299dd79 bump flash attention to 2.6.2 (#1781) [skip ci] 2024-07-23 19:54:15 -04:00
Wing Lian
78e12f8ca5 add basic support for the optimi adamw optimizer (#1727)
* add support for optimi_adamw optimizer w kahan summation

* pydantic validator for optimi_adamw

* workaround for setting optimizer for fsdp

* make sure to install optimizer packages

* make sure to have parity for model parameters passed to optimizer

* add smoke test for optimi_adamw optimizer

* don't use foreach optimi by default
2024-07-14 19:12:57 -04:00
Wing Lian
98af5388ba bump flash attention 2.5.8 -> 2.6.1 (#1738)
* bump flash attention 2.5.8 -> 2.6.1

* use triton implementation of cross entropy from flash attn

* add smoke test for flash attn cross entropy patch

* fix args to xentropy.apply

* handle tuple from triton loss fn

* ensure the patch tests run independently

* use the wrapper already built into flash attn for cross entropy

* mark pytest as forked for patches

* use pytest xdist instead of forked, since cuda doesn't like forking

* limit to 1 process and use dist loadfile for pytest

* change up pytest for fixture to reload transformers w monkeypathc
2024-07-14 19:11:31 -04:00
Akshaya Shanbhogue
4512738a73 bump xformers to 0.0.27 (#1740)
* Update requirements.txt

Preserve compatibility with torch 2.3.1. [Reference](https://github.com/facebookresearch/xformers/issues/1052)

* fix setup.py to extract the current xformers dep from requirements for replacement

* xformers 0.0.27 wheels not built for torch 2.3.0

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-07-13 14:04:31 -04:00
Wing Lian
851ccb1237 bump deepspeed for fix for grad norm compute putting tensors on different devices (#1699) 2024-06-09 17:13:28 -04:00
Wing Lian
ef223519c9 update deps (#1663) [skip ci]
* update deps and tweak logic so axolotl is pip installable

* use vcs url format

* using dependency_links isn't supported per docs)
2024-05-28 11:23:34 -04:00
Wing Lian
039e2a0370 bump versions of deps (#1621)
* bump versions of deps

* bump transformers too

* fix xformers deps and include s3fs install
2024-05-15 13:27:44 -04:00
Wing Lian
05b398a072 fix some of the edge cases for Jamba (#1452)
* fix some of the edge cases for Jamba

* update requirements for jamba
2024-03-29 02:38:02 -04:00
Wing Lian
dd449c5cd8 support galore once upstreamed into transformers (#1409)
* support galore once upstreamed into transformers

* update module name for llama in readme and fix typing for all linear

* bump trl for deprecation fixes from newer transformers

* include galore as an extra and install in docker image

* fix optim_args type

* fix optim_args

* update dependencies for galore

* add galore to cicd dockerfile
2024-03-19 09:26:35 -04:00
Wing Lian
58b0d4b0d8 update flash attention for gemma support: (#1368) 2024-03-06 10:08:54 -05:00
NanoCode012
5be8b555a0 fix: checkpoint saving with deepspeed (#1321) 2024-02-27 15:46:44 +09:00
Maxime
16482796b0 add lion-pytorch optimizer (#1299) [skip ci]
* add lion-pytorch optimizer

* update pydantic to support lion optimizer

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-02-26 18:45:14 -05:00
Wing Lian
5894f0e57e make mlflow optional (#1317)
* make mlflow optional

* fix xformers

don't patch swiglu if xformers not working
fix the check for xformers swiglu

* fix install of xformers with extra index url for docker builds

* fix docker build arg quoting
2024-02-26 11:41:33 -05:00
Maxime
fac2d98c26 Add MPS support (#1264)
* add mps support

* linter stuff

* CI fixes

* install packaging for various tests

* Update setup.py

* Revert "install packaging for various tests"

This reverts commit 980e7aa44d.

* Revert "CI fixes"

This reverts commit 4609e3b166.

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-02-12 08:30:32 -05:00
Wing Lian
8f2b591baf set torch version to what is installed during axolotl install (#1234) 2024-01-31 08:47:34 -05:00
Wing Lian
a01b998c0f Update deps 202401 (#1204) [skip ci]
* update deps

* xformers fix too
2024-01-25 10:11:49 -05:00
Wing Lian
1427d5b502 prepare for release 0.4.0 (#1175)
Some checks failed
publish pypi / Upload release to PyPI (push) Has been cancelled
2024-01-24 15:00:28 -05:00
Wing Lian
8a49309489 upgrade deepspeed to 0.13.1 for mixtral fixes (#1189) [skip ci]
* upgrade deepspeed to 0.13.1 for mixtral fixes

* move deepspeed-kernels install to setup.py
2024-01-24 14:26:40 -05:00
NanoCode012
d69ba2b0b7 fix: warn user to install mamba_ssm package (#1019) 2024-01-10 02:50:56 -05:00
Casper
9be92d1448 Separate AutoGPTQ dep to pip install -e .[auto-gptq] (#1077)
* Separate AutoGPTQ dep to `pip install -e .[auto-gptq]`

* Fix code review
2024-01-09 23:39:25 +01:00
Wing Lian
732851f105 Phi2 rewrite (#1058)
* restore to current phi modeling code from phi-2

* enable gradient checkpointing

* don't cast everything to float32 all the time

* gradient checkpointing for phi2 ParallelBlock module too

* fix enabling flash attn for phi2

* add comment about import

* fix phi2 example

* fix model type check for tokenizer

* revert float32 -> bf16 casting changes

* support fused dense flash attn

* fix the repo for flash-attn

* add package name for subdir pkg

* fix the data collator when not using sample packing

* install packaging for pytests in ci

* also fix setup to not install flash attn fused dense subdir if not extras

* split out the fused-dense-lib in extra requires

* don't train w group_by_length for phi

* update integration test to use phi2

* set max steps and save steps for phi e2e tests

* try to workaround ssave issue in ci

* skip phi2 e2e test for now
2024-01-08 14:04:22 -05:00
Wing Lian
161bcb6517 Dockerfile torch fix (#987)
* add torch to requirements.txt at build time to force version to stick

* fix xformers check

* better handling of xformers based on installed torch version

* fix for ci w/o torch
2023-12-21 09:38:20 -05:00
Wing Lian
40a6362c92 support for mamba (#915)
* support for mamba

* more mamba fixes

* use fork for mamba kwargs fix

* grad checkpointing doesn't work

* fix extras for mamaba

* mamba loss fix

* use fp32 and remove verbose logging

* mamba fixes

* fix collator for mamba

* set model_type on training_args

* don't save safetensors for mamba

* update mamba config to disable safetensor checkpooints, install for tests

* no evals for mamba tests

* handle save_pretrained

* handle unused safetensors arg
2023-12-09 12:10:41 -05:00
Casper
06ae39200b Pin flash-attn to 2.3.3 (#919) 2023-12-07 07:36:52 +01:00
Casper
a045db0214 Mistral: Sliding Window Attention with Flash Attention and Sample Packing (#732)
* Implement Mistral FA + SWA + Sample Packing

* Handle unbroadcastable tensor

* chore: lint

* Simplify _prepare_decoder_attention_mask

* Uncomment window size

* Upgrade flash-attn to minimum of 2.3.0 to support SWA

* Add original condition to avoid error during inference

* chore: lint

* use torchscript to prevent oom

* chore: pylint

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-10-16 15:13:46 -04:00
Wing Lian
7f2027d93f tweak for xformers install w pytorch 2.1.0 (#727) 2023-10-13 15:21:17 -04:00
Wing Lian
8d288a2ad4 workaround for installing xformers w torch 2.1.0 (#725) 2023-10-13 11:19:30 -04:00
Wing Lian
c25ba7939b update README w deepspeed info (#605) 2023-09-22 00:15:52 -04:00
Wing Lian
772cd870d4 fix the sed command to replace the version w the tag
Some checks failed
pre-commit / pre-commit (push) Has been cancelled
publish pypi / Upload release to PyPI (push) Has been cancelled
PyTest / test (3.10) (push) Has been cancelled
PyTest / test (3.9) (push) Has been cancelled
2023-09-11 13:44:19 -04:00
Wing Lian
6c5fbe6223 add long_description for pypi push (#555) 2023-09-11 13:34:29 -04:00
Wing Lian
34c0a86a11 update readme to point to direct link to runpod template, cleanup install instrucitons (#532)
* update readme to point to direct link to runpod template, cleanup install instrucitons

* default install flash-attn and auto-gptq now too

* update readme w flash-attn extra

* fix version in setup
2023-09-08 11:58:54 -04:00
Wing Lian
3355706e22 Add support for GPTQ using native transformers/peft (#468)
* auto gptq support

* more tweaks and add yml

* remove old gptq docker

* don't need explicit peft install for tests

* fix setup.py to use extra index url

install torch for tests
fix cuda version for autogptq index
set torch in requirements so that it installs properly
move gptq install around to work with github cicd

* gptq doesn't play well with sample packing

* address pr feedback

* remove torch install for now

* set quantization_config from model config

* Fix the implementation for getting quant config from model config
2023-09-05 12:43:22 -04:00
Wing Lian
96deb6bd67 recast loralayer, norm, lmhead + embed token weights per original qlora (#393)
* recast loralayer, norm, lmhead + embed token weights per original qlora

* try again for the fix

* refactor torch dtype picking

* linter fixes

* missing import for LoraLayer

* fix install for tests now that peft is involved
2023-08-21 18:41:12 -04:00
mhenrichsen
cf6654769a flash attn pip install (#426)
* flash attn pip

* add packaging

* add packaging to apt get

* install flash attn in dockerfile

* remove unused whls

* add wheel

* clean up pr

fix packaging requirement for ci
upgrade pip for ci
skip build isolation for requiremnents to get flash-attn working
install flash-attn seperately

* install wheel for ci

* no flash-attn for basic cicd

* install flash-attn as pip extras

---------

Co-authored-by: Ubuntu <mgh@mgh-vm.wsyvwcia0jxedeyrchqg425tpb.ax.internal.cloudapp.net>
Co-authored-by: mhenrichsen <some_email@hey.com>
Co-authored-by: Mads Henrichsen <mads@BrbartiendeMads.lan>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-08-18 19:00:27 -04:00
Wing Lian
bbc5bc5791 Merge pull request #108 from OpenAccess-AI-Collective/docker-gptq
default to qlora support, make gptq specific image
2023-05-30 15:07:04 -04:00
NanoCode012
37293dce07 Apply isort then black 2023-05-31 02:53:53 +09:00
NanoCode012
8b617cc7f6 Lint setup.py 2023-05-31 02:53:53 +09:00
Wing Lian
6ef96f569b default to qlora support, make gptq specific image 2023-05-29 20:34:41 -04:00
Wing Lian
2bc1a5bde1 black formatting 2023-05-10 16:01:08 -04:00
Wing Lian
990bec63e6 docker layer caching, build w axolotl from base build 2023-05-07 17:16:05 -04:00
Wing Lian
f50de1b1cb handle empty lines 2023-04-19 08:03:34 -04:00
Wing Lian
4131183115 fix install to work with latest alpaca lora 4bit 2023-04-17 12:45:12 -04:00
Wing Lian
77fca25f1b 4bit quantized support (wip) 2023-04-17 11:37:39 -04:00