Files
axolotl/requirements.txt
Wing Lian 98af5388ba bump flash attention 2.5.8 -> 2.6.1 (#1738)
* bump flash attention 2.5.8 -> 2.6.1

* use triton implementation of cross entropy from flash attn

* add smoke test for flash attn cross entropy patch

* fix args to xentropy.apply

* handle tuple from triton loss fn

* ensure the patch tests run independently

* use the wrapper already built into flash attn for cross entropy

* mark pytest as forked for patches

* use pytest xdist instead of forked, since cuda doesn't like forking

* limit to 1 process and use dist loadfile for pytest

* change up pytest for fixture to reload transformers w monkeypathc
2024-07-14 19:11:31 -04:00

46 lines
781 B
Plaintext

--extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
packaging==23.2
peft==0.11.1
transformers==4.42.3
tokenizers==0.19.1
bitsandbytes==0.43.1
accelerate==0.32.0
deepspeed @ git+https://github.com/microsoft/DeepSpeed.git@bc48371c5e1fb8fd70fc79285e66201dbb65679b
pydantic==2.6.3
addict
fire
PyYAML>=6.0
requests
datasets==2.19.1
flash-attn==2.6.1
sentencepiece
wandb
einops
xformers==0.0.27
optimum==1.16.2
hf_transfer
colorama
numba
numpy>=1.24.4
# qlora things
evaluate==0.4.1
scipy
scikit-learn==1.2.2
pynvml
art
fschat @ git+https://github.com/lm-sys/FastChat.git@27a05b04a35510afb1d767ae7e5990cbd278f8fe
gradio==3.50.2
tensorboard
python-dotenv==1.0.1
mamba-ssm==1.2.0.post1
# remote filesystems
s3fs
gcsfs
# adlfs
trl==0.9.6
zstandard==0.22.0
fastcore