--extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/

# START section of dependencies that don't install on Darwin/MacOS
bitsandbytes==0.49.1
triton>=3.4.0
mamba-ssm==1.2.0.post1
xformers>=0.0.23.post1
liger-kernel==0.7.0
# END section

packaging==26.0
huggingface_hub>=1.1.7
peft>=0.18.1
tokenizers>=0.22.1
transformers==5.3.0
accelerate==1.13.0
datasets==4.5.0
deepspeed>=0.18.6,<0.19.0
trl==0.29.0
hf_xet==1.3.2
kernels==0.12.2

fla-core==0.4.1
flash-linear-attention==0.4.1

trackio>=0.16.1
typing-extensions>=4.15.0

optimum==1.16.2
hf_transfer
sentencepiece
gradio>=6.2.0,<7.0

modal==1.3.0.post1
pydantic>=2.10.6
addict
fire
PyYAML>=6.0
requests
wandb
einops
colorama
numba>=0.61.2
numpy>=2.2.6

# qlora things
evaluate==0.4.1
scipy
nvidia-ml-py==12.560.30
art
tensorboard
python-dotenv==1.0.1

# remote filesystems
s3fs>=2024.5.0
gcsfs>=2025.3.0
adlfs>=2024.5.0
ocifs==1.3.2

zstandard==0.22.0
fastcore

# lm eval harness
lm_eval==0.4.7
langdetect==1.0.9
immutabledict==4.2.0
antlr4-python3-runtime==4.13.2

torchao==0.16.0
openenv-core==0.1.0
schedulefree==1.4.1

axolotl-contribs-lgpl==0.0.7
axolotl-contribs-mit==0.0.6
# telemetry
posthog==6.7.11

mistral-common==1.10.0