Files
axolotl/requirements.txt
Wing Lian 323da791eb bump transformers to 5.5.4 and trl to latest 1.1.0 (#3603)
* bump transformers to 5.5.4 and trl to latest 1.1.0

* more upgrades

* update peft too

* adapt lora_merge to peft 0.19 layer config API

PEFT 0.19 requires a LoraConfig object on Linear/ParamWrapper/Conv
layer constructors and moved use_rslora, use_dora, fan_in_fan_out,
lora_dropout, and lora_bias into that config. Build the config
per branch in _build_peft_layer_and_get_delta so the merge utility
works with the upgraded peft.

* allow lora_dropout on mixed attention+MoE configs under peft 0.19

PEFT 0.19's convert_peft_config_for_transformers auto-remaps old MoE
target_modules (w1/w2/w3 on Mixtral, etc.) into target_parameters for
transformers v5's fused 3D expert Parameters. Those targets get wrapped
with ParamWrapper, which rejects lora_dropout != 0 because the 3D
einsum can't factor dropout out of lora_B(lora_A(dropout(x))).

Monkeypatch ParamWrapper.__init__ to internally use a copy of the
LoraConfig with lora_dropout=0, so its dropout slot becomes nn.Identity
while the shared config still delivers real dropout to sibling Linear
LoRA layers (attention q/k/v/o). A probe runs the same conversion on a
deep copy to detect the situation and emit a warning before patching.
2026-04-15 09:27:03 -04:00

79 lines
1.2 KiB
Plaintext

--extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
# START section of dependencies that don't install on Darwin/MacOS
bitsandbytes==0.49.1
triton>=3.4.0
mamba-ssm==1.2.0.post1
xformers>=0.0.23.post1
liger-kernel==0.7.0
# END section
packaging==26.0
huggingface_hub>=1.1.7
peft>=0.19.0,<0.20.0
tokenizers>=0.22.1
transformers==5.5.4
accelerate==1.13.0
datasets>=4.8.4,<4.9.0
deepspeed>=0.18.6,<0.19.0
trl==1.1.0
hf_xet==1.4.3
kernels==0.13.0
fla-core==0.4.1
flash-linear-attention==0.4.1
trackio>=0.16.1
typing-extensions>=4.15.0
optimum==1.16.2
hf_transfer
sentencepiece
gradio>=6.2.0,<7.0
modal==1.3.0.post1
pydantic>=2.10.6
addict
fire
PyYAML>=6.0
requests
wandb
einops
colorama
numba>=0.61.2
numpy>=2.2.6
# qlora things
evaluate==0.4.1
scipy
nvidia-ml-py==12.560.30
art
tensorboard
python-dotenv==1.0.1
# remote filesystems
s3fs>=2024.5.0
gcsfs>=2025.3.0
adlfs>=2024.5.0
ocifs==1.3.2
zstandard==0.22.0
fastcore
# lm eval harness
lm_eval==0.4.11
langdetect==1.0.9
immutabledict==4.2.0
antlr4-python3-runtime==4.13.2
torchao==0.17.0
openenv-core==0.1.0
schedulefree==1.4.1
axolotl-contribs-lgpl==0.0.7
axolotl-contribs-mit==0.0.6
# telemetry
posthog==6.7.11
mistral-common==1.11.0