bump hf deps (#2735) [skip ci]

* bump hf deps

* upgrade liger-kernel too

* install cce from fork for transformers fix

* fix reference to vocab size in gemma3 patch

* use padding_idx instead of pad_token_id

* remove fixed gemma3 patch

* use updated cce fork

* fix local mllama cce patches w docstring

* add test for multipack with trainer setup and fix trainer for trainer refactor upstream

* bump modal version

* guard for iterable datasetS

* mllama model arch layout changed in latest transformers

* fix batch sampler with drop_last

* fix: address upstream vlm changes for lora

* fix: update references to old lora target path

* fix: remove mllama fa2 patch due to upstream fix

* fix: lora kernel patch path for multimodal models

* fix: removed mllama from quarto

* run test for came optim on 2.6.0+

* fix fsdp2 patch and remove deprecated patch

* make sure to set sequence_parallel_degree for grpo

* Add SP test for GRPO

* add sp to grpo config for trainer

* use reward_funcs as kwarg to grpo trainer

* fix the comprehension for reward funcs

* reward funcs already passed in as args

* init sp_group right before training

* fix check for adding models to SP context

* make sure to pass args to super

* upgrade deepspeed

* use updated trl and add reasoning flags for vllm

* patch the worker

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
This commit is contained in:
Wing Lian
2025-06-05 07:20:33 -07:00
committed by GitHub
parent 787880215b
commit c67910fa6f
33 changed files with 470 additions and 695 deletions

View File

@@ -6,20 +6,20 @@ triton>=3.0.0
mamba-ssm==1.2.0.post1
xformers>=0.0.23.post1
autoawq==0.2.7.post3
liger-kernel==0.5.9
liger-kernel==0.5.10
# END section
packaging==23.2
huggingface_hub==0.31.0
huggingface_hub==0.32.2
peft==0.15.2
transformers==4.51.3
transformers==4.52.3
tokenizers>=0.21.1
accelerate==1.6.0
datasets==3.5.1
deepspeed>=0.15.4
trl==0.17.0
hf_xet==1.1.0
accelerate==1.7.0
datasets==3.6.0
deepspeed>=0.17.0
trl==0.18.1
hf_xet==1.1.2
hqq==0.2.5
optimum==1.16.2