bump hf deps (#2735) [skip ci]

* bump hf deps
* upgrade liger-kernel too
* install cce from fork for transformers fix
* fix reference to vocab size in gemma3 patch
* use padding_idx instead of pad_token_id
* remove fixed gemma3 patch
* use updated cce fork
* fix local mllama cce patches w docstring
* add test for multipack with trainer setup and fix trainer for trainer refactor upstream
* bump modal version
* guard for iterable datasetS
* mllama model arch layout changed in latest transformers
* fix batch sampler with drop_last
* fix: address upstream vlm changes for lora
* fix: update references to old lora target path
* fix: remove mllama fa2 patch due to upstream fix
* fix: lora kernel patch path for multimodal models
* fix: removed mllama from quarto
* run test for came optim on 2.6.0+
* fix fsdp2 patch and remove deprecated patch
* make sure to set sequence_parallel_degree for grpo
* Add SP test for GRPO
* add sp to grpo config for trainer
* use reward_funcs as kwarg to grpo trainer
* fix the comprehension for reward funcs
* reward funcs already passed in as args
* init sp_group right before training
* fix check for adding models to SP context
* make sure to pass args to super
* upgrade deepspeed
* use updated trl and add reasoning flags for vllm
* patch the worker

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
2025-06-05 07:20:33 -07:00
parent 787880215b
commit c67910fa6f
33 changed files with 470 additions and 695 deletions
--- a/requirements.txt
+++ b/requirements.txt
@@ -6,20 +6,20 @@ triton>=3.0.0
 mamba-ssm==1.2.0.post1
 xformers>=0.0.23.post1
 autoawq==0.2.7.post3
-liger-kernel==0.5.9
+liger-kernel==0.5.10
 # END section

 packaging==23.2

-huggingface_hub==0.31.0
+huggingface_hub==0.32.2
 peft==0.15.2
-transformers==4.51.3
+transformers==4.52.3
 tokenizers>=0.21.1
-accelerate==1.6.0
-datasets==3.5.1
-deepspeed>=0.15.4
-trl==0.17.0
-hf_xet==1.1.0
+accelerate==1.7.0
+datasets==3.6.0
+deepspeed>=0.17.0
+trl==0.18.1
+hf_xet==1.1.2
 hqq==0.2.5

 optimum==1.16.2
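
Since this hunk moves several exact pins at once, it can be handy to check a local environment against them after reinstalling. The following is a minimal sketch, not part of the commit: the `PINS` table and the `find_mismatches` helper are hypothetical names introduced for illustration, and the table only copies the `==` pins shown in the hunk above (the `deepspeed>=0.17.0` lower bound is left out because it is not an exact pin).

```python
from importlib.metadata import PackageNotFoundError, version

# Exact pins copied from the requirements.txt hunk above.
# (Hypothetical table for illustration; deepspeed is a ">=" bound and is omitted.)
PINS = {
    "liger-kernel": "0.5.10",
    "huggingface_hub": "0.32.2",
    "transformers": "4.52.3",
    "accelerate": "1.7.0",
    "datasets": "3.6.0",
    "trl": "0.18.1",
    "hf_xet": "1.1.2",
}


def find_mismatches(pins, lookup=version):
    """Return {package: (expected, installed_or_None)} for every pin not met."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINS).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

The `lookup` parameter defaults to `importlib.metadata.version` and exists only so the check can be exercised without the packages installed.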