update transformers to 4.53.1 (#2844) [skip ci]

* update transformers to 4.53.0

* remove attention_mask from signature columns if using packing

* remove attention_mask column from dataloader

* update signature of flash attn forward for ring attn patch

* fix FSDP

* patch ring-flash-attn with upstream signature fix

* fix patch indentation level

* fix the patch

* add batch flattening smoke test with loss check that works in older transformers

* fix patch

* don't drop attention mask for flex

* more fixes

* patch create_causal_mask for packing w flex

* global torch manual_seed fixture

* tweak loss checks

* fix patch and use single batch for flex

* don't need to reload

* fix causal mask patch

* use transformers patch release

* make sure env var is string
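The fix above reflects a standard Python constraint: `os.environ` only accepts string values, so assigning a bool or int raises `TypeError`. A minimal sketch (the variable name is illustrative, not the one used in the commit):

```python
import os

# os.environ values must be str; passing a bool or int raises TypeError.
# "EXAMPLE_FLAG" is a hypothetical name for illustration only.
os.environ["EXAMPLE_FLAG"] = str(True)  # stores the string "True"
```
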

* make sure to drop attention mask for flex w packing for latest transformers patch release

* tweak loss

* guard on signature columns before removing attention mask
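The guard matters because the signature-column list may be `None` before the trainer populates it, and an unconditional `remove` raises if the key is absent. A sketch under those assumptions (function and argument names are hypothetical, not the commit's actual code):

```python
def drop_attention_mask(signature_columns):
    """Drop "attention_mask" only when the column list exists and contains it.

    signature_columns may be None before it has been populated, so guard
    before mutating. Names here are illustrative, not from the commit.
    """
    if signature_columns is not None and "attention_mask" in signature_columns:
        signature_columns.remove("attention_mask")
    return signature_columns
```
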

* bump loss

* set remove isn't chainable
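The pitfall behind this fix: `set.remove` mutates in place and returns `None`, so rebinding the result (`cols = cols.remove(x)`) silently replaces the set with `None`. A minimal illustration:

```python
cols = {"input_ids", "labels", "attention_mask"}

# Buggy pattern -- set.remove returns None, so this would rebind cols to None:
#   cols = cols.remove("attention_mask")

# Correct: mutate in place, then keep using the set.
# discard also avoids a KeyError when the element is absent.
cols.discard("attention_mask")
```
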

* skip slow mistral test in 2.5.1
Author: Wing Lian
Date: 2025-07-07 09:35:22 -04:00 (committed via GitHub)
Parent: 5a961ecadf
Commit: 69cd49a7aa
23 changed files with 449 additions and 32 deletions


@@ -114,7 +114,7 @@ extras_require = {
     "flash-attn": ["flash-attn==2.8.0.post2"],
     "ring-flash-attn": [
         "flash-attn==2.8.0.post2",
-        "ring-flash-attn>=0.1.4",
+        "ring-flash-attn>=0.1.5",
         "yunchang==0.6.0",
     ],
     "deepspeed": [