Grant Holmes (Ren)
850c1a5f8d
Add FSDP v2 swap memory support + QLoRA compatibility fixes (#3167)
Co-authored-by: salman <salman.mohammadi@outlook.com>
2025-09-26 10:23:59 +01:00
Dan Saunders
79ddaebe9a
Add ruff, remove black, isort, flake8, pylint (#3092)
* black, isort, flake8 -> ruff
* remove unused
* add back needed import
* fix
2025-08-23 23:37:33 -04:00
salman
294c7fe7a6
Distributed/ND-Parallel (#2977)
2025-07-31 15:25:02 -04:00
Wing Lian
0ff2f172ef
Act offload lora fix (#2928) [skip ci]
* fix activation offloading with lora
* update w e2e test
* add docs for error
2025-07-24 16:10:04 -04:00
Wing Lian
af8d257aa2
make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]
* make pad_to_sequence_len default to the same value as sample_packing
* remove duplicate validation
* fix test
* update description meta
Co-authored-by: NanoCode012 <nano@axolotl.ai>
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
2025-07-21 11:40:56 -04:00
Wing Lian
36cbe13d18
activation offloading with cuda streams doesn't work with LoRA (#2927)
2025-07-16 11:59:20 -04:00
Wing Lian
e581c15d40
refactor dupes from merge/rebase (#2919) [skip ci]
2025-07-14 10:05:26 -04:00
Wing Lian
af92151a7b
FSDP2 fix validation and add tests (#2910)
* fix validation and add tests
* remove debugging and add more tests
* remove migrate_fsdp
2025-07-14 09:25:44 -04:00