Wing Lian | 830e9f7eaf | 2026-03-16 23:47:00 -04:00
automatically enable tf32 if supported (#3473) [skip ci]
* automatically enable tf32 if supported
* update fixtures
* handle only when True
* Address CR comments
* address readability feedback from PR comment
* simplify
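
For context on the tf32 change above: TF32 is a TensorFloat-32 mode available on NVIDIA Ampere (compute capability 8.0) and newer GPUs. A minimal sketch of how such auto-detection might look in PyTorch — illustrative only, not the actual patch:

```python
import torch

def maybe_enable_tf32() -> None:
    # TF32 requires an NVIDIA GPU with compute capability >= 8.0 (Ampere+).
    if torch.cuda.is_available() and torch.cuda.get_device_capability()[0] >= 8:
        torch.backends.cuda.matmul.allow_tf32 = True  # TF32 for matmul kernels
        torch.backends.cudnn.allow_tf32 = True        # TF32 for cuDNN convolutions
```
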
VED | a806704e94 | 2026-03-15 22:10:30 -04:00
moe quant patch for merge mismatch (#3483)
* moe quant patch for merge mismatch
* lint
* revert test + fix moe patch
* comment fixes
* e2e tests
* mismatch fix tested
* mismatch fix with vLLM compatibility + test
* comment lint
* fix: missing os import, duplicate no-op
* chore: simplify comments
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
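
The "merge mismatch" fixes above concern merged weights whose keys or shapes disagree with the quantized MoE base model. A hedged sketch of the kind of pre-merge check involved; find_shape_mismatches is a hypothetical helper, not axolotl's API:

```python
import torch

def find_shape_mismatches(base_sd: dict, merged_sd: dict) -> list[str]:
    # Report parameter keys whose shapes differ between the base model's
    # state dict and the merged one, before attempting to load.
    problems = []
    for key, tensor in merged_sd.items():
        base = base_sd.get(key)
        if base is not None and base.shape != tensor.shape:
            problems.append(
                f"{key}: base {tuple(base.shape)} vs merged {tuple(tensor.shape)}"
            )
    return problems
```
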
NanoCode012 | 6c8c73e5a4 | 2026-03-06 09:19:05 -05:00
fix(validation): add validation for lora target linear with quantize experts (#3461)
* fix: add validation for lora target linear with quantize experts
* chore: fix lint
* chore: comment
* fix: missing link on readme
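
The validation in #3461 guards an incompatible config combination. A minimal sketch of the idea, assuming hypothetical option names (lora_target_linear, quantize_experts — the real axolotl fields may differ):

```python
def validate_lora_quant_experts(cfg) -> None:
    # Targeting all linear layers with LoRA while also quantizing the MoE
    # experts is an unsupported combination; fail fast at config time.
    if getattr(cfg, "lora_target_linear", False) and getattr(cfg, "quantize_experts", False):
        raise ValueError(
            "lora_target_linear is incompatible with quantize_experts; "
            "specify the LoRA target modules explicitly instead."
        )
```
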
NanoCode012 | 945c8aeb10 | 2026-03-03 10:06:23 -05:00
Fix: quantize and target moe layers in transformers v5 for adapters and many misc fixes (#3439)
* fix: saving clones state dict
* fix: apply fix for only CP mode
* fix: add dropout check when using lora target param
* fix: re-add patch from transformers PR #39866
* feat: add moe quant to test by ved
* fix: match target param properly via endswith (see sketch below)
* fix: clear cache per param quant
* fix: attempt on-load quantize experts instead of post-load
* fix: attempt disable async load
* chore: add log
* chore: adjust log
* fix: remove cuda alloc for moe and enable async load
* chore: remove leftover logs
* chore: add extra empty cache
* fix(doc): clarify support
* fix: handle fsdp2 for paramwrapper dtensor
* feat: attempt to quant experts in 8bit mode too
* feat: attempt to release bf16 experts from vram
* feat: upgrade cce
* fix: fsdp2 init_sharded_param load int8/uint4 dtensor as require_grad=true on init
* fix: remove unnecessary gc and empty cache
* Revert "fix: remove unnecessary gc and empty cache" (reverts commit 1d54518990)
* fix: do not call full_tensor on non-dtensors (see sketch below)
* fix: attempt to address fsdp2 with quant exp high loss
* fix: attempt lora quant experts wrong dim
* fix: ensure require_grad patch applied for lora 8bit
* fix: attempt lora 8bit fsdp2
* fix: attribute access on save for lora 8bit fsdp2
* fix: wrong weight attrib access
* chore(refactor): add config, re-arrange position of patches, clean comments
* feat: add example docs
* chore: cherry pick trinity fixes from PR 3399
* chore: comments refactor; add guards
* fix: guard using wrong key
* fix: mamba save does not accept main process param
* fix: guard prevent double hook
* fix: move gc to upper scope
* chore: add comment on proxy forward patch
* fix: add comment to clarify
* feat: add test idempotency
* fix: AttributeError: `e_score_correction_bias` is not an nn.Parameter
* fix: AttributeError: 'NoneType' object has no attribute 'to'
* fix: update docs on cpu_ram_efficient_loading
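
On the "do not call full_tensor on non-dtensors" item above: full_tensor() exists only on DTensor shards, so plain tensors must pass through untouched. A sketch of the guard, assuming a recent PyTorch where DTensor is public under torch.distributed.tensor:

```python
import torch
from torch.distributed.tensor import DTensor

def to_full_tensor(t: torch.Tensor) -> torch.Tensor:
    # Only sharded DTensors need their shards gathered across the device
    # mesh; calling full_tensor() on a regular tensor would raise
    # AttributeError, hence the isinstance guard.
    return t.full_tensor() if isinstance(t, DTensor) else t
```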
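
And on the endswith matching item: comparing fully-qualified parameter names against configured target suffixes with endswith avoids false positives from loose substring matches. Illustrative only, with hypothetical names:

```python
def param_matches_targets(param_name: str, targets: list[str]) -> bool:
    # "model.layers.0.experts.3.gate_proj.weight" should match the target
    # "gate_proj.weight" but not "up_proj.weight"; endswith gives an exact
    # suffix match rather than matching anywhere in the name.
    return any(param_name.endswith(t) for t in targets)

assert param_matches_targets(
    "model.layers.0.experts.3.gate_proj.weight", ["gate_proj.weight"]
)
```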