* consolidate behavioud of routing in scattermoe kernels
* collect telemetry on best chosen autotuned kernel
* properly collect data
* Fix property name and get smem too
* handle issues raised by coderabbit
* add tests for parity before refactoring
* upgrade transformers==5.3.0 trl==0.29.0 kernels
* use latest deepspeed fixes
* use corect image for cleanup
* fix test outputs for tokenizer fixes upstream
* fix import:
* keep trl at 0.28.0
* handle updated API
* use latest trl since 0.28.0 doesn't work with latest transformers
* use trl experimental for pad to length
* monkeypatch trl with ORPOTrainer so liger doesn't croak
* upgrade accelerate
* more fixes
* move patch for orpotrainer
* load the imports later
* remove use_logits_to_keep
* fix loss_type arg as a list
* fetch hf cache from s3
* just manually download the missing model for now
* lint for pre-commit update
* a few more missing models on disk
* fix: loss_type internally now list
* fix: remove deprecated code and raise deprecate
* fix: remove unneeded blocklist
* fix: remove reliance on transformers api to find package available
* chore: refactor shim for less sideeffect
* fix: silent trl experimental warning
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* scattermoe lora support
* fsdp, bf16, dim fixes
* expert weights aren't needed in save for bwd since they are frozen
* use sonicmoe optim options
* update save model from upstream
* fixes per code review feedback and add tests
* revert removal of CP fix
* misc fixes
* upgrade liger to 0.3.1
* update docs and example
* skip duplicate code check
* Update src/axolotl/integrations/liger/args.py
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update README.md
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* add logging
* chore: lint
* add test case
* upgrade liger and transformers
* also upgrade accelerate
* use kwargs to support patch release
* make sure prepared path is empty for test
* use transfromers 4.46.1 since 4.46.2 breaks fsdp
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>