Wing Lian
29fa4dedbb
Gemma4 fixes and profiler (#3591)
2026-04-10 16:46:17 -04:00
NanoCode012
7daf7d96f1
fix: regex for unfrozen language tower (#3586) [skip ci]
* fix: regex for unfrozen language tower
* fix: other leftover regex
2026-04-08 08:18:11 -07:00
VED
9e64c76326
qwen3.5 configs (#3554) [skip ci]
* qwen3.5 configs
* update shared experts readme
2026-04-01 09:19:31 -04:00
Wing Lian
e412370877
roundup_power2_divisions not needed with newer pytorch versions (#3540)
* roundup_power2_divisions not needed with newer pytorch versions
* remove typo
* update qwen3.5 moe 35b-a3b yaml for 5090
* more bug fixes
* fix tests to match updated trainer
* don't use fa2 for hooks test
* reset plugins on the instance
* retry download
* fix references to renamed axolotl_cfg property on trainer
* Fix ref to trainer cfg
2026-03-24 15:40:05 -04:00
Owen Arliawan
c57acef2c7
Qwen3.5-MoE example config with lora_target_modules regex (#3515) [skip ci]
* lora target modules with regex
* updates
* fsdp for non moe
* update wording
* chore: cleanup and lint
* chore: cleanup docs from merge
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
2026-03-20 16:52:46 +07:00
VED
113d275bd9
qwen docs + new config (#3499) [skip ci]
* qwen docs + new config
* docss lint
* simplify comments
* read me
* lint comments
* Update docs/multimodal.qmd
* Update docs/multimodal.qmd
* Update examples/qwen3.5/9b-fft-vision.yaml
* chore: fix link and incorrect points
---------
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
Co-authored-by: NanoCode012 <nano@axolotl.ai>
2026-03-20 16:13:34 +07:00
VED
c119382337
add: qwen 3.5 (#3442)
* add: qwen 3.5
* test for qwen , patch
* lint
* qwen3 fix on main
* Apply suggestions from code review
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* moe config
* config moe
* configs and chore
* Update examples/qwen3.5/122b-a10b-moe-qlora.yaml
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* Update examples/qwen3.5/35b-a3b-moe-qlora.yaml
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* chore for qwen + vlm patch
* chore lint
* qwen lint
* 3_5_moe
* Update examples/qwen3.5/README.md
---------
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2026-03-06 09:31:00 -05:00