Files

Wing Lian 22810c97b7 use warmup_ratio as a better default than warmup steps since it's data dependent (#2897) [skip ci]

* use warmup_ratio as a better default than warmup steps since it's data dependent

* replace remainder of warmup_steps

2025-07-30 06:44:06 -04:00
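The point about data dependence in the commit message above can be illustrated with a small sketch. The helper names here (`total_optimizer_steps`, `warmup_steps_from_ratio`) are hypothetical and not part of this repository, and rounding up is an assumption about how a trainer converts the ratio into a step count. The sketch only shows that a fixed ratio scales with dataset size, whereas a fixed step count does not.

```python
import math


def total_optimizer_steps(num_samples: int, micro_batch_size: int,
                          grad_accum_steps: int, epochs: int) -> int:
    """Optimizer steps for a single-device run (illustrative helper)."""
    samples_per_step = micro_batch_size * grad_accum_steps
    steps_per_epoch = math.ceil(num_samples / samples_per_step)
    return steps_per_epoch * epochs


def warmup_steps_from_ratio(total_steps: int, warmup_ratio: float) -> int:
    """Warmup steps implied by a ratio of the total (rounding up is assumed)."""
    if not 0.0 <= warmup_ratio <= 1.0:
        raise ValueError("warmup_ratio must be between 0 and 1")
    return math.ceil(total_steps * warmup_ratio)


# Same ratio, two dataset sizes: the warmup length follows the data.
small_run = total_optimizer_steps(800, micro_batch_size=2, grad_accum_steps=4, epochs=1)
large_run = total_optimizer_steps(80_000, micro_batch_size=2, grad_accum_steps=4, epochs=1)
small_warmup = warmup_steps_from_ratio(small_run, 0.1)
large_warmup = warmup_steps_from_ratio(large_run, 0.1)
print(small_run, small_warmup)
print(large_run, large_warmup)
```

With a hard-coded step count, the small run could spend most of its training in warmup while the large run would barely warm up at all; a ratio keeps the proportion constant across both.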

lora.yml               2025-07-30 06:44:06 -04:00
qlora.yml              2025-07-30 06:44:06 -04:00
qwen2-moe-lora.yaml    2025-07-30 06:44:06 -04:00
qwen2-moe-qlora.yaml   2025-07-30 06:44:06 -04:00
README.md              2025-07-12 11:39:51 -04:00

Qwen

TODO

Qwen2 MoE

✅ multipack
✅ qwen2_moe 4-bit QLoRA
✅ qwen2_moe 16-bit LoRA
❓ qwen2_moe 8-bit LoRA