Wing Lian
68f1b7004c
ScatterMoE LoRA support (#3410)
* scattermoe lora support
* fsdp, bf16, dim fixes
* expert weights aren't needed in save for bwd since they are frozen
* use sonicmoe optim options
* update save model from upstream
* fixes per code review feedback and add tests
* revert removal of CP fix
* misc fixes
2026-02-24 14:59:55 -05:00