Wing Lian
cd079b5536
Tensor parallel with DeepSpeed AutoTP (#2574)
* support for DeepSpeed AutoTP
* bump to the latest DeepSpeed release, which also supports DeepCompile
* add DeepCompile support
* fix total steps calculation for TP
* set up fixture for TP
* update DeepSpeed config to ensure weights are gathered for checkpoint
* fix duplicate validation names
* chore: lint
2025-07-14 21:33:48 -04:00
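
The "fix total steps calculation for TP" item can be illustrated with a rough sketch. This is not the code from the commit. The function name and all parameters are hypothetical, and it assumes that ranks in the same tensor-parallel group consume the same batch, so only `world_size // tp_size` ranks contribute distinct samples per step.

```python
import math


def total_training_steps(
    num_samples: int,
    micro_batch_size: int,
    grad_accum_steps: int,
    num_epochs: int,
    world_size: int,
    tp_size: int = 1,
) -> int:
    """Estimate optimizer steps when part of the world size is used for TP.

    Hypothetical helper for illustration only; it is not taken from the
    repository. Assumes every rank in a TP group sees the same data.
    """
    if tp_size < 1 or world_size % tp_size != 0:
        raise ValueError("world_size must be a positive multiple of tp_size")

    # Only the data-parallel replicas process distinct samples.
    dp_size = world_size // tp_size
    samples_per_step = micro_batch_size * grad_accum_steps * dp_size

    # A partial final batch still costs one optimizer step per epoch.
    steps_per_epoch = math.ceil(num_samples / samples_per_step)
    return steps_per_epoch * num_epochs
```

Under these assumptions, dividing by the full world size instead of the data-parallel size would underestimate the step count whenever `tp_size > 1`, which in turn would shorten any schedule derived from it.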