Wing Lian
328d598114
gemma3 packing fixes (#2449)
* make gemma3 work with packing
* multi-gpu e2e for ci
* update gemma3 model namespace to use mirror
* add gradient checkpointing to multigpu e2e ci
* update gemma3 examples for use_reentrant and fix ddp find unused params
* fix tests for gemma3
* fix import for test utils
* set correct train loss for gemma3 e2e
2025-03-31 17:15:23 -04:00