Seung Hyun Cho
3e51a680c2
fix: Fix evaluation loss in KD trainer (#3271)
* fix: Fix evaluation loss in KD trainer
* Fix v2 strategy super() call
* fix: Add safety check for total_tokens in log method
* fix: simplified num items and outputs return handling
* fix: add missing model forward pass in compute_loss
* refactor: Use Template Method pattern for chat template strategies
* refactor: use pop(None) and remove v2 override
* chore: lint
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-12-17 13:40:36 -05:00