Andrew Wu
90090fa9e8
DPO support loss types (#3566)
* Support loss_type/loss_weights DPO
* Validate dpo loss type/weights only set for dpo
* Tests: Update ipo tests to use new path
* Docs: Update docs for new ipo path
* PR fixes - typo/validation
* PR nit - warning
* chore: fix warnings arg
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
2026-04-23 00:25:28 -04:00