d0d26d5064
feat: Add GDPO Support (#3353)
* gdpo support - test left
* lint
* fixes for vllm serv
* test advantages
* docs
* lint
* lint
* gdpo simple + lint
* lint nit
* example
* lint
* trl 0.27.0
* blocklist
* test assert rmv
* add validation check for GDPO + sum_then_normalize
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2026-01-21 17:22:45 -05:00