d0d26d5064
feat: Add GDPO Support (#3353)
* gdpo support - test left
* lint
* fixes for vllm serv
* test advantages
* docs
* lint
* lint
* gdpo simple + lint
* lint nit
* example
* lint
* trl 0.27.0
* blocklist
* test assert rmv
* add validation check for GDPO + sum_then_normalize
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2026-01-21 17:22:45 -05:00