VED
|
d0d26d5064
|
feat: Add GDPO Support (#3353)
* gdpo support - test left
* lint
* fixes for vllm serv
* test advantages
* docs
* lint
* lint
* gdpo simple + lint
* lint nit
* example
* lint
* trl 0.27.0
* blocklist
* test assert rmv
* add validation check for GDPO + sum_then_normalize
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
|
2026-01-21 17:22:45 -05:00 |
|