feat: Add GDPO Support (#3353)

* gdpo support - test left

* lint

* fixes for vllm serve

* test advantages

* docs

* lint

* lint

* gdpo simple + lint

* lint nit

* example

* lint

* trl 0.27.0

* blocklist

* remove test assert

* add validation check for GDPO + sum_then_normalize

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
VED
2026-01-22 03:52:45 +05:30
committed by GitHub
parent 8623dd8a72
commit d0d26d5064
11 changed files with 742 additions and 6 deletions


@@ -311,7 +311,6 @@ class TestHFRLTrainerBuilder:
# KTO specific
assert training_arguments.desirable_weight == 1.0
assert training_arguments.undesirable_weight == 1.0
assert training_arguments.max_prompt_length == 512
def _write_rewards_file(self, rewards_dir: Path):
"""