EBFT: Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models (#3527) [skip ci]

* EBFT wip

* fixes

* more fixes

* add missing strided module

* ebft fixes for multi-turn

* make ebft work with async

* add example for ebft w/ qwen3.5

* fix for split thinking and update yaml for lora over linear attention only

* enforce_eager for vllm arg in schema

* fix sync weights

* fix multi-gpu

* handle updated sig for mm

* ddp fixes

* improve multi-gpu handling, don't calculate logits, adaptive completion length

* chore: lint

* chore: lint

* support completion_mean

* Address code review feedback

* clamp min IS ratio

* Address PR code review

* more fixes identified

* address code review

* Fix property from rebase conflict
Authored by Wing Lian on 2026-03-24 18:43:46 -04:00; committed by GitHub
parent e9883c91d4
commit c50c4acbf4
48 changed files with 5885 additions and 168 deletions


@@ -0,0 +1,28 @@
"""
Dataset transform for nvidia/OpenCodeInstruct with EBFT.
Maps the dataset's `input` (prompt) and `output` (code solution) fields
to the format expected by the EBFT trainer.
"""


def transform(cfg, *args, **kwargs):
    def transform_fn(example, tokenizer=None):
        return {
            "prompt": [
                {"role": "user", "content": example["input"]},
            ],
            "ground_truth": example["output"],
        }

    return transform_fn, {
        "remove_columns": [
            "id",
            "domain",
            "generation_algorithm",
            "llm_judgement",
            "unit_tests",
            "tests_execution_status",
            "average_test_score",
        ],
    }
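
For context, a minimal sketch of how this transform would be applied; the sample record below is hypothetical, and calling transform with cfg=None is an assumption for illustration, not something taken from the diff:

# Hypothetical usage sketch -- the sample record and the cfg=None call are
# assumptions for illustration, not from the dataset or this commit.
sample = {
    "input": "Write a function that reverses a string.",
    "output": "def reverse(s):\n    return s[::-1]",
}

transform_fn, dataset_kwargs = transform(None)
result = transform_fn(sample)
# result == {
#     "prompt": [{"role": "user", "content": "Write a function that reverses a string."}],
#     "ground_truth": "def reverse(s):\n    return s[::-1]",
# }
# dataset_kwargs["remove_columns"] lists the OpenCodeInstruct fields to drop
# after the mapping is applied.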