jagged lr restart scheduler (#1680) [skip ci]

* jagged lr restart scheduler var name fix make sure to create scheduler first
* wire things together
* more fixes
* fix for nesting scheduler and first anneal phase
* no need for relora trainer anymore since we've generalized the relora scheduler
* remove redundant relora scheduler and lint
* update relora e2e test for updated params
* need restart steps for relora test
* update quarto docs for dropped relora trainer
* update example yaml
* drop verbose arg
* min lr scale support for jagged lr
* don't let min_lr be nonetype
* cleanup args
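
For readers unfamiliar with the idea, here is a rough sketch of what a "jagged" restart multiplier could look like. It is an illustration only, not the code from this commit: the function name `jagged_restart_scale` and its signature are hypothetical, and the arguments merely mirror the `jagged_restart_steps`, `jagged_restart_warmup_steps`, and `jagged_restart_anneal_steps` keys in the updated test. It assumes the multiplier is applied on top of an inner scheduler that handles the initial warmup, so the first cycle has no ramp of its own.

```python
def jagged_restart_scale(
    step: int,
    restart_steps: int,
    warmup_steps: int,
    anneal_steps: int,
    min_lr_scale: float = 0.0,
) -> float:
    """Hypothetical LR multiplier in [min_lr_scale, 1.0] for a given step.

    Assumed shape: anneal linearly toward min_lr_scale just before each
    restart boundary, then ramp linearly back up right after it.
    """
    pos = step % restart_steps  # position inside the current cycle

    if step >= restart_steps and pos < warmup_steps:
        # Re-warmup after a restart (skipped in the very first cycle,
        # on the assumption that the inner scheduler already warms up).
        t = pos / warmup_steps
    elif pos > restart_steps - anneal_steps:
        # Anneal phase leading into the next restart.
        t = (restart_steps - pos) / anneal_steps
    else:
        # Plateau: leave the inner scheduler's LR untouched.
        t = 1.0

    # Map t in [0, 1] onto [min_lr_scale, 1.0].
    return min_lr_scale + (1.0 - min_lr_scale) * t


if __name__ == "__main__":
    # Same values as the e2e test config: restart every 50 steps,
    # 10 steps of warmup and 10 steps of anneal around each restart.
    for s in (0, 40, 45, 50, 55, 60):
        print(s, jagged_restart_scale(s, 50, 10, 10))
```

In PyTorch such a function could be wrapped in `torch.optim.lr_scheduler.LambdaLR`, though nesting it around another scheduler, as the commit message mentions, needs more care than this sketch shows.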
2025-07-31 13:50:03 -04:00
parent 32a7890231
commit 7b68dfafd7
15 changed files with 139 additions and 137 deletions
--- a/tests/e2e/solo/test_relora_llama.py
+++ b/tests/e2e/solo/test_relora_llama.py
@@ -34,9 +34,10 @@ class TestReLoraLlama(unittest.TestCase):
                "lora_alpha": 16,
                "lora_dropout": 0.05,
                "lora_target_modules": ["q_proj", "v_proj"],
-                "relora_steps": 50,
-                "relora_warmup_steps": 10,
-                "relora_anneal_steps": 10,
+                "relora": True,
+                "jagged_restart_steps": 50,
+                "jagged_restart_warmup_steps": 10,
+                "jagged_restart_anneal_steps": 10,
                "relora_prune_ratio": 0.9,
                "relora_cpu_offload": True,
                "val_set_size": 0.0,