jagged lr restart scheduler (#1680) [skip ci]

* jagged lr restart scheduler; var name fix; make sure to create scheduler first
* wire things together
* more fixes
* fix for nesting scheduler and first anneal phase
* no need for relora trainer anymore since we've generalized the relora scheduler
* remove redundant relora scheduler and lint
* update relora e2e test for updated params
* need restart steps for relora test
* update quarto docs for dropped relora trainer
* update example yaml
* drop verbose arg
* min lr scale support for jagged lr
* don't let min_lr be NoneType
* cleanup args
2025-07-31 13:50:03 -04:00
parent 32a7890231
commit 7b68dfafd7
15 changed files with 139 additions and 137 deletions
--- a/examples/llama-2/relora.yml
+++ b/examples/llama-2/relora.yml
@@ -25,9 +25,12 @@ lora_alpha: 16
 lora_dropout: 0.05
 lora_target_linear: true

-relora_steps: 150
-relora_warmup_ratio: 0.1
+relora: true
+relora_prune_ratio: 0.9
 relora_cpu_offload: false
+jagged_restart_steps: 150
+jagged_restart_warmup_steps: 10
+jagged_restart_anneal_steps: false

 wandb_project:
 wandb_entity: