ReLoRA implementation (with quantization) (#322)

* Experimental ReLoRA (+qlora) implementation
* Add CPU offload
* Remove local config
* Fix saving logic
* Remove redundant assert
* Fix logic errors
* Move ReLoRA into its own trainer class with a method override to create the proper scheduler
* Formatting & typing fixes
* Use safe_serialization
* Don't allow fsdp/deepspeed with ReLoRA
* Fix cpu-offload logic, enable multi gpu
* Document parameters and add comment
* Fix merge issue
* Smooth over some sharp edges
* Implement resume from checkpoint for relora
* Address review comments
* Fix saving logic
* Add necessary metadata to safetensors

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-08-23 20:07:18 -07:00
parent 55c23c7bcb
commit bde3c5a478
6 changed files with 491 additions and 20 deletions
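Note on the two step parameters documented in the README hunk of this commit: `relora_steps` sets how often a restart happens and `relora_warmup_steps` sets how long the learning rate ramps back up after each restart. The sketch below is only an illustration of that relationship, not the scheduler this commit adds. The function name `relora_warmup_factor` and the linear ramp shape are assumptions made for the example.

```python
def relora_warmup_factor(step: int, relora_steps: int, relora_warmup_steps: int) -> float:
    """Return a multiplier in (0, 1] for the base learning rate at `step`.

    Hypothetical helper: assumes a restart every `relora_steps` steps and a
    linear ramp over the first `relora_warmup_steps` steps after each restart.
    """
    if relora_steps <= 0:
        raise ValueError("relora_steps must be positive")
    if relora_warmup_steps <= 0:
        # No per-restart warmup requested: always use the full learning rate.
        return 1.0
    # Position inside the current restart cycle (0 right after a restart).
    steps_since_restart = step % relora_steps
    # Linear ramp, capped at 1.0 once the warmup window has passed.
    return min(1.0, (steps_since_restart + 1) / relora_warmup_steps)


if __name__ == "__main__":
    # Print the factor around the first restart for a small example config.
    for s in (0, 4, 9, 50, 99, 100, 104):
        print(s, relora_warmup_factor(s, relora_steps=100, relora_warmup_steps=10))
```

A function of this shape could be combined with a base schedule by multiplying the two factors, for example inside a `torch.optim.lr_scheduler.LambdaLR` lambda. Whether the trainer in this commit composes its schedule that way is not stated here.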
--- a/README.md
+++ b/README.md
@@ -493,6 +493,12 @@ lora_modules_to_save:
 lora_out_dir:
 lora_fan_in_fan_out: false

+# ReLoRA configuration
+# must use either 'lora' or 'qlora' adapter, and does not support fsdp or deepspeed
+relora_steps: # number of steps per ReLoRA restart
+relora_warmup_steps: # number of per-restart warmup steps
+relora_cpu_offload: # true to perform lora weight merges on cpu during restarts, for modest gpu memory savings
+
 # wandb configuration if you're using it
 wandb_mode: # "offline" to save run metadata locally and not sync to the server, "disabled" to turn off wandb
 wandb_project: # your wandb project name