add float16 docs and tweak typehints

2023-06-15 00:26:44 -04:00
parent 6f849809c5
commit 88e17ffc50
2 changed files with 13 additions and 3 deletions
--- a/README.md
+++ b/README.md
@@ -264,6 +264,8 @@ See sample configs in [configs](configs) folder or [examples](examples) for quic
  bf16: true # require >=ampere
  fp16: true
  tf32: true # require >=ampere
+  bfloat16: true # require >=ampere; use instead of bf16 when you don't want AMP (automatic mixed precision)
+  float16: true # use instead of fp16 when you don't want AMP
  ```
  Note: Repo does not do 4-bit quantization.

@@ -522,6 +524,12 @@ Add below flag to train command above
 --merge_lora --lora_model_dir="./completed-model" --load_in_8bit=False --load_in_4bit=False
 ```

+If you run out of CUDA memory, you can try merging in system RAM instead by hiding the GPUs from the process:
+
+```bash
+CUDA_VISIBLE_DEVICES="" python3 scripts/finetune.py ...
+```
+
 ## Common Errors 🧰

 > Cuda out of memory