From 94d03c8402ec86e60487ca0f979c0c79927ab665 Mon Sep 17 00:00:00 2001
From: NanoCode012
Date: Fri, 11 Aug 2023 11:27:42 +0900
Subject: [PATCH] Clarify pre-tokenize before multigpu (#359)

---
 README.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index bbba22b8f..067ef0d05 100644
--- a/README.md
+++ b/README.md
@@ -524,7 +524,14 @@ Run
 accelerate launch scripts/finetune.py configs/your_config.yml
 ```
 
-#### Multi-GPU Config
+#### Multi-GPU
+
+It is recommended to pre-tokenize dataset with the following before finetuning:
+```bash
+CUDA_VISIBLE_DEVICES="" accelerate ... --prepare_ds_only
+```
+
+##### Config
 
 - llama FSDP
 ```yaml