Mixtral fixes 20240124 (#1192) [skip ci]
* mixtral nccl fixes * make sure to patch for z3
This commit is contained in:
@@ -62,7 +62,7 @@ evals_per_epoch: 4
|
||||
eval_table_size:
|
||||
saves_per_epoch: 1
|
||||
debug:
|
||||
deepspeed: #deepspeed/zero2.json # multi-gpu only
|
||||
deepspeed: #deepspeed_configs/zero2.json # multi-gpu only
|
||||
weight_decay: 0.1
|
||||
fsdp:
|
||||
fsdp_config:
|
||||
|
||||
@@ -942,7 +942,7 @@
|
||||
"not only optimizer states but also gradients and parameters across GPUs. The bf16 indicate mixed precision training using bfloat16.\n",
|
||||
"For more information read axolotl's readme\n",
|
||||
"\"\"\"\n",
|
||||
"!accelerate launch -m axolotl.cli.train /folder/config.yml --deepspeed deepspeed/zero3_bf16.json"
|
||||
"!accelerate launch -m axolotl.cli.train /folder/config.yml --deepspeed deepspeed_configs/zero3_bf16.json"
|
||||
]
|
||||
}
|
||||
],
|
||||
|
||||
@@ -65,7 +65,7 @@ eval_table_max_new_tokens: 128
|
||||
saves_per_epoch: 1
|
||||
debug:
|
||||
#default deepspeed, can use more aggresive if needed like zero2, zero3
|
||||
deepspeed: deepspeed/zero1.json
|
||||
deepspeed: deepspeed_configs/zero1.json
|
||||
weight_decay: 0.0
|
||||
fsdp:
|
||||
fsdp_config:
|
||||
|
||||
@@ -8,5 +8,5 @@ accelerate launch -m axolotl.cli.train examples/mistral/config.yml
|
||||
|
||||
If you run into CUDA OOM, use deepspeed with config zero2.json:
|
||||
```shell
|
||||
accelerate launch -m axolotl.cli.train examples/mistral/config.yml --deepspeed deepspeed/zero2.json
|
||||
accelerate launch -m axolotl.cli.train examples/mistral/config.yml --deepspeed deepspeed_configs/zero2.json
|
||||
```
|
||||
|
||||
@@ -84,7 +84,7 @@ eval_table_size:
|
||||
eval_table_max_new_tokens: 128
|
||||
saves_per_epoch: 1
|
||||
debug:
|
||||
deepspeed: deepspeed/zero2.json
|
||||
deepspeed: deepspeed_configs/zero2.json
|
||||
weight_decay: 0.0
|
||||
fsdp:
|
||||
fsdp_config:
|
||||
|
||||
@@ -3,7 +3,7 @@
|
||||
Due to some nuances with the phi code, please use deepspeed when training phi for full finetune.
|
||||
|
||||
```shell
|
||||
accelerate launch -m axolotl.cli.train examples/phi/phi-ft.yml --deepspeed deepspeed/zero1.json
|
||||
accelerate launch -m axolotl.cli.train examples/phi/phi-ft.yml --deepspeed deepspeed_configs/zero1.json
|
||||
|
||||
# OR
|
||||
|
||||
|
||||
Reference in New Issue
Block a user