axolotl

Files

| Name | Last commit | Date |
| --- | --- | --- |
| Mistral-7b-example | fix(examples): remove is_*_derived as it's parsed automatically (#1297) | 2024-02-22 00:52:46 +09:00 |
| config.yml | fix(examples): remove is_*_derived as it's parsed automatically (#1297) | 2024-02-22 00:52:46 +09:00 |
| lora-mps.yml | Mps mistral lora (#1292) [skip ci] | 2024-02-26 22:39:57 -05:00 |
| mixtral-qlora-fsdp.yml | FDSP + QLoRA (#1378) | 2024-03-08 14:31:01 -05:00 |
| mixtral.yml | Add seq2seq eval benchmark callback (#1274) | 2024-02-13 08:24:30 -08:00 |
| qlora.yml | fix(examples): remove is_*_derived as it's parsed automatically (#1297) | 2024-02-22 00:52:46 +09:00 |
| README.md | Mixtral fixes 20240124 (#1192) [skip ci] | 2024-01-24 14:59:57 -05:00 |

README.md

Mistral 7B is a 7.3-billion-parameter language model that performs well across a variety of benchmarks.

Fine-tune:

accelerate launch -m axolotl.cli.train examples/mistral/config.yml

If you run into a CUDA out-of-memory (OOM) error, use DeepSpeed with the zero2.json config:

accelerate launch -m axolotl.cli.train examples/mistral/config.yml --deepspeed deepspeed_configs/zero2.json
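
The two commands above differ only in the optional `--deepspeed` flag, so they can be generated from one small wrapper. The sketch below is a convenience helper, not part of axolotl: the function name `train_cmd` is hypothetical, and it only prints the command rather than running it, so you can inspect the result before launching.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not provided by axolotl): prints the launch command
# for a given config, appending --deepspeed only when a second argument
# (a DeepSpeed config path) is supplied.
train_cmd() {
  local config="$1"
  local ds_config="${2:-}"
  local cmd="accelerate launch -m axolotl.cli.train ${config}"
  if [ -n "${ds_config}" ]; then
    cmd="${cmd} --deepspeed ${ds_config}"
  fi
  printf '%s\n' "${cmd}"
}

# Print both variants of the command shown in this README.
train_cmd examples/mistral/config.yml
train_cmd examples/mistral/config.yml deepspeed_configs/zero2.json
```

The same helper should work for the other configs in this directory, for example `train_cmd examples/mistral/qlora.yml`. To run the printed command instead of only displaying it, you could wrap the call as `eval "$(train_cmd ...)"`, assuming the paths contain no spaces.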