Mistral 7B is a language model with a total of 7.3 billion parameters, showing notable performance across a variety of benchmarks.
Fine Tune:
```shell
accelerate launch -m axolotl.cli.train examples/mistral/config.yml
```
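The training behavior is driven entirely by `examples/mistral/config.yml`, which ships with the repo. As a rough illustration of the kind of settings such a config covers (the keys are standard axolotl options, but the specific values and dataset below are assumptions, not the repo file verbatim), a minimal full-finetune config might look like:

```yaml
# Illustrative sketch only -- see examples/mistral/config.yml in the repo for the real file.
base_model: mistralai/Mistral-7B-v0.1   # assumed base checkpoint
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer

datasets:
  - path: mhenrichsen/alpaca_2k_test    # assumed example dataset
    type: alpaca
val_set_size: 0.05
output_dir: ./out

sequence_len: 8192
sample_packing: true                    # pack multiple samples per sequence
pad_to_sequence_len: true

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_bnb_8bit
learning_rate: 0.000005
lr_scheduler: cosine

bf16: true
flash_attention: true                   # sample packing relies on FlashAttention 2
```

Edit the `base_model`, `datasets`, and batch-size settings to match your hardware and data before launching.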
If you run into CUDA OOM, use deepspeed with config zero2.json:
```shell
accelerate launch -m axolotl.cli.train examples/mistral/config.yml --deepspeed deepspeed/zero2.json
```
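`deepspeed/zero2.json` is a DeepSpeed ZeRO stage-2 configuration bundled with the repo; stage 2 shards optimizer states and gradients across GPUs, which is what relieves the memory pressure. As a rough sketch of what such a file typically contains (the values below are assumptions, not the repo file verbatim):

```json
{
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": { "device": "cpu" },
    "overlap_comm": true,
    "contiguous_gradients": true
  },
  "bf16": { "enabled": "auto" },
  "fp16": { "enabled": "auto" },
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "train_batch_size": "auto",
  "wall_clock_breakdown": false
}
```

The `"auto"` values let the launcher fill in batch sizes and precision from the axolotl config, so the two files stay consistent.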