axolotl


Abhishek Mishra d4a88e4eca Adding qlora config for Mistral (#675)

* Adding qlora config for Mistral

Contains a fix for the Mistral Flash Attention (FA) issue, which raises: "ValueError: You are attempting to perform batched generation with padding_side='right' this may lead to unexpected behaviour for Flash Attention version of Mistral. Make sure to  call tokenizer.padding_side  = 'left' before tokenizing the input."

The workaround for now is to set sample_packing: true and pad_to_sequence_len: true in the config.

* Renamed to qlora.yml

2023-10-06 21:05:56 +09:00

config.yml

prepared dataset caching, other misc fixes (#665)

2023-10-02 21:07:24 -04:00

qlora.yml

Adding qlora config for Mistral (#675)

2023-10-06 21:05:56 +09:00

README.md

Update mistral/README.md (#647)

2023-09-28 10:24:56 -04:00

README.md

Mistral 7B is a language model with 7.3 billion parameters that performs strongly across a variety of benchmarks.

Fine-tune:

accelerate launch -m axolotl.cli.train examples/mistral/config.yml

If you run into a CUDA out-of-memory (OOM) error, use DeepSpeed with the zero2.json config:

accelerate launch -m axolotl.cli.train examples/mistral/config.yml --deepspeed deepspeed/zero2.json
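
The commit note for qlora.yml (#675) describes a workaround for the Flash Attention padding_side error: enable both sample packing and padding to the full sequence length. As a minimal sketch, the two settings named in that note would appear in a config like this. Only these two keys come from the commit note; everything else a working config needs (base model, dataset, adapter settings) is omitted here.

```yaml
# Workaround described in the qlora.yml commit note (#675) for the
# Mistral Flash Attention "padding_side='right'" ValueError.
# Fragment only: merge these keys into a complete config.
sample_packing: true
pad_to_sequence_len: true
```

Assuming qlora.yml sits next to config.yml under examples/mistral/ (the listing suggests this, but the README does not state the path), it could be passed to the same launch command in place of examples/mistral/config.yml.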