axolotl

Wing Lian af8d257aa2 make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]

* make pad_to_sequence_len default to the same value as sample_packing

* remove duplicate validation

* fix test

* update description meta

Co-authored-by: NanoCode012 <nano@axolotl.ai>

2025-07-21 11:40:56 -04:00

bigstral-ds-zero3.yaml

make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]

2025-07-21 11:40:56 -04:00

config.yml

make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]

2025-07-21 11:40:56 -04:00

lora-mps.yml

make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]

2025-07-21 11:40:56 -04:00

lora.yml

make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]

2025-07-21 11:40:56 -04:00

mistral-dpo-qlora.yml

make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]

2025-07-21 11:40:56 -04:00

mistral-qlora-fsdp.yml

checkpoint model on first step callback (#2906)

2025-07-15 15:00:48 -04:00

mistral-qlora-orpo.yml

make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]

2025-07-21 11:40:56 -04:00

mistral-small-3.1-24B-lora.yml

checkpoint model on first step callback (#2906)

2025-07-15 15:00:48 -04:00

mixtral_22.yml

make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]

2025-07-21 11:40:56 -04:00

mixtral-8x22b-qlora-fsdp.yml

checkpoint model on first step callback (#2906)

2025-07-15 15:00:48 -04:00

mixtral-qlora-fsdp.yml

checkpoint model on first step callback (#2906)

2025-07-15 15:00:48 -04:00

mixtral.yml

make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]

2025-07-21 11:40:56 -04:00

qlora.yml

make pad_to_sequence_len default to the same value as sample_packing (#2941) [skip ci]

2025-07-21 11:40:56 -04:00

README.md

Mixtral fixes 20240124 (#1192) [skip ci]

2024-01-24 14:59:57 -05:00

README.md

Mistral 7B is a language model with 7.3 billion parameters that performs strongly across a variety of benchmarks.

Fine-tune:

accelerate launch -m axolotl.cli.train examples/mistral/config.yml

If you run into a CUDA out-of-memory (OOM) error, use DeepSpeed with the zero2.json config:

accelerate launch -m axolotl.cli.train examples/mistral/config.yml --deepspeed deepspeed_configs/zero2.json
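
The two launch commands above differ only in the optional `--deepspeed` flag, so they can be wrapped in one small shell helper. The following is a minimal sketch, not part of axolotl: the function name `run_train` and the `DRY_RUN` variable are hypothetical and exist only for this illustration. It assumes the commands are run from the repository root, as in the examples above.

```shell
# Hypothetical helper (not part of axolotl): launch training with an
# optional DeepSpeed config as the second argument.
run_train() {
  local config="$1"
  local deepspeed_config="${2:-}"

  # Build the command as positional parameters so paths are passed intact.
  set -- accelerate launch -m axolotl.cli.train "$config"
  if [ -n "$deepspeed_config" ]; then
    set -- "$@" --deepspeed "$deepspeed_config"
  fi

  # DRY_RUN=1 (an assumption of this sketch) prints the command instead of running it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Calling `run_train examples/mistral/config.yml` runs the plain command, while `run_train examples/mistral/config.yml deepspeed_configs/zero2.json` adds the ZeRO stage 2 config for the OOM case.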