
Overview

This is an example of a Yi-34B-Chat configuration. It demonstrates that it is possible to finetune a 34B model on a GPU with 24GB of VRAM.
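
A rough back-of-the-envelope estimate helps show why this is plausible. The sketch below assumes the base weights are stored at 4 bits per parameter, as QLoRA does, and compares that with 16-bit storage. It counts weights only: quantization metadata, LoRA adapter weights, optimizer state, and activations are deliberately left out, so real usage is higher. The helper name `weight_memory_gb` and the nominal 34e9 parameter count are illustrative assumptions, not values taken from the configuration.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Nominal parameter count for a "34B" model (assumption for illustration).
N_PARAMS = 34e9

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # 16-bit weights
int4_gb = weight_memory_gb(N_PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: ~{fp16_gb:.0f} GB")
print(f" 4-bit weights: ~{int4_gb:.0f} GB")
```

Under these assumptions the 16-bit weights alone come to about 68 GB, well beyond a 24 GB card, while the 4-bit weights come to about 17 GB, leaving limited headroom for everything else.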

Tested on an RTX 4090 with python -m axolotl.cli.train examples/mistral/qlora.yml. A single epoch of QLoRA finetuning on the alpaca dataset runs in 47 minutes and uses 97% of the available memory.