axolotl

Author	SHA1	Message	Date
kallewoof	58ec8b1113	feature: loss watchdog for terminating training runs that are failing (#899) Co-authored-by: Karl-Johan Alm <kalle@gmail.com>	2023-12-04 07:54:34 -05:00
Wing Lian	f544ab2bed	don't compile deepspeed or bitsandbytes from source (#837)	2023-11-08 19:49:55 -05:00
Wing Lian	8b79ff0e94	fix eval_steps to be a sane default (#797) * fix eval_steps to be a sane default * update docs for fractional eval_steps	2023-10-27 22:36:30 -04:00
Wing Lian	9b43e7ea15	disable eval table w sample packing in examples (#778)	2023-10-23 09:18:44 -04:00
Wing Lian	2d8def68dc	simplify by removing duplicate base_model_config (#772)	2023-10-23 01:42:38 -04:00
atgctg	ace70b33c6	Fix: lowercase `True` values in config (#713) * Fix: lowercase `True` values in config * Fix: lowercase `True` values in config	2023-10-10 21:32:20 +09:00
lukemarsden	295b2662e1	Get qlora mistral-7b fine tuning working on a single 4090 (#708)	2023-10-10 15:14:23 +09:00
NanoCode012	669f1d052c	Fix: Higher vram usage for mistral and sample_packing (#691) * Fix: Higher vram usage for mistral and sample_packing * chore: update comment * chore: lint	2023-10-06 12:33:43 -04:00
Abhishek Mishra	d4a88e4eca	Adding qlora config for Mistral (#675) * Adding qlora config for Mistral Contains fix for Mistral FA issue - ValueError: You are attempting to perform batched generation with padding_side='right' this may lead to unexpected behaviour for Flash Attention version of Mistral. Make sure to call tokenizer.padding_side = 'left' before tokenizing the input. Fix for now is to set sample_packing: true and pad_to_sequence_len: true * Renamed to qlora.yml	2023-10-06 21:05:56 +09:00

9 Commits
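
The workaround described in commit d4a88e4eca (#675) can be sketched as a config fragment. This is only an illustration of the two keys named in that commit message, not the full contents of qlora.yml; every other setting in the file is omitted here, and the values are written in lowercase to match the later fix in ace70b33c6 (#713).

```yaml
# Fragment only: the two settings the commit message names as the
# interim fix for the Mistral Flash Attention padding_side ValueError.
sample_packing: true
pad_to_sequence_len: true
```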