Commit Graph

7 Commits

Author SHA1 Message Date
Wing Lian
8b79ff0e94 fix eval_steps to be a sane default (#797)
* fix eval_steps to be a sane default

* update docs for fractional eval_steps
2023-10-27 22:36:30 -04:00
Wing Lian
9b43e7ea15 disable eval table w sample packing in examples (#778) 2023-10-23 09:18:44 -04:00
Wing Lian
2d8def68dc simplify by removing duplicate base_model_config (#772) 2023-10-23 01:42:38 -04:00
atgctg
ace70b33c6 Fix: lowercase True values in config (#713)
* Fix: lowercase `True` values in config

* Fix: lowercase `True` values in config
2023-10-10 21:32:20 +09:00
lukemarsden
295b2662e1 Get qlora mistral-7b fine tuning working on a single 4090 (#708) 2023-10-10 15:14:23 +09:00
NanoCode012
669f1d052c Fix: Higher vram usage for mistral and sample_packing (#691)
* Fix: Higher vram usage for mistral and sample_packing

* chore: update comment

* chore: lint
2023-10-06 12:33:43 -04:00
Abhishek Mishra
d4a88e4eca Adding qlora config for Mistral (#675)
* Adding qlora config for Mistral

Contains fix for Mistral FA issue - ValueError: You are attempting to perform batched generation with padding_side='right' this may lead to unexpected behaviour for Flash Attention version of Mistral. Make sure to  call tokenizer.padding_side  = 'left' before tokenizing the input.

Fix for now is to set sample_packing: true and pad_to_sequence_len: true

* Renamed to qlora.yml
2023-10-06 21:05:56 +09:00