Jan Philipp Harries
3392270544
experimental llama 2 chat support (#296)
...
* experimental llama 2 chat support
* few small fixes
* llama2_chat
* small fix to follow original implementation
* small fixes and added fixtures/tests
* fix: mixed up inference and finetuning conversations
* args - small fix
* small fix
* small adjustment and warning
* fix with pre-commit
---------
Co-authored-by: Jan Philipp Harries <jpdus@users.noreply.github.com>
2023-08-06 17:40:52 -04:00
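For context, the Llama-2 chat format this entry adds support for is Meta's [INST]/<<SYS>> template. A minimal sketch of a single turn, following Meta's reference formatting; `format_llama2_turn` is an illustrative name, not axolotl's actual API:

```python
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def format_llama2_turn(system: str, user: str) -> str:
    # The system prompt is folded into the first user message.
    return f"{B_INST} {B_SYS}{system}{E_SYS}{user} {E_INST}"

print(format_llama2_turn("You are a helpful assistant.", "Hello!"))
# [INST] <<SYS>>
# You are a helpful assistant.
# <</SYS>>
#
# Hello! [/INST]
```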
Wing Lian
bb53a165f5
add a basic ds zero3 config (#347)
...
better defaults for ds
2023-08-06 17:19:51 -04:00
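A basic ZeRO stage-3 config for this purpose typically looks like the sketch below, written as a Python dict dumped to JSON. The values are illustrative, not necessarily the exact file added in #347:

```python
import json

# Hypothetical ZeRO-3 defaults in the spirit of this commit.
ds_zero3 = {
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "bf16": {"enabled": "auto"},
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

with open("zero3.json", "w") as f:
    json.dump(ds_zero3, f, indent=2)
```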
ssmi153
10405b9995
Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) (#339)
...
* Fix XFormers attention for Llama-2 70B (GQA)
Updated the XFormers monkeypatch to handle GQA as used in Llama-2 70B. All the updated code is taken directly from the Transformers library's modeling_llama.py: 07360b6c9c (diff-06392bad3b9e97be9ade60d4ac46f73b6809388f4d507c2ba1384ab872711c51)
* Catch configs without pretraining_tp
* Whitespace bug fix
The command had accidentally been moved out of the if-else block.
* pre-commit formatting fixes
Thanks to @winglian
2023-08-06 11:09:04 -04:00
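The core of the GQA handling is repeating each shared K/V head to match the query head count. A sketch mirroring the Transformers helper the patch borrows:

```python
import torch

def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Repeat each KV head n_rep times so grouped-query K/V tensors
    line up with the query heads."""
    batch, num_kv_heads, slen, head_dim = hidden_states.shape
    if n_rep == 1:
        return hidden_states
    hidden_states = hidden_states[:, :, None, :, :].expand(
        batch, num_kv_heads, n_rep, slen, head_dim
    )
    return hidden_states.reshape(batch, num_kv_heads * n_rep, slen, head_dim)

k = torch.randn(1, 8, 16, 128)   # Llama-2 70B uses 8 KV heads
print(repeat_kv(k, 8).shape)     # torch.Size([1, 64, 16, 128]), matching the 64 query heads
```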
Jan Philipp Harries
c93655c0a3
Added Orca Mini prompt strategy (#263)
...
* added Orca Mini prompt strategy
* maybe this fixed pre-commit errors?
* pre-commits passing
---------
Co-authored-by: Jan Philipp Harries <jpdus@users.noreply.github.com>
2023-08-06 03:16:41 +09:00
Wing Lian
fe285430bc
optimize the iteration when tokenizing large datasets (#332)
2023-08-04 12:12:05 -04:00
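The subject line doesn't say which optimization was applied; a common fix for slow per-row tokenization loops with the datasets library is batched mapping, sketched here as an assumption (gpt2 is a stand-in tokenizer):

```python
from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer
ds = Dataset.from_dict({"text": ["hello world"] * 10_000})

# Row-by-row map pays Python overhead per example; batched map
# amortizes it over thousands of rows per call.
ds = ds.map(lambda batch: tokenizer(batch["text"]), batched=True)
```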
Aman Gupta Karmani
0d2e34f056
Merge pull request #336 from tmm1/flash-attn
...
Fix flash-attn + qlora not working with llama models
2023-08-03 16:25:30 -07:00
Aman Gupta Karmani
b56a6c0101
Merge pull request #337 from tmm1/readme-fix
...
update README
2023-08-03 15:14:17 -07:00
Aman Karmani
2eda9e02a9
fix typo
2023-08-03 21:04:12 +00:00
Aman Karmani
78b9efb7f4
scope flash-attn+qlora fix correctly, scope to llama, add comment
2023-08-03 19:19:39 +00:00
Aman Karmani
312a9fad07
move flash-attn monkey patch alongside the others
2023-08-03 17:20:49 +00:00
Aman Karmani
58d665943e
python 3.10 and 3.11 both work fine, as does pytorch 2.1.0.dev
2023-08-03 16:47:25 +00:00
Aman Karmani
cc7e80026e
there is no configs folder
2023-08-03 16:31:37 +00:00
mhenrichsen
dc71d8872a
feat/llama-2 examples (#319)
...
* qlora llama-2
* qlora llama-2
* linting
* readme
* lora added
* linting
* change group_by_length
* 13b fitting on 24gb
* grouped lengths true
* add pad token
* change out dir
---------
Co-authored-by: Mads Henrichsen <mads@Bærbar-tilhørende-Mads.local>
2023-08-03 19:22:48 +09:00
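On the "add pad token" item above: Llama-2 tokenizers ship without a pad token, so padded batching fails until one is added. A minimal sketch; NousResearch/Llama-2-7b-hf is an ungated mirror used for illustration, and the token string is an assumption:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-hf")
if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({"pad_token": "<pad>"})  # assumed token string
# The model's embeddings must then be resized to match:
# model.resize_token_embeddings(len(tokenizer))
```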
Aman Karmani
248bf90f89
ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype
2023-08-02 20:15:03 +00:00
Wing Lian
77085ea24e
qlora with flash attention fixes (#333)
2023-08-01 23:26:16 -04:00
Wing Lian
db2a3586f3
add peft install back since it doesn't get installed by setup.py (#331)
2023-07-31 16:31:53 -04:00
Wing Lian
6c9a87c8ee
pin accelerate so it works with llama2 (#330)
2023-07-30 22:20:06 -04:00
Wing Lian
894cba09f3
fix FSDP save of final model (#329)
2023-07-30 21:46:44 -04:00
Wing Lian
41a4d15d43
update README for updated docker images (#328)
...
* update README for updated docker images
* update readme from pr feedback
2023-07-28 16:50:03 -04:00
Wing Lian
2c37bf6c21
Prune cuda117 (#327)
...
* drop cuda117/torch 1.13.1 from support, pin flash attention to v2.0.1, rm torchvision/torchaudio install
* gptq base build not needed. add sm 9.0 support
2023-07-26 16:27:49 -04:00
Wing Lian
9f69c4d8c1
latest HEAD of accelerate causes 0 loss immediately with FSDP (#321)
2023-07-24 11:23:56 -04:00
Wing Lian
3d4984b9a5
update prompts for open orca to match the paper (#317)
...
fix the test for the updated system tokenizer
2023-07-22 13:49:11 -04:00
Wing Lian
ff7f18d1ed
disable gh cache for first step of docker builds too
2023-07-22 11:46:37 -04:00
Wing Lian
cf62cfd661
add runpod envs to .bashrc, fix bnb env (#316)
...
* hopper support for base dockerfile, add runpod envs to .bashrc
* set BNB_CUDA_VERSION env for latest bnb
* don't support hopper yet with CUDA 11.8
2023-07-22 10:09:38 -04:00
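BNB_CUDA_VERSION tells bitsandbytes which CUDA build of its native library to load. The commit sets it in .bashrc; a Python equivalent, with "118" as an example value:

```python
import os

os.environ["BNB_CUDA_VERSION"] = "118"  # must be set before the import below
import bitsandbytes  # loads libbitsandbytes_cuda118.so instead of the autodetected build
```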
Wing Lian
c5df969262
don't use the gha cache w docker
2023-07-22 08:46:21 -04:00
Wing Lian
40a53ff181
Merge pull request #307 from OpenAccess-AI-Collective/xgen-user-sharegpt-tokens
...
better handling since xgen tokenizer breaks with convert_tokens_to_ids
2023-07-22 04:10:38 -04:00
Wing Lian
dcdec44347
Merge pull request #306 from ethanhs/xgen
...
Add XGen info to README and example config
2023-07-22 04:10:18 -04:00
Wing Lian
3ffb018a4c
Merge pull request #313 from OpenAccess-AI-Collective/tokenizer-llama2-embeddings
...
don't resize embeddings to multiples of 32 by default
2023-07-22 04:09:59 -04:00
Wing Lian
a94f2eecb1
Merge pull request #299 from OpenAccess-AI-Collective/flash-attention-2
...
Flash attention 2
2023-07-22 04:07:48 -04:00
Wing Lian
1066751358
don't resize embeddings to multiples of 32 by default
2023-07-22 01:52:38 -04:00
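The behavior change in a sketch; gpt2 stands in for a real model, and `pad_to_multiple_of` is the Transformers parameter that restores the old rounding:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in model
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# New default: size the embedding table to the actual vocab.
model.resize_token_embeddings(len(tokenizer))
# Old behavior, now opt-in: round up to a multiple of 32, a throughput
# tweak that silently changes checkpoint shapes.
model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=32)
```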
Wing Lian
1b63bf13bc
Merge pull request #308 from OpenAccess-AI-Collective/apache2-license
...
add apache 2.0 license
2023-07-21 09:50:14 -04:00
Wing Lian
5cce2a42ff
add apache 2.0 license
2023-07-21 09:49:29 -04:00
Wing Lian
2a428e8014
better handling since xgen tokenizer breaks with convert_tokens_to_ids
2023-07-21 09:24:11 -04:00
Wing Lian
cdf85fdbd5
pin flash attention 2 to the fix for backwards pass
2023-07-21 08:18:53 -04:00
Wing Lian
9b790d359b
flash attention 2
2023-07-21 08:17:46 -04:00
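The basic flash-attn v2 entry point looks like this; whether the patch calls this exact function or the varlen variant isn't shown in the commit, so treat it as a minimal sketch assuming a CUDA GPU and fp16 inputs:

```python
import torch
from flash_attn import flash_attn_func

# flash-attn expects (batch, seqlen, num_heads, head_dim) in fp16/bf16.
q = torch.randn(2, 1024, 16, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)
out = flash_attn_func(q, k, v, causal=True)
```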
Ethan Smith
38811434e6
Add XGen info to README and example config
2023-07-21 00:44:50 -07:00
NanoCode012
06c61d6f13
Merge pull request #304 from OpenAccess-AI-Collective/NanoCode012-patch-1
...
Fix(readme): Improve wording for push model
2023-07-21 13:39:45 +09:00
Wing Lian
262dc29df2
Merge pull request #300 from OpenAccess-AI-Collective/pytorch-201
...
Pytorch 2.0.1
2023-07-21 00:28:38 -04:00
NanoCode012
165907fddb
Fix(readme): Improve wording for push model
2023-07-21 11:28:35 +09:00
Wing Lian
a032c9f452
fix sdp attention to use the flash/mem-efficient context manager
2023-07-20 01:05:48 -04:00
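This fix concerns PyTorch's kernel-selection context manager for scaled_dot_product_attention. A sketch of the pattern, using the PyTorch 2.0 API:

```python
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Allow only the fused flash / memory-efficient kernels; with the math
# fallback disabled, PyTorch raises instead of silently slowing down.
with torch.backends.cuda.sdp_kernel(
    enable_flash=True, enable_math=False, enable_mem_efficient=True
):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```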
Wing Lian
b06d3e3645
explicitly pin flash attention 1 to v1.0.9
2023-07-20 01:02:08 -04:00
Wing Lian
c58034d48c
use pytorch 2.0.1
2023-07-20 00:47:13 -04:00
NanoCode012
28fd429bcf
Merge pull request #293 from NanoCode012/fix/tokenize-speed
...
Fix(tokenizing): Use multi-core
2023-07-19 11:02:04 +09:00
NanoCode012
45ac7c4f88
feat: use multi-core
2023-07-19 10:16:54 +09:00
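The multi-core change comes down to num_proc in the datasets map call, which forks worker processes so tokenization scales with CPU count. A sketch; gpt2 and the worker count are stand-ins:

```python
from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer
ds = Dataset.from_dict({"text": ["hello world"] * 100_000})

# num_proc splits the dataset across worker processes.
ds = ds.map(lambda batch: tokenizer(batch["text"]), batched=True, num_proc=8)
```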
Wing Lian
edd6980dd9
Merge pull request #289 from OpenAccess-AI-Collective/hf_transfer
...
add hf_transfer to requirements for faster hf upload
2023-07-17 15:08:06 -04:00
Wing Lian
dc6d25124d
Merge pull request #288 from OpenAccess-AI-Collective/NanoCode012-patch-1
...
fix(readme): remove accelerate config
2023-07-17 14:46:43 -04:00
Wing Lian
6dd2e7d671
add hf_transfer to requirements for faster hf upload
2023-07-17 14:44:48 -04:00
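Installing hf_transfer alone isn't enough; huggingface_hub only uses it when told to. A usage sketch, noting the env var must be set before any hub transfer starts:

```python
import os

os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import HfApi

api = HfApi()
# api.upload_folder(folder_path="out", repo_id="user/model")  # hypothetical repo id
```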
NanoCode012
b64f411849
fix(readme): remove accelerate config
2023-07-18 01:31:02 +09:00
Wing Lian
03a59c1ed4
Merge pull request #287 from OpenAccess-AI-Collective/dataclass-fix
...
fix axolotl training args dataclass annotation
2023-07-17 06:09:23 -04:00
Wing Lian
ebaec3c406
fix axolotl training args dataclass annotation
2023-07-17 04:57:02 -04:00