axolotl

Author	SHA1	Message	Date
Wing Lian	e7d3e2dbb6	use fastchat conversations template (#578 ) * use fastchat conversations template * require fastchat (fschat) pip install * handle roles dynamically from conversation * tweak fastchat conversation with a monkeypatch to get individual turns * fix up so it works with multiple conversation styles, and don't strip the turns * fix sharegpt fixture now that we're using a more correct tokenization * use a new prompter and support fastchat conversation type * use sharegpt from prompt strategies now * update docs, add chatml template * add a newline after im_end token * ensure we correctly set system message * update per PR feedback to handle deprecated sharegpt types * don't add duplicate wandb req * make sharegpt fields configurable from yml * llama2 fixes * don't fail fatally when turns are improper	2023-09-27 12:10:45 -04:00
Wing Lian	60c7c48c97	update for recent transformers updates (#636 ) * update for recent transformers updates * fix checkpoint forward kwargs * just pass args into torch checkpoint	2023-09-27 12:10:32 -04:00
Wing Lian	e8cbf50be6	attention_mask not needed for training (#642 ) * attention_mask not needed for training * specifically don't use attention mask for phi * use a different check for phi * small fixes since phi removed some values from their config	2023-09-27 11:12:08 -04:00
Wing Lian	d887ad86c3	eval_table isn't quite stable enough to be in default llama configs (#637 )	2023-09-26 10:13:20 -04:00
NanoCode012	19a600a8b8	Feat: Add support for upstream FA2 (#626 ) * Feat: Add support for upstream FA2 * chore: add is_falcon_derived_model: true to examples * chore: add config to readme for documentation * feat: add extra model types * fix: remove old falcon flash patch * chore: pin transformers and accelerate	2023-09-26 09:53:28 -04:00
Fernando Tarin Morales	5e5296a77c	Added quotes to the pip install -e command to fix an incompatibility with shells that do glob expansion like zsh (#632 )	2023-09-25 11:50:14 -04:00
mhenrichsen	f3d939016a	Merge pull request #629 from OpenAccess-AI-Collective/chore/-change-default-model default model changed	2023-09-25 09:32:01 +02:00
NanoCode012	cfbce020e9	Fix: Fail bf16 check when running on cpu during merge (#631 )	2023-09-25 13:48:18 +09:00
mhenrichsen	4fecbfe5e1	default model changed	2023-09-24 18:52:53 +02:00
NanoCode012	67b9888630	Feat(doc): Add eval_sample_packing to doc (#625 )	2023-09-23 13:11:27 +09:00
Maxime	923eb91304	tweak: improve base builder for smaller layers (#500 )	2023-09-22 16:17:50 -04:00
Wing Lian	a363604dcf	better handling and logging of empty sharegpt turns (#603 )	2023-09-22 16:13:42 -04:00
Wing Lian	501958bb6f	create a model card with axolotl badge (#624 )	2023-09-22 16:13:26 -04:00
Wing Lian	c25ba7939b	update README w deepspeed info (#605 )	2023-09-22 00:15:52 -04:00
NanoCode012	d5f8589021	chore(callback): Remove old peft saving code (#510 )	2023-09-22 12:31:33 +09:00
Wing Lian	03e59077a0	misc fixes to add gptq tests (#621 ) * misc fixes to add gptq tests * set bf16 needed for fa2	2023-09-21 21:52:12 -04:00
Wing Lian	97d3776ce6	split completion text to sequence_len (#616 )	2023-09-21 21:51:25 -04:00
Wing Lian	2844eb22b6	run eval on the first step to get a baseline (#617 ) * run eval on the first step to get a baseline * wandb keeps getting moved around by pre-commit ...	2023-09-21 21:51:09 -04:00
Wing Lian	e85d2eb06b	let MAX_JOBS use the default since we're not resource constrained on our self-hosted runners (#427 )	2023-09-21 20:36:30 -04:00
Wing Lian	196ff1181e	skip the gpu memory checks if the device is set to 'auto' (#609 ) * skip the gpu memory checks if the device is set to 'auto' * skip gpu mem logging if cpu too * don't worry about log_gpu_memory_usage since it calls another annotated fn * rename decorator internal	2023-09-21 15:20:31 -04:00
Wing Lian	92512c390b	ignore wandb to resolve isort headaches (#619 )	2023-09-21 11:50:09 -04:00
Maxime	2fe95cdcc1	fix distributed devices (#612 ) * fix distributed devices * Update distributed.py * Update distributed.py	2023-09-21 09:11:34 -04:00
Maxime	c1382e79b6	Create multi-node.md (#613 ) * Create multi-node.md * Update multi-node.md * Update multi-node.md	2023-09-20 22:02:16 -04:00
Maxime	5d931cc042	Only run tests when a change to python files is made (#614 ) * Update tests.yml * Update .github/workflows/tests.yml --------- Co-authored-by: Wing Lian <wing.lian@gmail.com>	2023-09-20 22:02:04 -04:00
Javier	ec0958f4f8	Update requirements.txt (#610 )	2023-09-20 08:40:49 -04:00
Wing Lian	faecff9798	support to disable exllama for gptq (#604 ) * support to disable exllama for gptq * update property instead of item * fix config key	2023-09-19 17:51:08 -04:00
bofeng huang	aa656e04bd	Delete duplicate lines (#606 )	2023-09-19 16:40:05 -04:00
Wing Lian	b53e77775b	update dockerfile to not build evoformer since it fails the build (#607 )	2023-09-19 16:28:29 -04:00
Wing Lian	674c57692d	more sane defaults for openllama 3b used for quickstarts (#602 ) * more sane defaults for openllama 3b used for quickstarts * don't use bf16 for quickstart to simplify gpu compatibility * use the updated openlm-research/open_llama_3b_v2 models	2023-09-19 09:15:10 -04:00
Wing Lian	1eebbd09c3	improve handling for empty text on the tokenization step (#502 )	2023-09-19 08:09:56 -04:00
Wing Lian	62a774140b	Fix for check with cfg and merge_lora (#600 )	2023-09-18 21:14:32 -04:00
Wing Lian	31b9e0c6e8	minor tweaks to simplify (#597 )	2023-09-18 11:45:44 -04:00
Wing Lian	6b9b229356	btlm and falcon monkey patches for flash attn (#566 )	2023-09-17 13:49:18 -04:00
Wing Lian	131afdbd89	add bf16 check (#587 )	2023-09-17 13:49:03 -04:00
NanoCode012	00dce35fb2	Feat(data): Allow loading local csv and text (#594 ) * Feat(data): Allow loading local csv and text * chore: update readme for loading data	2023-09-17 11:32:27 -04:00
Wing Lian	b15b19eb8d	gather/broadcast the max value of the packing efficiency automatically (#463 )	2023-09-17 11:08:18 -04:00
Wing Lian	ab534d75ba	don't add position_ids for evals (#591 )	2023-09-16 16:11:57 -04:00
Wing Lian	21ec195c9f	optionally configure sample packing for evals (#589 )	2023-09-16 00:09:48 -04:00
Wing Lian	62eaee7649	make phi training work with Loras (#588 ) * validation for phi loras * fix model config class check * update readme for phi training	2023-09-15 20:51:55 -04:00
Jan Philipp Harries	be75668400	set fsdp state dict (#584 ) Co-authored-by: Jan Philipp Harries <jphme@users.noreply.github.com>	2023-09-15 17:47:36 -04:00
Wing Lian	aeec7c4688	pop block_cls since it's not an actual kwarg	2023-09-15 15:54:06 -04:00
Wing Lian	360788296a	don't resize embeddings if it's already large enough (#577 ) * don't resize embeddings if it's already large enough * make sure to tie weights, even if we aren't resizing	2023-09-15 15:47:09 -04:00
Wing Lian	12a2dbbc2c	Support Sample packing for phi arch (#586 ) * phi sequence packing * sample packing fixes * fix linting * fix inference and phi e2e tests * update phi example now that sample packing works * wandb import keeps getting moved around	2023-09-15 15:46:54 -04:00
NanoCode012	3a2edc85c3	Feat(doc): Add features to doc (#583 )	2023-09-16 01:14:15 +09:00
Wing Lian	f7a22632d7	support custom field for completion from yml (#580 ) * support custom field for completion from yml * remove legacy completion check and add doc * update README docs	2023-09-15 07:48:21 -04:00
Doan Minh Phuong	1aa400721e	Fix Codellama examples (#582 ) * Fix seq_len * Update lora.yml * Update qlora.yml * Update lora.yml * Update lora.yml * Update qlora.yml	2023-09-15 04:19:13 -04:00
Wing Lian	8dcd40ac78	prevent cli functions from getting fired on import (#581 )	2023-09-15 04:03:32 -04:00
Wing Lian	a5a625f47e	update support matrix with btlm and phi (#579 )	2023-09-15 02:46:15 -04:00
Wing Lian	861cecac2a	refactor scripts/finetune.py into new cli modules (#550 ) * refactor scripts/finetune.py into new cli modules * continue to support scripts/finetune.py * update readme with updated cli commands * Update scripts/finetune.py Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> --------- Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>	2023-09-15 01:43:52 -04:00
Wing Lian	1078d3eae7	E2e passing tests (#576 ) * run e2e tests after all other checks have passed * tweak tests so they get run on PRs or push to main * change dependent action for checking * one test workflow to rule them all * no need for custom action, just use needs * whoops, python version should be a string * e2e tests can run on any available gpu	2023-09-15 01:03:49 -04:00

957 Commits