axolotl

Author	SHA1	Message	Date
NanoCode012	1969fa3bf0	fix(readme): update cuda instructions during preprocess (#2114 ) [skip ci]	2024-12-04 12:33:29 -05:00
Wing Lian	c07bd2fa65	Readme updates v2 (#2078 ) * update readme logos * use full logo * Fix svgs * add srcset * resize svgs to match * Rename file * align badges center	2024-11-18 14:58:03 -05:00
Wing Lian	ed079d434a	static assets, readme, and badges update v1 (#2077 )	2024-11-18 13:59:32 -05:00
Wing Lian	234e94e9dd	replace references to personal docker hub to org docker hub (#2036 ) [skip ci]	2024-11-11 15:09:29 -05:00
Wing Lian	fd3b80716a	remove fastchat and sharegpt (#2021 ) * remove fastchat and sharegpt * remove imports * remove more fastchat imports * chore: remove unused functions * feat: remove sharegpt and deprecate from docs * chore: remove unused sharegpt checks * fix: remove sharegpt type from tests * feat: add sharegpt deprecation error * feat: update readme --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2024-11-08 13:45:49 -05:00
Wing Lian	02ce520b7e	upgrade liger to 0.4.0 (#1973 ) * upgrade liger to 0.3.1 * update docs and example * skip duplicate code check * Update src/axolotl/integrations/liger/args.py Co-authored-by: NanoCode012 <nano@axolotl.ai> * Update README.md Co-authored-by: NanoCode012 <nano@axolotl.ai> * add logging * chore: lint * add test case * upgrade liger and transformers * also upgrade accelerate * use kwargs to support patch release * make sure prepared path is empty for test * use transfromers 4.46.1 since 4.46.2 breaks fsdp --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2024-11-07 12:53:34 -05:00
Oliver Kunc	107b67b852	Hardware requirements (#1997 ) [skip ci] * Hardware requirements https://github.com/axolotl-ai-cloud/axolotl/issues/1992 * Update README.md --------- Co-authored-by: Wing Lian <wing.lian@gmail.com>	2024-10-29 10:13:50 -04:00
NanoCode012	bfc77b0f36	Feat: Add support for tokenizer’s or custom jinja chat_template (#1970 ) * Allow using tokenizer's default chat template with fallbacks Summary of changes: 1. Adds `tokenizer_default` as option for `chat_template` in `chat_template` prompt strategy that allows using the chat template from tokenizer's config.json 2. Allows falling back to chat templates available in axolotl if tokenizer does not have a chat template 3. Adds a mistral chat template which supports system message - taken from https://github.com/chujiezheng/chat_templates/blob/main/chat_templates/mistral-instruct.jinja --- Why? Many popular models are not trained with chatml format. As a result for the model to correctly learn chatml we have to turn on train_on_inputs which requires more compute and time. If we can use the model's already learned chat template we can just learn the output tokens --- Todo: - Write tests * Add tests * Fix lint and bug post merge from main * Add option `chat_template_jinja` to provide a jinja template * remove custom mistral template * Address review comments and add docs * Update docs/dataset-formats/conversation.qmd Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * fix: set default to tokenizer template * Merge branch 'main' into cj_tokenizer_default_prompt_template * chore: remove redundant function * fix: re-arrange enum declaration position * fix: refactor artifact left from main merge * feat(doc): updated config with chat template options and clarified examples * chore: clarify doc * chore: added example for non-default template * chore: refactor * fix: test * fix: config being dropped and unittest to catch that * chore: lint * chore: skip duplicate * fix: rename var after merge * feat: add test for levy's dpo case * fix: remove default setting on edge case where chat template overriden in dataset section * feat: handle sharegpt deprecation better in docs * feat: add example using fallback * feat: handles chat_template requiring specific user/assistant order * fix: update test based on new defaults * fix: imported name incorrectly updated on merge * chore: lint * fix: update dummy message to prevent potential overlap with real content * fix(doc): formatting * fix: update bradleyterry to use new chat_template --------- Co-authored-by: Chirag Jain <jain.chirag925@gmail.com>	2024-10-29 10:14:51 +07:00
Wing Lian	76883851d2	add warning that sharegpt will be deprecated (#1957 ) * add warning that sharegpt will be deprecated * add helper script for chat_templates and document deprecation * Update src/axolotl/prompt_strategies/sharegpt.py Co-authored-by: NanoCode012 <nano@axolotl.ai> --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2024-10-11 13:33:20 -04:00
Boris Feld	6d3caadf90	Comet integration (#1939 ) * Add first version of a Comet integration * Remove debug prints * Add test for Comet Configuration transformation to env variables * Fix last lint warning * Update Readme for Comet logging documentation * Update Comet integration to be optional, update code and tests * Add documentation for Comet configuration * Add missing check	2024-10-09 16:03:37 -04:00
Byron Hsu	e3a38450de	Add liger kernel to features (#1881 ) [skip ci]	2024-08-29 08:19:18 -04:00
Wing Lian	810ecd4e81	add liger to readme (#1865 ) * add liger to readme * updates from PR feedback	2024-08-23 14:34:03 -04:00
Wing Lian	b33dc07a77	rename nightly test and add badge (#1853 )	2024-08-22 13:13:33 -04:00
Wing Lian	dcbff16983	run nightly ci builds against upstream main (#1851 ) * run nightly ci builds against upstream main * add test badges * run the multigpu tests against nightly main builds too	2024-08-22 13:10:54 -04:00
Gal Cohen (galco)	957c956f89	rename jamba example (#1846 ) [skip ci] * rename jamba example * feat: change readme --------- Co-authored-by: Gal Cohen <galc@ai21.com>	2024-08-22 09:22:55 -04:00
mhenrichsen	3bc8e64557	Update README.md (#1792 )	2024-07-30 07:59:53 +02:00
Wing Lian	7830fe04b5	Unsloth rope (#1767 ) * Add unsloth rope embeddings support * support for models weights in 4bit and do some memory gc * use accelerate logger * add unsloth llama rms norm optims * update docs for unsloth * more docs info	2024-07-18 14:54:41 -04:00
David Meikle	634f384e06	Changed URL for dataset docs (#1744 )	2024-07-13 14:34:28 -04:00
mhenrichsen	1194c2e0b1	github urls (#1734 ) Co-authored-by: Henrichsen, Mads (ext) <mads.henrichsen.ext@siemens-energy.com>	2024-07-11 09:19:29 -04:00
Saeed Esmaili	5cde06587a	Fix the broken link in README (#1678 ) [skip ci]	2024-06-03 09:38:44 -04:00
Abe Voelker	49b967b62f	Fix README quick start example usage model dirs (#1668 )	2024-05-28 18:10:40 -04:00
Wing Lian	3319780300	update torch 2.2.1 -> 2.2.2 (#1622 )	2024-05-15 09:45:27 -04:00
Chansung Park	5d97e65f95	add dstack section (#1612 ) [skip ci] * add dstack section * chore: lint --------- Co-authored-by: Wing Lian <wing.lian@gmail.com>	2024-05-14 08:13:45 -04:00
Atlas	bcaa92325d	Update Readme to include support for Mixtral8X22B (#1518 ) [skip ci]	2024-04-17 01:15:30 -04:00
YTING	7d9bafcb88	Update README.md (#1521 ) [skip ci]	2024-04-17 01:15:05 -04:00
Wing Lian	e07dcb288c	add docs around pre-processing (#1529 )	2024-04-16 19:45:46 -04:00
NanoCode012	c2b64e4dcf	Feat: update doc (#1475 ) [skip ci] * feat: update doc contents * chore: move batch vs ga docs * feat: update lambdalabs instructions * fix: refactor dev instructions	2024-04-04 13:43:40 +09:00
James Melvin Ebenezer	cae608f587	Added pip install ninja to accelerate installation of flash-attn (#1461 ) * Added pip install ninja to accelerate installation of flash-attn * doc: cleanup	2024-04-02 17:36:41 +09:00
Hamel Husain	86b7d22f35	Reorganize Docs (#1468 )	2024-04-01 08:00:52 -07:00
Phuc Van Phan	324d59ea0d	docs: update link to docs of advance topic in README.md (#1437 )	2024-03-24 21:49:27 -07:00
NanoCode012	f1ebaa07c6	chore(config): refactor old mistral config (#1435 ) * chore(config): refactor old mistral config * chore: add link to colab on readme	2024-03-25 12:00:44 +09:00
Hamel Husain	629450cecd	Bootstrap Hosted Axolotl Docs w/Quarto (#1429 ) * precommit * mv styes.css * fix links	2024-03-21 22:28:36 -07:00
Wing Lian	dd449c5cd8	support galore once upstreamed into transformers (#1409 ) * support galore once upstreamed into transformers * update module name for llama in readme and fix typing for all linear * bump trl for deprecation fixes from newer transformers * include galore as an extra and install in docker image * fix optim_args type * fix optim_args * update dependencies for galore * add galore to cicd dockerfile	2024-03-19 09:26:35 -04:00
NanoCode012	40a88e8c4a	Feat: Add sharegpt multirole (#1137 ) * feat(prompt): support multiple roles for sharegpt * fix: add handling of empty role back * feat: rebased and allowed more dynamic roles via config * fix: variable * chore: update message * feat: add vicuna format * fix: JSON serializable error * fix: typing * fix: don't remap for unknown keys * fix: add roles to pydantic * feat: add test * chore: remove leftover print * chore: remove leftover comment * chore: remove print * fix: update test to use chatml	2024-03-19 20:51:49 +09:00
Seungduk Kim	43bdc5d3de	Add a config not to shuffle merged dataset (#1394 ) [skip ci] * Add a config not to shuffle merged dataset * Update README.md * Update src/axolotl/utils/config/models/input/v0_4_1/__init__.py Co-authored-by: Wing Lian <wing.lian@gmail.com> * invert the condition name * update README * info -> debug --------- Co-authored-by: Wing Lian <wing.lian@gmail.com>	2024-03-19 20:51:00 +09:00
NanoCode012	b1e3e1b25f	fix(config): passing gradient_checkpoint_kwargs (#1412 ) * fix(config): change default use_reentrant to true * Update trainer_builder.py * fix: make sure to pass kwargs to enable checkpoint * chore: lint	2024-03-19 12:57:43 +09:00
jbl	e8c8ea64b3	Update README.md (#1418 ) Add Phorm AI Badge	2024-03-17 23:47:46 -04:00
NanoCode012	f083aed2c7	Fix(readme): Improve README QuickStart info (#1408 ) * Fix(readme): Improve README QuickStart info * chore: add to toc	2024-03-16 21:10:22 +09:00
NanoCode012	868c33954d	Feat(readme): Add instructions for Google GPU VM instances (#1410 )	2024-03-16 21:10:05 +09:00
Hamel Husain	8b12468230	Add QLoRA + FSDP Docs (#1403 ) * pre commit * Update fsdp_qlora.md	2024-03-14 11:04:51 -04:00
Wing Lian	638c2dafb5	JarvisLabs (#1372 ) * add Jarvis cloud gpu and sponsorship * whitespace	2024-03-07 10:47:32 -05:00
Hamel Husain	ed70a08348	add docs for `input_output` format (#1367 ) [skip ci] * add docs * add docs * run linter	2024-03-06 09:09:49 -05:00
Nicolas Rojas	37657473c8	Remove unsupported python version 3.9 from README (#1364 ) [skip ci]	2024-03-05 21:19:36 -05:00
Wing Lian	6b3b271925	fix for protected model_ namespace w pydantic (#1345 )	2024-02-28 15:07:49 -05:00
Maxime	0f6af36d50	Mps mistral lora (#1292 ) [skip ci] * Lora example for Mistral on MPS backend * Add some MPS documentation * Update examples/mistral/lora-mps.yml Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * Update examples/mistral/lora-mps.yml Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * Update README.md --------- Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> Co-authored-by: Wing Lian <wing.lian@gmail.com>	2024-02-26 22:39:57 -05:00
JohanWork	d75653407c	ADD: push checkpoints to mlflow artifact registry (#1295 ) [skip ci] * Add checkpoint logging to mlflow artifact registry * clean up * Update README.md Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * update pydantic config from rebase --------- Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> Co-authored-by: Wing Lian <wing.lian@gmail.com>	2024-02-26 13:32:39 -05:00
NanoCode012	c6b01e0f4a	chore: update readme to be more clear (#1326 ) [skip ci]	2024-02-26 13:32:13 -05:00
Wing Lian	cc3cebfa70	Pydantic 2.x cfg (#1239 ) * WIP conversion to use pydantic for config validation * wip, more fields, add capabilities * wip * update pydantic validation to match existing tests * tweak requirements * setup deprecated paams pydantic model * more validations * wrap up rest of the validations * flesh out the rest of the options from the readme into pydantic * fix model validators as class methods remember to return in validator missing return add missing relora attributes fix test for DictDefault change fix sys template for mistral from fastchat change in PR 2872 fix test for batch size warning * more missing attributes for cfg * updates from PR feedback * fix validation for datasets and pretrain datasets * fix test for lora check	2024-02-26 12:24:14 -05:00
NanoCode012	2ed52bd568	fix(readme): Clarify doc for tokenizer_config (#1323 ) [skip ci]	2024-02-24 21:55:04 +09:00
NanoCode012	3d2cd804ae	fix(readme): update inference md link (#1311 ) [skip ci]	2024-02-22 02:48:06 +09:00

324 Commits