axolotl

Author	SHA1	Message	Date
Dan Saunders	f776f889a1	adding codecov reporting (#2372 ) [skip ci] * adding codecov reporting * update codecov-action to v5 * fix --------- Co-authored-by: Dan Saunders <dan@axolotl.ai>	2025-04-16 15:02:17 -07:00
NanoCode012	51267ded04	chore: update doc links (#2509 ) * chore: update doc links * fix: address pr feedback	2025-04-11 09:53:18 -04:00
Dan Saunders	113e9cd193	Autodoc generation with quartodoc (#2419 ) * quartodoc integration * quartodoc progress * deletions * Update docs/.gitignore to exclude auto-generated API documentation files * Fix * more autodoc progress * moving reference up near the top of the sidebar * fix broken link * update to reflect recent changes * pydantic models refactor + add to autodoc + fixes * fix * shrinking header sizes * fix accidental change * include quartodoc build step * update pre-commit version * update pylint * pre-commit --------- Co-authored-by: Dan Saunders <dan@axolotl.ai>	2025-03-21 12:26:47 -04:00
Wing Lian	aae4337f40	add 12.8.1 cuda to the base matrix (#2426 ) * add 12.8.1 cuda to the base matrix * use nightly * bump deepspeed and set no binary * deepspeed binary fixes hopefully * install deepspeed by itself * multiline fix * make sure ninja is installed * try with reversion of packaging/setuptools/wheel install * use license instead of license-file * try rolling back packaging and setuptools versions * comment out license for validation for now * make sure packaging version is consistent * more parity across tests and docker images for packaging/setuptools	2025-03-21 10:17:25 -04:00
SicariusSicariiStuff	85147ec430	Update README.md (#2360 ) * Update README.md wheel is needed * feat: add ninja, setuptools, packing to installation steps * fix: add missing instruction --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2025-03-17 08:39:17 -04:00
NanoCode012	8e30917440	chore(docs): remove phorm (#2378 ) [skip ci]	2025-03-05 10:00:50 -05:00
NanoCode012	2efe1b4c09	Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348 ) * feat(doc): organize docs, add to menu bar, fix broken formatting * feat: add link to custom integrations * feat: update readme for integrations to include citations and repo link * chore: update lm_eval info * chore: use fullname * Update docs/cli.qmd per suggestion Co-authored-by: Dan Saunders <danjsaund@gmail.com> * feat: add sweep doc * feat: add kd doc * fix: remove toc * fix: update deprecation * feat: add more info about chat_template issues * fix: heading level * fix: shell->bash code block * fix: ray link * fix(doc): heading level, header links, formatting * feat: add grpo docs * feat: add style changes * fix: wrong cli arg for lm-eval * fix: remove old run method * feat: load custom integration doc dynamically * fix: remove old cli way * fix: toc * fix: minor formatting --------- Co-authored-by: Dan Saunders <danjsaund@gmail.com>	2025-02-25 16:09:37 +07:00
NanoCode012	fd8cb32547	chore: remove redundant py310 from tests (#2316 )	2025-02-07 21:34:16 -05:00
Dan Saunders	6f294c3d8d	refactor README; hardcode links to quarto docs; add additional quarto doc pages (#2295 ) * refactor README; hardcode links to quarto docs; add additional quarto doc pages * updates * review comments * update --------- Co-authored-by: Dan Saunders <dan@axolotl.ai>	2025-01-30 12:49:21 -05:00
Wing Lian	8779997ba5	native support for modal cloud from CLI (#2237 ) * native support for modal cloud from CLI * do lm_eval in cloud too * Fix the sub call to lm-eval * lm_eval option to not post eval, and append not extend * cache bust when using branch, grab sha of latest image tag, update lm-eval dep * allow minimal yaml for lm eval * include modal in requirements * update link in README to include utm * pr feedback * use chat template * revision support * apply chat template as arg * add wandb name support, allow explicit a100-40gb * cloud is optional * handle accidental setting of tasks with a single task str * document the modal cloud yaml for clarity [skip ci] * cli docs * support spawn vs remote for lm-eval * Add support for additional docker commands in modal image build * cloud config shouldn't be a dir * Update README.md Co-authored-by: Charles Frye <cfrye59@gmail.com> * fix annotation args --------- Co-authored-by: Charles Frye <cfrye59@gmail.com>	2025-01-30 11:34:02 -05:00
salman	c071a530f7	removing 2.3.1 (#2294 )	2025-01-28 23:23:44 -05:00
NanoCode012	74f9782fc3	chore(doc): fix explanation on gcs creds retrieval (#2272 )	2025-01-24 10:05:58 -05:00
Wing Lian	d009ead101	fix build w pyproject to respect insalled torch version (#2168 ) * fix build w pyproject to respect insalled torch version * include in manifest * disable duplicate code check for now * move parser so it can be found * add checks for correct pytorch version so this doesn't slip by again	2024-12-10 16:25:25 -05:00
Wing Lian	34d3c8dcfb	[docs] Update README Quickstart to use CLI (#2137 ) * update quickstart for new CLI * add blurb about bleeding edge builds * missed a yaml reference * prefer lora over qlora for examples * fix commands for parity with previous instructions * consistency on pip/pip3 install * one more parity pip=>pip3 * remove extraneous options in example yaml Co-authored-by: NanoCode012 <nano@axolotl.ai> * update copy * update badges and for discord and socials in readme * Fix a few broken links * bump version to 0.6.0 for release --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2024-12-09 14:03:19 -05:00
Dan Saunders	fc973f4322	CLI Implementation with Click (#2107 ) * Initial CLI implementation with click package * Adding fetch command for pulling examples and deepspeed configs * Automating default options for CliArgs classes * Mimicking existing no config behavior * bugfix in choose_config * Updating fetch to sync instead of re-download * bugfix * isort fix * fixing yaml isort order * pre-commit fixes * simplifying argument parsing -- pass through kwargs to do_cli * make accelerate launch default for non-preprocess commands * fixing arg handling * testing None placeholder approach * removing hacky --use-gpu argument to preprocess command * Adding brief README documentation for CLI * remove (New) * Initial CLI pytest tests * progress on CLI pytest * adding inference CLI tests; cleanup * Refactor train CLI tests to remove various mocking * Major CLI test refator; adding remaining CLI codepath test coverage * pytest fixes * remove integration markers * parallelizing examples, deepspeed config downloads; rename test to match other CLI test naming * moving cli pytest due to isolation issues; cleanup * testing fixes; various minor improvements * fix * tests fix * Update tests/cli/conftest.py Co-authored-by: Wing Lian <wing.lian@gmail.com> --------- Co-authored-by: Dan Saunders <dan@axolotl.ai> Co-authored-by: Wing Lian <wing.lian@gmail.com>	2024-12-05 22:11:48 -05:00
Wing Lian	4baf8e5e96	cleanup the readme, add Modal as sponsor (#2130 ) [skip ci]	2024-12-05 21:19:52 -05:00
NanoCode012	81ef3e45f7	fix(readme): update cuda instructions during preprocess (#2114 ) [skip ci]	2024-12-03 08:58:03 -05:00
Wing Lian	c07bd2fa65	Readme updates v2 (#2078 ) * update readme logos * use full logo * Fix svgs * add srcset * resize svgs to match * Rename file * align badges center	2024-11-18 14:58:03 -05:00
Wing Lian	ed079d434a	static assets, readme, and badges update v1 (#2077 )	2024-11-18 13:59:32 -05:00
Wing Lian	234e94e9dd	replace references to personal docker hub to org docker hub (#2036 ) [skip ci]	2024-11-11 15:09:29 -05:00
Wing Lian	fd3b80716a	remove fastchat and sharegpt (#2021 ) * remove fastchat and sharegpt * remove imports * remove more fastchat imports * chore: remove unused functions * feat: remove sharegpt and deprecate from docs * chore: remove unused sharegpt checks * fix: remove sharegpt type from tests * feat: add sharegpt deprecation error * feat: update readme --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2024-11-08 13:45:49 -05:00
Wing Lian	02ce520b7e	upgrade liger to 0.4.0 (#1973 ) * upgrade liger to 0.3.1 * update docs and example * skip duplicate code check * Update src/axolotl/integrations/liger/args.py Co-authored-by: NanoCode012 <nano@axolotl.ai> * Update README.md Co-authored-by: NanoCode012 <nano@axolotl.ai> * add logging * chore: lint * add test case * upgrade liger and transformers * also upgrade accelerate * use kwargs to support patch release * make sure prepared path is empty for test * use transfromers 4.46.1 since 4.46.2 breaks fsdp --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2024-11-07 12:53:34 -05:00
Oliver Kunc	107b67b852	Hardware requirements (#1997 ) [skip ci] * Hardware requirements https://github.com/axolotl-ai-cloud/axolotl/issues/1992 * Update README.md --------- Co-authored-by: Wing Lian <wing.lian@gmail.com>	2024-10-29 10:13:50 -04:00
NanoCode012	bfc77b0f36	Feat: Add support for tokenizer’s or custom jinja chat_template (#1970 ) * Allow using tokenizer's default chat template with fallbacks Summary of changes: 1. Adds `tokenizer_default` as option for `chat_template` in `chat_template` prompt strategy that allows using the chat template from tokenizer's config.json 2. Allows falling back to chat templates available in axolotl if tokenizer does not have a chat template 3. Adds a mistral chat template which supports system message - taken from https://github.com/chujiezheng/chat_templates/blob/main/chat_templates/mistral-instruct.jinja --- Why? Many popular models are not trained with chatml format. As a result for the model to correctly learn chatml we have to turn on train_on_inputs which requires more compute and time. If we can use the model's already learned chat template we can just learn the output tokens --- Todo: - Write tests * Add tests * Fix lint and bug post merge from main * Add option `chat_template_jinja` to provide a jinja template * remove custom mistral template * Address review comments and add docs * Update docs/dataset-formats/conversation.qmd Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * fix: set default to tokenizer template * Merge branch 'main' into cj_tokenizer_default_prompt_template * chore: remove redundant function * fix: re-arrange enum declaration position * fix: refactor artifact left from main merge * feat(doc): updated config with chat template options and clarified examples * chore: clarify doc * chore: added example for non-default template * chore: refactor * fix: test * fix: config being dropped and unittest to catch that * chore: lint * chore: skip duplicate * fix: rename var after merge * feat: add test for levy's dpo case * fix: remove default setting on edge case where chat template overriden in dataset section * feat: handle sharegpt deprecation better in docs * feat: add example using fallback * feat: handles chat_template requiring specific user/assistant order * fix: update test based on new defaults * fix: imported name incorrectly updated on merge * chore: lint * fix: update dummy message to prevent potential overlap with real content * fix(doc): formatting * fix: update bradleyterry to use new chat_template --------- Co-authored-by: Chirag Jain <jain.chirag925@gmail.com>	2024-10-29 10:14:51 +07:00
Wing Lian	76883851d2	add warning that sharegpt will be deprecated (#1957 ) * add warning that sharegpt will be deprecated * add helper script for chat_templates and document deprecation * Update src/axolotl/prompt_strategies/sharegpt.py Co-authored-by: NanoCode012 <nano@axolotl.ai> --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2024-10-11 13:33:20 -04:00
Boris Feld	6d3caadf90	Comet integration (#1939 ) * Add first version of a Comet integration * Remove debug prints * Add test for Comet Configuration transformation to env variables * Fix last lint warning * Update Readme for Comet logging documentation * Update Comet integration to be optional, update code and tests * Add documentation for Comet configuration * Add missing check	2024-10-09 16:03:37 -04:00
Byron Hsu	e3a38450de	Add liger kernel to features (#1881 ) [skip ci]	2024-08-29 08:19:18 -04:00
Wing Lian	810ecd4e81	add liger to readme (#1865 ) * add liger to readme * updates from PR feedback	2024-08-23 14:34:03 -04:00
Wing Lian	b33dc07a77	rename nightly test and add badge (#1853 )	2024-08-22 13:13:33 -04:00
Wing Lian	dcbff16983	run nightly ci builds against upstream main (#1851 ) * run nightly ci builds against upstream main * add test badges * run the multigpu tests against nightly main builds too	2024-08-22 13:10:54 -04:00
Gal Cohen (galco)	957c956f89	rename jamba example (#1846 ) [skip ci] * rename jamba example * feat: change readme --------- Co-authored-by: Gal Cohen <galc@ai21.com>	2024-08-22 09:22:55 -04:00
mhenrichsen	3bc8e64557	Update README.md (#1792 )	2024-07-30 07:59:53 +02:00
Wing Lian	7830fe04b5	Unsloth rope (#1767 ) * Add unsloth rope embeddings support * support for models weights in 4bit and do some memory gc * use accelerate logger * add unsloth llama rms norm optims * update docs for unsloth * more docs info	2024-07-18 14:54:41 -04:00
David Meikle	634f384e06	Changed URL for dataset docs (#1744 )	2024-07-13 14:34:28 -04:00
mhenrichsen	1194c2e0b1	github urls (#1734 ) Co-authored-by: Henrichsen, Mads (ext) <mads.henrichsen.ext@siemens-energy.com>	2024-07-11 09:19:29 -04:00
Saeed Esmaili	5cde06587a	Fix the broken link in README (#1678 ) [skip ci]	2024-06-03 09:38:44 -04:00
Abe Voelker	49b967b62f	Fix README quick start example usage model dirs (#1668 )	2024-05-28 18:10:40 -04:00
Wing Lian	3319780300	update torch 2.2.1 -> 2.2.2 (#1622 )	2024-05-15 09:45:27 -04:00
Chansung Park	5d97e65f95	add dstack section (#1612 ) [skip ci] * add dstack section * chore: lint --------- Co-authored-by: Wing Lian <wing.lian@gmail.com>	2024-05-14 08:13:45 -04:00
Atlas	bcaa92325d	Update Readme to include support for Mixtral8X22B (#1518 ) [skip ci]	2024-04-17 01:15:30 -04:00
YTING	7d9bafcb88	Update README.md (#1521 ) [skip ci]	2024-04-17 01:15:05 -04:00
Wing Lian	e07dcb288c	add docs around pre-processing (#1529 )	2024-04-16 19:45:46 -04:00
NanoCode012	c2b64e4dcf	Feat: update doc (#1475 ) [skip ci] * feat: update doc contents * chore: move batch vs ga docs * feat: update lambdalabs instructions * fix: refactor dev instructions	2024-04-04 13:43:40 +09:00
James Melvin Ebenezer	cae608f587	Added pip install ninja to accelerate installation of flash-attn (#1461 ) * Added pip install ninja to accelerate installation of flash-attn * doc: cleanup	2024-04-02 17:36:41 +09:00
Hamel Husain	86b7d22f35	Reorganize Docs (#1468 )	2024-04-01 08:00:52 -07:00
Phuc Van Phan	324d59ea0d	docs: update link to docs of advance topic in README.md (#1437 )	2024-03-24 21:49:27 -07:00
NanoCode012	f1ebaa07c6	chore(config): refactor old mistral config (#1435 ) * chore(config): refactor old mistral config * chore: add link to colab on readme	2024-03-25 12:00:44 +09:00
Hamel Husain	629450cecd	Bootstrap Hosted Axolotl Docs w/Quarto (#1429 ) * precommit * mv styes.css * fix links	2024-03-21 22:28:36 -07:00
Wing Lian	dd449c5cd8	support galore once upstreamed into transformers (#1409 ) * support galore once upstreamed into transformers * update module name for llama in readme and fix typing for all linear * bump trl for deprecation fixes from newer transformers * include galore as an extra and install in docker image * fix optim_args type * fix optim_args * update dependencies for galore * add galore to cicd dockerfile	2024-03-19 09:26:35 -04:00
NanoCode012	40a88e8c4a	Feat: Add sharegpt multirole (#1137 ) * feat(prompt): support multiple roles for sharegpt * fix: add handling of empty role back * feat: rebased and allowed more dynamic roles via config * fix: variable * chore: update message * feat: add vicuna format * fix: JSON serializable error * fix: typing * fix: don't remap for unknown keys * fix: add roles to pydantic * feat: add test * chore: remove leftover print * chore: remove leftover comment * chore: remove print * fix: update test to use chatml	2024-03-19 20:51:49 +09:00

340 Commits