axolotl

Author	SHA1	Message	Date
NanoCode012	1178a15ede	Feat: Add qwen3 and CCE for qwen family (#2518)	2025-04-28 12:18:46 -04:00
Wing Lian	53dbf97d85	make cce default to true when using the plugin (#2562) [skip ci]	2025-04-25 17:14:26 -04:00
NanoCode012	a6d28d19b1	feat: add glm and glm4 multipack and cce (#2546) * feat: add glm and glm4 multipack * feat: add glm4 example * feat: add cce for glm	2025-04-23 10:27:51 -04:00
NanoCode012	9da730d6a4	fix(doc): cut cross entropy installation instructions broken in qmd (#2532)	2025-04-16 15:02:51 -07:00
NanoCode012	271b24cccc	feat: update cce to latest (#2521)	2025-04-15 22:17:10 -07:00
NanoCode012	a6c03217f5	feat: add llama4 CCE (#2498) * feat: add llama4 CCE * fix: update model support list doc * feat: include llama4_text	2025-04-07 17:12:28 -04:00
NanoCode012	2c34a4634e	feat: add CCE for gemma3, cohere, and cohere2 (#2443) * feat: add CCE for gemma3 and cohere1/2 * fix: change from relative import to absolute * feat: add multipack for cohere&cohere2 * chore: improve comments * fix: add gemma3_text * feat: add cohere2 example * fix: cohere forward * fix: patch for cohere2 * feat: add command r v01 qlora sample * chore: lint * feat: upgrade gemma3 and gemma2 patch to use logits_to_keep * chore: lint * fix: add deprecate_kwarg decorator * fix: add cce for gemma3 conditionalgeneration * fix: gemma3 patch to defer logits calculation * fix: patch gemma3 if given as model * fix: remove not working config * fix: update comments to clarify changes * feat(doc): add supported models to readme * fix: address difference in our cohere patch * feat: add mistral3 * feat: add gemma * feat(doc): update README to include gemma and mistral3 in supported models * fix: gemma patch * fix: import * fix: gemma patch to be standalone * fix: gemma3 warn about not support final_logit_softcapping * feat: add mllama CCE * chore: add abbireviation to doc * fix: remove unneeded gemma3 eager warning * fix: save processor if available * fix: enable save processor on merge * fix: wrong env meaning	2025-03-26 18:13:51 -04:00
xzuyn	60a11a6410	Use Latest Cut Cross Entropy (#2392) * Update __init__.py * Update README.md * Update cutcrossentropy_install.py * add test	2025-03-10 16:26:40 +07:00
NanoCode012	d883b11b6f	fix(doc): add installation for cce to docs (#2375) [skip ci] * fix(doc): add installation for cce to docs * fix: format	2025-03-05 10:00:39 -05:00
NanoCode012	2efe1b4c09	Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348) * feat(doc): organize docs, add to menu bar, fix broken formatting * feat: add link to custom integrations * feat: update readme for integrations to include citations and repo link * chore: update lm_eval info * chore: use fullname * Update docs/cli.qmd per suggestion Co-authored-by: Dan Saunders <danjsaund@gmail.com> * feat: add sweep doc * feat: add kd doc * fix: remove toc * fix: update deprecation * feat: add more info about chat_template issues * fix: heading level * fix: shell->bash code block * fix: ray link * fix(doc): heading level, header links, formatting * feat: add grpo docs * feat: add style changes * fix: wrong cli arg for lm-eval * fix: remove old run method * feat: load custom integration doc dynamically * fix: remove old cli way * fix: toc * fix: minor formatting --------- Co-authored-by: Dan Saunders <danjsaund@gmail.com>	2025-02-25 16:09:37 +07:00
NanoCode012	bd8436bc6e	feat: add cut_cross_entropy (#2091) * feat: add cut_cross_entropy * fix: add to input * fix: remove from setup.py * feat: refactor into an integration * chore: ignore lint * feat: add test for cce * fix: set max_steps for liger test * chore: Update base model following suggestion Co-authored-by: Wing Lian <wing.lian@gmail.com> * chore: update special_tokens following suggestion Co-authored-by: Wing Lian <wing.lian@gmail.com> * chore: remove with_temp_dir following comments * fix: plugins aren't loaded * chore: update quotes in error message * chore: lint * chore: lint * feat: enable FA on test * chore: refactor get_pytorch_version * fix: lock cce commit version * fix: remove subclassing UT * fix: downcast even if not using FA and config check * feat: add test to check different attentions * feat: add install to CI * chore: refactor to use parametrize for attention * fix: pytest not detecting test * feat: handle torch lower than 2.4 * fix args/kwargs to match docs * use release version cut-cross-entropy==24.11.4 * fix quotes * fix: use named params for clarity for modal builder * fix: handle install from pip * fix: test check only top level module install * fix: re-add import check * uninstall existing version if no transformers submodule in cce * more dataset fixtures into the cache --------- Co-authored-by: Wing Lian <wing.lian@gmail.com> Co-authored-by: Wing Lian <wing@axolotl.ai>	2024-12-03 08:22:22 -05:00

11 Commits