feat: add CCE for gemma3, cohere, and cohere2 (#2443)

* feat: add CCE for gemma3 and cohere1/2

* fix: change from relative import to absolute

* feat: add multipack for cohere and cohere2

* chore: improve comments

* fix: add gemma3_text

* feat: add cohere2 example

* fix: cohere forward

* fix: patch for cohere2

* feat: add command r v01 qlora sample

* chore: lint

* feat: upgrade gemma3 and gemma2 patch to use logits_to_keep

* chore: lint

* fix: add deprecate_kwarg decorator

* fix: add cce for gemma3 conditionalgeneration

* fix: gemma3 patch to defer logits calculation

* fix: patch gemma3 if given as model

* fix: remove non-working config

* fix: update comments to clarify changes

* feat(doc): add supported models to readme

* fix: address difference in our cohere patch

* feat: add mistral3

* feat: add gemma

* feat(doc): update README to include gemma and mistral3 in supported models

* fix: gemma patch

* fix: import

* fix: gemma patch to be standalone

* fix: gemma3 warn that final_logit_softcapping is not supported

* feat: add mllama CCE

* chore: add abbreviation to doc

* fix: remove unneeded gemma3 eager warning

* fix: save processor if available

* fix: enable save processor on merge

* fix: wrong env meaning
Author: NanoCode012
Date: 2025-03-27 05:13:51 +07:00
Committed by GitHub
Parent: a9b0733f2c
Commit: 2c34a4634e
16 changed files with 1826 additions and 15 deletions

@@ -23,6 +23,8 @@ SUPPORTED_MULTIPACK_MODEL_TYPES = [
     "gemma",
     "gemma2",
     "gemma3_text",
+    "cohere",
+    "cohere2",
     "gemmoe",
     "starcoder2",
     "deepseek_v2",
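
For context, a list like the one in this hunk is typically consulted before sample packing is enabled for a model. The sketch below is illustrative only: `supports_multipack` is a hypothetical helper, not a function from this repository, and the list is cut down to the entries visible in the hunk.

```python
# Only the entries visible in the hunk above; the real list has more model types.
SUPPORTED_MULTIPACK_MODEL_TYPES = [
    "gemma",
    "gemma2",
    "gemma3_text",
    "cohere",
    "cohere2",
    "gemmoe",
    "starcoder2",
    "deepseek_v2",
]


def supports_multipack(model_type: str) -> bool:
    """Hypothetical helper: report whether a model type appears in the list."""
    return model_type in SUPPORTED_MULTIPACK_MODEL_TYPES


if __name__ == "__main__":
    # Check the two model types this commit adds.
    for name in ("cohere", "cohere2"):
        print(name, supports_multipack(name))
```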