feat: add CCE for gemma3, cohere, and cohere2 (#2443)
* feat: add CCE for gemma3 and cohere1/2 * fix: change from relative import to absolute * feat: add multipack for cohere&cohere2 * chore: improve comments * fix: add gemma3_text * feat: add cohere2 example * fix: cohere forward * fix: patch for cohere2 * feat: add command r v01 qlora sample * chore: lint * feat: upgrade gemma3 and gemma2 patch to use logits_to_keep * chore: lint * fix: add deprecate_kwarg decorator * fix: add cce for gemma3 conditionalgeneration * fix: gemma3 patch to defer logits calculation * fix: patch gemma3 if given as model * fix: remove not working config * fix: update comments to clarify changes * feat(doc): add supported models to readme * fix: address difference in our cohere patch * feat: add mistral3 * feat: add gemma * feat(doc): update README to include gemma and mistral3 in supported models * fix: gemma patch * fix: import * fix: gemma patch to be standalone * fix: gemma3 warn about not support final_logit_softcapping * feat: add mllama CCE * chore: add abbireviation to doc * fix: remove unneeded gemma3 eager warning * fix: save processor if available * fix: enable save processor on merge * fix: wrong env meaning
This commit is contained in:
@@ -408,7 +408,7 @@ def test_kernel_training_integration():
|
||||
)
|
||||
|
||||
# Load model
|
||||
model, _ = load_model_and_tokenizer(cfg=cfg)
|
||||
model, _, _ = load_model_and_tokenizer(cfg=cfg)
|
||||
|
||||
# Verify correct activation function
|
||||
layer = model.model.model.layers[0]
|
||||
|
||||
Reference in New Issue
Block a user