NanoCode012
2c34a4634e
feat: add CCE for gemma3, cohere, and cohere2 (#2443)
* feat: add CCE for gemma3 and cohere1/2
* fix: change from relative import to absolute
* feat: add multipack for cohere&cohere2
* chore: improve comments
* fix: add gemma3_text
* feat: add cohere2 example
* fix: cohere forward
* fix: patch for cohere2
* feat: add command r v01 qlora sample
* chore: lint
* feat: upgrade gemma3 and gemma2 patch to use logits_to_keep
* chore: lint
* fix: add deprecate_kwarg decorator
* fix: add cce for gemma3 conditionalgeneration
* fix: gemma3 patch to defer logits calculation
* fix: patch gemma3 if given as model
* fix: remove not working config
* fix: update comments to clarify changes
* feat(doc): add supported models to readme
* fix: address difference in our cohere patch
* feat: add mistral3
* feat: add gemma
* feat(doc): update README to include gemma and mistral3 in supported models
* fix: gemma patch
* fix: import
* fix: gemma patch to be standalone
* fix: gemma3 warn about unsupported final_logit_softcapping
* feat: add mllama CCE
* chore: add abbreviation to doc
* fix: remove unneeded gemma3 eager warning
* fix: save processor if available
* fix: enable save processor on merge
* fix: wrong env meaning
2025-03-26 18:13:51 -04:00