axolotl/examples/jamba
Gal Cohen (galco) 9f917245f6 feat: add jamba chat_template (#1843)
* feat: add jamba chat_template

* fix: black

* feat: jamba fsdp+qlora

---------

Co-authored-by: Gal Cohen <galc@ai21.com>
2024-08-21 13:37:17 -04:00

Jamba

  • QLoRA with DeepSpeed ZeRO-2 needs at least 2 GPUs and
    • 35GiB VRAM per GPU with minimal context length
    • 56GiB VRAM per GPU with multipack enabled
  • QLoRA with DeepSpeed ZeRO-3 needs at least 2 GPUs and 67GiB VRAM (unexpectedly more than ZeRO-2)
  • QLoRA on a single GPU needs ~51GiB VRAM
  • multipack
  • FSDP
  • 8-bit LoRA
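
As a rough pre-flight check, the VRAM figures listed above can be put in a small lookup table and compared against the hardware on hand. This is only a sketch, not part of axolotl: the setup labels, `Requirement`, `fits`, and `viable_setups` are made-up names, the ZeRO-3 figure is assumed to be per GPU (the list does not say), and the single-GPU number is approximate.

```python
from typing import NamedTuple


class Requirement(NamedTuple):
    min_gpus: int   # minimum number of GPUs listed for the setup
    vram_gib: int   # VRAM in GiB, copied from the list above


# Figures copied from the list above. The keys are hypothetical labels.
# Assumption: the 67GiB ZeRO-3 figure is per GPU; the source does not say.
REQUIREMENTS = {
    "qlora-zero2": Requirement(2, 35),            # minimal context length
    "qlora-zero2-multipack": Requirement(2, 56),  # multipack enabled
    "qlora-zero3": Requirement(2, 67),
    "qlora-single-gpu": Requirement(1, 51),       # listed as ~51GiB
}


def fits(setup: str, num_gpus: int, vram_gib_per_gpu: float) -> bool:
    """Return True if the hardware meets the listed minimum for `setup`."""
    req = REQUIREMENTS[setup]
    return num_gpus >= req.min_gpus and vram_gib_per_gpu >= req.vram_gib


def viable_setups(num_gpus: int, vram_gib_per_gpu: float) -> list[str]:
    """List every setup whose listed minimum the hardware satisfies."""
    return [name for name in REQUIREMENTS if fits(name, num_gpus, vram_gib_per_gpu)]
```

For example, `viable_setups(2, 40)` would report only the ZeRO-2 setup without multipack. The listed numbers are minimums observed by the author, so real usage may differ with context length and batch size.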