Files

Wing Lian 9f824ef76a simplify the example configs to be more minimal and less daunting (#2486 ) [skip ci]

* simplify the example configs to be more minimal and less daunting

* drop empty s2_attention from example yamls

2025-04-04 13:47:26 -04:00

qlora_deepspeed.yaml

2025-04-04 13:47:26 -04:00

qlora_fsdp_large.yaml

2025-04-04 13:47:26 -04:00

qlora.yaml

2025-04-04 13:47:26 -04:00

README.md

2024-08-22 09:22:55 -04:00

Jamba

✅ qlora w/ deepspeed Zero-2 needs at least 2x GPUs and
- 35GiB VRAM per GPU w minimal context length
- 56GiB VRAM per GPU (w multipack enabled)
✅ qlora w/ deepspeed Zero-3 needs at least 2x GPUs and 67GiB VRAM (wtf?)
✅ qlora single-gpu, ~51GiB VRAM
✅ multipack
✅ FSDP
❓ 8-bit LoRA