* new evals_per_epoch and saves_per_epoch to make things cleaner * update per PR feedback
* multipack support for official mixtral implementation * fix patch to load multipack for mixtral * chore: lint
* update to latest transformers for mixtral support * pin transformers * fix typo
* mixtral multipack * use mixtral model * sample yml * calculate cu_seqlens properly * use updated flash attention setting * attn var checks * force use of flash attention 2 for packing * lint * disable future fix for now * update support table
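
Note on "calculate cu_seqlens properly": variable-length flash attention kernels take cumulative sequence lengths (cu_seqlens) marking where each packed sample starts and ends. The sketch below shows one way to derive them for a single packed row. It is not the code from this change. The function name and the per-token segment-id layout (1, 2, 3, ... per packed sample, 0 for padding) are assumptions made for illustration.

```python
from itertools import accumulate, groupby


def cu_seqlens_from_segment_ids(segment_ids):
    """Return (cu_seqlens, max_seqlen) for one packed row.

    segment_ids is a hypothetical per-token list such as
    [1, 1, 1, 2, 2, 3, 0, 0], where 0 marks padding (assumed layout).
    """
    # Length of each run of identical, non-padding segment ids.
    lengths = [
        sum(1 for _ in run)
        for seg_id, run in groupby(segment_ids)
        if seg_id != 0
    ]
    # Boundaries start at 0 and end at the total number of real tokens.
    cu_seqlens = [0] + list(accumulate(lengths))
    max_seqlen = max(lengths, default=0)
    return cu_seqlens, max_seqlen


if __name__ == "__main__":
    print(cu_seqlens_from_segment_ids([1, 1, 1, 2, 2, 3, 0, 0]))
```

In practice these boundaries would be converted to an int32 tensor before being handed to the attention kernel, so that tokens from different packed samples do not attend to each other.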