* add check for zero3
* freeze parameters
* fixes for deepspeed loading
* fix model parameter check
* unfrozen parameters in example mixtral and logging when unfreezing
* mixtral multipack
* use mixtral model
* sample yml
* calculate cu_seqlens properly
* use updated flash ettention setting
* attn var checks
* force use of flash attention 2 for packing
* lint
* disable future fix for now
* update support table