* Add CausalLMBenchEvalCallback for measuring seq2seq performance
* Fix code for pre-commit
* Fix typing and improve logging
* eval_sample_packing must be false with CausalLMBenchEvalCallback
* set fp16 to false if bf16, update bf16: auto in example YAMLs
* unset fp16 so that it fallsback properly if bf16 isn't available
* Update README.md [skip-ci]
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* test that bf16 disables fp16
---------
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* add check for zero3
* freeze parameters
* fixes for deepspeed loading
* fix model parameter check
* unfrozen parameters in example mixtral and logging when unfreezing
* mixtral multipack
* use mixtral model
* sample yml
* calculate cu_seqlens properly
* use updated flash ettention setting
* attn var checks
* force use of flash attention 2 for packing
* lint
* disable future fix for now
* update support table