* make pad_to_sequence_len default to the same value as sample_packing
* remove duplicate validation
* fix test
* update description meta
Co-authored-by: NanoCode012 <nano@axolotl.ai>
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* checkpoint model on first step callback
* remove debug
* add test cases; update existing tests not to save on first step
* move test out of solo
* delete
* default to False
* typo
* feat: add fsdp config for magistral
* fix: add mllama self attention handling for lora kernels
* fix: no eval if val_set_size 0 despite having test_datasets
* fix: add note for cce for vlm in newer model