* Extend MultiPackBatchSampler test to cover a shorter sequence length and the drop-long-sequences filter
* Fix get_dataset_lengths for datasets that were previously filtered (e.g., with drop_long_seq_in_dataset)
* Update src/axolotl/utils/samplers/utils.py
Fix get_dataset_lengths for datasets that do not have position_ids or length attributes
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
---------
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>