axolotl/src/axolotl
Tobias 8dfadc2b3c Fix sample packing producing longer sequences than specified by sequence_len (#2332)
* Extend MultiPackBatchSampler test to include shorter sequence length and drop long sequences filter

* Fix get_dataset_lengths for datasets that were previously filtered (e.g., with drop_long_seq_in_dataset)

* Update src/axolotl/utils/samplers/utils.py

Fix get_dataset_lengths for datasets that do not have position_ids or length attributes

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

---------

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2025-02-19 12:02:35 +07:00
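
The commit message above says get_dataset_lengths now also handles datasets that have neither a length nor a position_ids attribute. The following is only a minimal sketch of that kind of fallback, not the code from the commit: the function name sketch_dataset_lengths, the plain dict-of-columns input, the order in which the columns are checked, and the use of input_ids as the last resort are all assumptions made for illustration.

```python
from typing import List, Mapping, Sequence


def sketch_dataset_lengths(columns: Mapping[str, Sequence]) -> List[int]:
    """Hypothetical stand-in for get_dataset_lengths (illustration only).

    `columns` maps a column name to one entry per sample, mimicking a
    tokenized dataset without depending on any dataset library.
    """
    if "length" in columns:
        # A precomputed per-sample length column is used directly.
        return [int(n) for n in columns["length"]]
    if "position_ids" in columns:
        # Assumption: each sample's position_ids count up from 0 and are
        # non-empty, so the last id plus one equals the token count.
        return [int(ids[-1]) + 1 for ids in columns["position_ids"]]
    # Assumed fallback when neither column exists: count the tokens.
    return [len(ids) for ids in columns["input_ids"]]


if __name__ == "__main__":
    only_input_ids = {"input_ids": [[1, 2, 3], [4, 5]]}
    print(sketch_dataset_lengths(only_input_ids))  # [3, 2]
```

Because the lengths are derived from the samples that are actually passed in, a dataset that was filtered beforehand yields one length per remaining sample. Whether the real function achieves this the same way is not stated in the commit message.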