Wing Lian
|
a85efffbef
|
bump transformers==4.52.4 (#2800) [skip ci]
* bump transformers==4.52.4
* don't use hf offline for qwen tokenizer
* increase timeout
* don't use methodtype
* increase timeout
* better assertion logging
* upgrade deepspeed version too
|
2025-06-18 15:46:14 -04:00 |
|
Dan Saunders
|
1d91d905c9
|
remove deprecated wandb env var (#2751)
* remove deprecated wandb env var
* remove os.environ wandb setting; unused loggers
|
2025-06-03 14:04:15 -07:00 |
|
salman
|
65c5481120
|
Rank 0-only logging (#2608)
Co-authored-by: Wing Lian <wing@axolotl.ai>
|
2025-05-28 14:57:30 +01:00 |
|
Wing Lian
|
baeb00231b
|
Handle other reasoning trace dataset formats (#2591)
* Handle other reasoning trace dataset formats
* rename var to improve readability
* chore: refactor with comments
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
|
2025-04-30 03:32:55 -04:00 |
|
Wing Lian
|
2d77165dc0
|
automatically split out reasoning trace from dataset (#2579)
* automatically split out reasoning trace from dataset
* chore: lint
* fix import
|
2025-04-28 18:23:03 -04:00 |
|