Dan Saunders
79ddaebe9a
Add ruff, remove black, isort, flake8, pylint (#3092)
* black, isort, flake8 -> ruff
* remove unused
* add back needed import
* fix
2025-08-23 23:37:33 -04:00
Dan Saunders
10ba1622f7
checkpoint model on first step callback (#2906)
* checkpoint model on first step callback
* remove debug
* add test cases; update existing tests not to save on first step
* move test out of solo
* delete
* default to False
* typo
2025-07-15 15:00:48 -04:00
Dan Saunders
00cda8cc70
Data loader refactor (#2707)
* data loading refactor (wip)
* updates
* progress
* pytest
* pytest fix
* lint
* zero_first -> filelock, more simplifications
* small simplification
* import change
* nit
* lint
* simplify dedup
* couldn't resist
* review comments WIP
* continued wip
* minor changes
* fix; remove contrived test
* further refactor
* set default seed in pydantic config
* lint
* continued simplification
* lint
* renaming and nits
* filelock tests
* fix
* fix
* lint
* remove nullable arg
* remove unnecessary code
* moving dataset save fn to shared module
* remove debug print
* matching var naming
* fn name change
* coderabbit comments
* naming nit
* fix test
2025-06-10 19:53:07 -04:00
Dan Saunders
1d91d905c9
remove deprecated wandb env var (#2751)
* remove deprecated wandb env var
* remove os.environ wandb setting; unused loggers
2025-06-03 14:04:15 -07:00
salman
65c5481120
Rank 0-only logging (#2608)
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-28 14:57:30 +01:00
Wing Lian
40f4ea23ab
replace references to random 68m model w 135m smollm2 (#2570) [skip ci]
* replace references to random 68m model w 135m smollm2
* use AutoTokenizer for smollm2
2025-04-28 10:08:07 -04:00
Wing Lian
de8a625dd7
make e2e tests a bit faster by reducing test split size (#2522) [skip ci]
* [ci] make e2e tests a bit faster by reducing test split size
* use 10% split of alpaca dataset to speed up dataset loading/tokenization
* reduce gas 4->2 for most e2e tests
* increase val set size for packing
2025-04-12 07:24:43 -07:00
NanoCode012
9f00465a5c
Feat: Add support for gemma3_text and add e2e for gemma2 (#2406)
2025-03-22 20:33:21 -04:00
xzuyn
0134093acc
Add REX LR Scheduler (#2380)
* Update trainer_builder.py
* Update base.py
* Update __init__.py
* Update base.py
* Update base.py
* Update config.qmd
* Update base.py
* Update base.py
* Update base.py
* Update base.py
* Update base.py
* Update base.py
* Update base.py
* lint
* lint
* lint
* lint
* lint
* lint
* Update base.py
* Update base.py
* lint
* Update base.py
* Update base.py
* Move RexLR to `schedulers.py`
* Remove RexLR from `base.py`
* Fix tooltip formatting
* lint
* Create test_schedulers.py
* Use a default optimizer in test
* lint
* lint
* Add `warmup_steps` and `cosine_min_lr_ratio` to test
* lint
2025-03-05 10:26:11 -05:00