salman
58d67bf98d
Migrate QAT API; fix axolotl quantize for QAT-ed models; add NVFP4 (#3107)
2025-09-12 10:55:50 +01:00
Dan Saunders
79ddaebe9a
Add ruff, remove black, isort, flake8, pylint (#3092)
* black, isort, flake8 -> ruff
* remove unused
* add back needed import
* fix
2025-08-23 23:37:33 -04:00
Dan Saunders
10ba1622f7
checkpoint model on first step callback (#2906)
* checkpoint model on first step callback
* remove debug
* add test cases; update existing tests not to save on first step
* move test out of solo
* delete
* default to False
* typo
2025-07-15 15:00:48 -04:00
Wing Lian
a85efffbef
bump transformers==4.52.4 (#2800) [skip ci]
* bump transformers==4.52.4
* don't use hf offline for qwen tokenizer
* increase timeout
* don't use methodtype
* increase timeout
* better assertion logging
* upgrade deepspeed version too
2025-06-18 15:46:14 -04:00
Wing Lian
b2274d430b
support for QAT w RL (DPO) (#2776)
2025-06-13 10:00:35 -04:00
Dan Saunders
00cda8cc70
Data loader refactor (#2707)
* data loading refactor (wip)
* updates
* progress
* pytest
* pytest fix
* lint
* zero_first -> filelock, more simplifications
* small simplification
* import change
* nit
* lint
* simplify dedup
* couldn't resist
* review comments WIP
* continued wip
* minor changes
* fix; remove contrived test
* further refactor
* set default seed in pydantic config
* lint
* continued simplification
* lint
* renaming and nits
* filelock tests
* fix
* fix
* lint
* remove nullable arg
* remove unnecessary code
* moving dataset save fn to shared module
* remove debug print
* matching var naming
* fn name change
* coderabbit comments
* naming nit
* fix test
2025-06-10 19:53:07 -04:00
salman
5fca214108
QAT (#2590)
QAT and quantization w/torchao
2025-05-28 12:35:47 +01:00