NanoCode012
555190868a
fix: import path for trainer builder and submodules
2025-05-15 15:49:37 +07:00
NanoCode012
a1832953c4
fix: update quarto autodoc
2025-05-15 15:39:47 +07:00
NanoCode012
930472b7c7
chore: add missing config to doc
2025-05-14 16:56:34 +07:00
NanoCode012
8a336a2c33
fix: remove deprecated clause
2025-05-14 16:55:34 +07:00
NanoCode012
316b450a87
feat: split training builder into sub modules
2025-05-14 16:53:50 +07:00
NanoCode012
c281c6e519
fix(test): use RLType directly to skip needing to validate
2025-05-14 16:17:34 +07:00
NanoCode012
06fae0d34e
chore: add back return typing from rebase
2025-05-14 10:39:46 +07:00
NanoCode012
67b1df21aa
feat: add handling for seed and SP/ring-attn config
2025-05-14 09:49:46 +07:00
NanoCode012
9af4bffd5d
fix(test): set sequence_parallel_degree default in base cfg
2025-05-14 09:36:20 +07:00
NanoCode012
7c91cbddd3
fix: duplicate optim setting
2025-05-14 09:36:20 +07:00
NanoCode012
427e612d5a
feat: allow custom optim for rl methods
2025-05-14 09:36:20 +07:00
NanoCode012
b8025b34b9
fix: lint
2025-05-14 09:33:49 +07:00
NanoCode012
51c2adf3b1
fix: remove redundant override
2025-05-14 09:33:49 +07:00
Wing Lian
cbcb7b081b
use transformers default for logging steps, not None
2025-05-14 09:33:49 +07:00
Wing Lian
675561e745
improve handling of warmup/logging steps
2025-05-14 09:33:49 +07:00
NanoCode012
a6ce7d7522
fix: comments
2025-05-14 09:33:49 +07:00
NanoCode012
1ea6ce73ed
feat: update CI on trainer_builder
2025-05-14 09:33:49 +07:00
NanoCode012
8aa722a140
fix: ignore max_length for grpo
2025-05-14 09:33:49 +07:00
NanoCode012
edaec9fe98
fix: add missing weight_decay handling
2025-05-14 09:33:28 +07:00
NanoCode012
8b6db0c72d
fix: update default max_steps
2025-05-14 09:33:28 +07:00
NanoCode012
43f5373c79
fix: remove unnecessary datacollator kwarg insert and pop
2025-05-14 09:33:28 +07:00
NanoCode012
698268bc63
fix: max_steps incorrectly set
2025-05-14 09:33:28 +07:00
NanoCode012
9028eb2758
fix: add missing Any
2025-05-14 09:33:28 +07:00
NanoCode012
077a54d2b1
fix: deprecate old types
2025-05-14 09:33:28 +07:00
NanoCode012
053e5fd7d1
chore: consolidate eval_strat, loraplus, lr sched, max_length
2025-05-14 09:33:28 +07:00
NanoCode012
fd271b2547
fix: consolidate handling of fp16, bf16, tf32 kwarg
2025-05-14 09:33:28 +07:00
NanoCode012
c268a0157a
feat: add report_to to set run name
2025-05-14 09:33:28 +07:00
NanoCode012
6317945b67
fix: refactor sft and rl trainer to set same base args
2025-05-14 09:32:46 +07:00
NanoCode012
86ba574698
feat: add num_proc and load from cache for rl mapping
2025-05-14 09:32:04 +07:00
Wing Lian
7fa1089cea
Atropos support ( #2666 ) [skip ci]
...
* allow peft+liger+grpo and custom vllm serve for atropos support
* set trainer class for RL
2025-05-13 08:30:58 -04:00
Dan Saunders
80304c26a7
SP GRPO support + batch SP fixes ( #2643 )
...
* ctx manager for SP
* updates
* update
* further simplifying
* simplifying
* simplifying
* reorg
* batch api HF adapter for ring-flash-attn; cleanup and improvements
* update
* adding all batch ring-flash-attn methods via single adapter
* fix
* fixes for batch API funcs, simplify
* fix
* grpo sp support
* progress
* stronger subclassing of TRL GRPO trainer; custom distributed sampler
* subclassing constructor
* progress
* finalizing SP + GRPO trainer
* minimize diffs to GRPO trainer
* remove (most of) the custom GRPO trainer logic
* debug
* debug
* update
* update
* update
* progress
* cleanup
* cleanup
* minor changes
* update
* update
* update
* small changes
* updates
* cleanup; torch.compile ring_flash_attn functions to prevent numerical instability; lint
* spacing
* cleanup; log in pydantic model config only on main process
* remove comment
* fix sp sampler, update to latest upstream code, doc
* add docs
* update quartodoc autodoc contents
* fix, simplifications
* fixes + simplifications
* review comments
* lint
* removing main process only logs in favor of #2608
* fixes, additional smoke test
* updates
* more tests
* update
* fix grad accum bug (sort of)
* lint, tests
* todo
2025-05-12 17:52:40 -04:00
NanoCode012
67c4ea9c7c
fix: disable auto lora kernel if dropout nonzero ( #2655 ) [skip ci]
...
* fix: disable auto lora kernel if dropout nonzero
* Add comment from PR feedback
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-12 16:23:53 -04:00
Wing Lian
526ddb886d
guard on deleting secrets from env ( #2653 ) [skip ci]
2025-05-12 14:18:42 -04:00
Wing Lian
f34eef546a
update doc and use P2P=LOC for brittle grpo test ( #2649 )
...
* update doc and skip brittle grpo test
* fix the path to run the multigpu tests
* increase timeout, use LOC instead of NVL
* typo
* use hf cache from s3 backed cloudfront
* mark grpo as flaky test due to vllm start
2025-05-12 14:17:25 -04:00
Wing Lian
c7b6790614
Various fixes for CI, save_only_model for RL, prevent packing multiprocessing deadlocks ( #2661 )
...
* lean mistral ft tests, remove e2e torch 2.4.1 test
* make sure to pass save_only_model for RL
* more tests to make ci leaner, add cleanup to modal ci
* fix module for import in e2e tests
* use mp spawn to prevent deadlocks with packing
* make sure cleanup shell script is executable when cloned out
2025-05-12 10:51:18 -04:00
Dan Saunders
47e0e71bc8
don't sort multipack sampler ( #2657 )
...
* don't sort multipack sampler
* increased packing efficiency increases loss
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-09 20:28:58 -04:00
Wing Lian
0f3587174d
swap tinymodels that have safetensors for some ci tests ( #2641 )
2025-05-07 15:06:07 -04:00
xzuyn
25e6c5f9bd
Add CAME Optimizer ( #2385 )
2025-05-07 10:31:46 -04:00
NanoCode012
32f51bca35
fix(doc): clarify instruction to delinearize llama4 similar to cli doc ( #2644 ) [skip ci]
2025-05-07 10:29:47 -04:00
NanoCode012
9daa04da90
Fix: improve error message on failed dataset load ( #2637 ) [skip ci]
...
* fix(log): clarify error on dataset loading failed
* fix: add path for easy tracking of broken config
* fix: improve error message based on pr feedback
2025-05-07 10:29:05 -04:00
Wing Lian
0d71b0aa5f
Configurable embeddings upcast ( #2621 )
...
* fsdp embeddings should be float32 per comment
* patch peft to not upcast everything
* add tabs back to code check
* fix import
* add configurable option and fix check
* add check for dtypes
* move embeddings test to patch dir
* fix test
* fix comment and logic
2025-05-06 23:40:44 -04:00
Eric Meier
63aaccf85b
Fix cut_cross_entropy plugin install ( #2642 ) [skip ci]
2025-05-06 22:56:00 -04:00
Wing Lian
ff0fe767c8
xformers attention with packing ( #2619 )
...
* xformers attention with packing
* wire up the patch
* fix xformers + packing validation
* fix warning
* reorder the packing check
* fix fp16 / bf16 reset when using fp16 with bf16 auto
* fix seq lens calc to drop hanging sequences
* handle xformers patch for inference too
* fix batch size setter
* fix xformers inference
* add colab callback to fix inference post train
* PR feedback
2025-05-06 22:49:22 -04:00
Wing Lian
8e4158cc0b
Multipack parallel bin packing ( #2631 )
...
* improve readability of multipack sampler
* parallel bin packing
fix error with lambda and pickling
make sure things are in float instead of np.float
* annotations and comments update
* support for configurable group and bin size for sample packing
* fix missing map back to original indices
2025-05-06 20:08:08 -04:00
Wing Lian
cd84325253
allow plugins to return their own dataset ( #2617 ) [skip ci]
...
* allow plugins to return their own dataset
* add post_trainer_create and wire up
* add hook check
* address PR feedback:
* remove annotation causing circular import
2025-05-06 20:05:51 -04:00
NanoCode012
0b140fef83
feat(doc): add split_thinking docs ( #2613 ) [skip ci]
...
* feat(doc): add split_thinking docs
* fix: link config.qmd to conversation.qmd for split_thinking example
* update thinking => reasoning_content in messages format
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-06 20:05:32 -04:00
Wing Lian
e4cfebe995
bump liger dep to 0.5.9 ( #2640 ) [skip ci]
...
* bump liger dep to 0.5.9
* also upgrade vllm to post1, and datasets to 3.5.1
2025-05-06 20:05:19 -04:00
mhenrichsen
a6cac5dd32
Update lr_scheduler options in config.qmd to include additional scheduling strategies for improved training flexibility. ( #2636 ) [skip ci]
2025-05-06 11:24:07 -04:00
Wing Lian
b71c0e3447
Print axolotl art if train is called outside of cli ( #2627 ) [skip ci]
2025-05-06 11:18:45 -04:00
Wing Lian
ddaebf8309
fix dpo eval override to call grandparent instead of the broken super ( #2628 ) [skip ci]
2025-05-06 11:18:25 -04:00