NanoCode012
427e612d5a
feat: allow custom optim for rl methods
2025-05-14 09:36:20 +07:00
NanoCode012
b8025b34b9
fix: lint
2025-05-14 09:33:49 +07:00
NanoCode012
51c2adf3b1
fix: remove redundant override
2025-05-14 09:33:49 +07:00
Wing Lian
cbcb7b081b
use transformers default for logging steps, not None
2025-05-14 09:33:49 +07:00
Wing Lian
675561e745
improve handling of warmup/logging steps
2025-05-14 09:33:49 +07:00
NanoCode012
a6ce7d7522
fix: comments
2025-05-14 09:33:49 +07:00
NanoCode012
1ea6ce73ed
feat: update CI on trainer_builder
2025-05-14 09:33:49 +07:00
NanoCode012
8aa722a140
fix: ignore max_length for grpo
2025-05-14 09:33:49 +07:00
NanoCode012
edaec9fe98
fix: add missing weight_decay handling
2025-05-14 09:33:28 +07:00
NanoCode012
8b6db0c72d
fix: update default max_steps
2025-05-14 09:33:28 +07:00
NanoCode012
43f5373c79
fix: remove unnecessary datacollator kwarg insert and pop
2025-05-14 09:33:28 +07:00
NanoCode012
698268bc63
fix: max_steps incorrectly set
2025-05-14 09:33:28 +07:00
NanoCode012
9028eb2758
fix: add missing Any
2025-05-14 09:33:28 +07:00
NanoCode012
077a54d2b1
fix: deprecate old types
2025-05-14 09:33:28 +07:00
NanoCode012
053e5fd7d1
chore: consolidate eval_strat, loraplus, lr sched, max_length
2025-05-14 09:33:28 +07:00
NanoCode012
fd271b2547
fix: consolidate handling of fp16, bf16, tf32 kwarg
2025-05-14 09:33:28 +07:00
NanoCode012
c268a0157a
feat: add report_to to set run name
2025-05-14 09:33:28 +07:00
NanoCode012
6317945b67
fix: refactor sft and rl trainer to set same base args
2025-05-14 09:32:46 +07:00
NanoCode012
86ba574698
feat: add num_proc and load from cache for rl mapping
2025-05-14 09:32:04 +07:00
Wing Lian
7fa1089cea
Atropos support (#2666) [skip ci]
* allow peft+liger+grpo and custom vllm serve for atropos support
* set trainer class for RL
2025-05-13 08:30:58 -04:00
Dan Saunders
80304c26a7
SP GRPO support + batch SP fixes (#2643)
* ctx manager for SP
* updates
* update
* further simplifying
* simplifying
* simplifying
* reorg
* batch api HF adapter for ring-flash-attn; cleanup and improvements
* update
* adding all batch ring-flash-attn methods via single adapter
* fix
* fixes for batch API funcs, simplify
* fix
* grpo sp support
* progress
* stronger subclassing of TRL GRPO trainer; custom distributed sampler
* subclassing constructor
* progress
* finalizing SP + GRPO trainer
* minimize diffs to GRPO trainer
* remove (most of) the custom GRPO trainer logic
* debug
* debug
* update
* update
* update
* progress
* cleanup
* cleanup
* minor changes
* update
* update
* update
* small changes
* updates
* cleanup; torch.compile ring_flash_attn functions to prevent numerical instability; lint
* spacing
* cleanup; log in pydantic model config only on main process
* remove comment
* fix sp sampler, update to latest upstream code, doc
* add docs
* update quartodoc autodoc contents
* fix, simplifications
* fixes + simplifications
* review comments
* lint
* removing main process only logs in favor of #2608
* fixes, additional smoke test
* updates
* more tests
* update
* fix grad accum bug (sort of)
* lint, tests
* todo
2025-05-12 17:52:40 -04:00
NanoCode012
67c4ea9c7c
fix: disable auto lora kernel if dropout nonzero (#2655) [skip ci]
* fix: disable auto lora kernel if dropout nonzero
* Add comment from PR feedback
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-12 16:23:53 -04:00
Wing Lian
526ddb886d
guard on deleting secrets from env (#2653) [skip ci]
2025-05-12 14:18:42 -04:00
Wing Lian
f34eef546a
update doc and use P2P=LOC for brittle grpo test (#2649)
* update doc and skip brittle grpo test
* fix the path to run the multigpu tests
* increase timeout, use LOC instead of NVL
* typo
* use hf cache from s3 backed cloudfront
* mark grpo as flaky test due to vllm start
2025-05-12 14:17:25 -04:00
Wing Lian
c7b6790614
Various fixes for CI, save_only_model for RL, prevent packing multiprocessing deadlocks (#2661)
* lean mistral ft tests, remove e2e torch 2.4.1 test
* make sure to pass save_only_model for RL
* more tests to make ci leaner, add cleanup to modal ci
* fix module for import in e2e tests
* use mp spawn to prevent deadlocks with packing
* make sure cleanup shell script is executable when cloned out
2025-05-12 10:51:18 -04:00
Dan Saunders
47e0e71bc8
don't sort multipack sampler (#2657)
* don't sort multipack sampler
* increased packing efficiency increases loss
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-09 20:28:58 -04:00
Wing Lian
0f3587174d
swap tinymodels that have safetensors for some ci tests (#2641)
2025-05-07 15:06:07 -04:00
xzuyn
25e6c5f9bd
Add CAME Optimizer (#2385)
2025-05-07 10:31:46 -04:00
NanoCode012
32f51bca35
fix(doc): clarify instruction to delinearize llama4 similar to cli doc (#2644) [skip ci]
2025-05-07 10:29:47 -04:00
NanoCode012
9daa04da90
Fix: improve error message on failed dataset load (#2637) [skip ci]
* fix(log): clarify error on dataset loading failed
* fix: add path for easy tracking of broken config
* fix: improve error message based on pr feedback
2025-05-07 10:29:05 -04:00
Wing Lian
0d71b0aa5f
Configurable embeddings upcast (#2621)
* fsdp embeddings should be float32 per comment
* patch peft to not upcast everything
* add tabs back to code check
* fix import
* add configurable option and fix check
* add check for dtypes
* move embeddings test to patch dir
* fix test
* fix comment and logic
2025-05-06 23:40:44 -04:00
Eric Meier
63aaccf85b
Fix cut_cross_entropy plugin install (#2642) [skip ci]
2025-05-06 22:56:00 -04:00
Wing Lian
ff0fe767c8
xformers attention with packing (#2619)
* xformers attention with packing
* wire up the patch
* fix xformers + packing validation
* fix warning
* reorder the packing check
* fix fp16 / bf16 reset when using fp16 with bf16 auto
* fix seq lens calc to drop hanging sequences
* handle xformers patch for inference too
* fix batch size setter
* fix xformers inference
* add colab callback to fix inference post train
* PR feedback
2025-05-06 22:49:22 -04:00
Wing Lian
8e4158cc0b
Multipack parallel bin packing (#2631)
* improve readability of multipack sampler
* parallel bin packing
fix error with lambda and pickling
make sure things are in float instead of np.float
* annotations and comments update
* support for configurable group and bin size for sample packing
* fix missing map back to original indices
2025-05-06 20:08:08 -04:00
Wing Lian
cd84325253
allow plugins to return their own dataset (#2617) [skip ci]
* allow plugins to return their own dataset
* add post_trainer_create and wire up
* add hook check
* address PR feedback
* remove annotation causing circular import
2025-05-06 20:05:51 -04:00
NanoCode012
0b140fef83
feat(doc): add split_thinking docs (#2613) [skip ci]
* feat(doc): add split_thinking docs
* fix: link config.qmd to conversation.qmd for split_thinking example
* update thinking => reasoning_content in messages format
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-06 20:05:32 -04:00
Wing Lian
e4cfebe995
bump liger dep to 0.5.9 (#2640) [skip ci]
* bump liger dep to 0.5.9
* also upgrade vllm to post1, and datasets to 3.5.1
2025-05-06 20:05:19 -04:00
mhenrichsen
a6cac5dd32
Update lr_scheduler options in config.qmd to include additional scheduling strategies for improved training flexibility. (#2636) [skip ci]
2025-05-06 11:24:07 -04:00
Wing Lian
b71c0e3447
Print axolotl art if train is called outside of cli (#2627) [skip ci]
2025-05-06 11:18:45 -04:00
Wing Lian
ddaebf8309
fix dpo eval override to call grandparent instead of the broken super (#2628) [skip ci]
2025-05-06 11:18:25 -04:00
Wing Lian
679743087a
make sure gc_steps is used for all trainers (#2638)
2025-05-06 11:18:00 -04:00
Wing Lian
f720b6e72d
repop cache (#2639)
* repop cache
* pre-cache as a step
* fix the name
* add reason for pytest skipif
* restore pytorch matrix
* remove max-parallel now that we've optimized this a bit
2025-05-06 11:09:07 -04:00
mhenrichsen
a980618fd0
Adds example for training a TTS model on top of an LLM. (#2614)
* Adds example for training a TTS model on top of an LLM.
* Update examples/orpheus/finetune.yml
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update examples/orpheus/finetune.yml
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* Update README.md to clarify GPU requirements for finetuning Orpheus TTS model
* Update finetune.yml to use the new base model canopylabs/orpheus-3b-0.1-pretrained
* Update finetune.yml and README.md for consistency and clarity
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
2025-05-06 10:11:06 +02:00
Emmanuel Ferdman
54960d4de0
Fix logging deprecation warnings (#2623)
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-05-04 08:22:45 -04:00
Wing Lian
ed922796b7
include multipack support for qwen3 family (#2622)
2025-05-03 12:02:39 -04:00
Wing Lian
3dd9c3bf3f
setup hf transfer too and fix auto bf16 when fp16 enabled (#2620) [skip ci]
2025-05-03 12:02:26 -04:00
Wing Lian
0ba7d362fa
qwen3 and qwen3_moe support for liger kernels (#2612)
* qwen3 and qwen3_moe support for liger kernels
* fix moe module path
* fix: qwen3 liger input args and mlp
* fix: qwen3 input args and output class
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
2025-05-02 09:29:55 -04:00
aitechguy
e4f73bc98e
remove keys to incorporate changes for the trl update (#2616)
2025-05-02 08:47:42 -04:00
Wing Lian
bcb59c70e2
automatically set pad_to_sequence_len when using packing (#2607)
* automatically set pad_to_sequence_len when using packing
* update tests
2025-05-01 13:24:38 -04:00
NanoCode012
6a3e6f8c53
fix: run preview-docs only when md/qmd changes (#2606)
* fix: run preview-docs only when md/qmd changes
* feat: add quarto yaml based on PR feedback
2025-05-01 13:21:28 -04:00