* bump hf deps
* upgrade liger-kernel too
* install cce from fork for transformers fix
* fix reference to vocab size in gemma3 patch
* use padding_idx instead of pad_token_id
* remove fixed gemma3 patch
* use updated cce fork
* fix local mllama cce patches with docstring
* add test for multipack with trainer setup and fix trainer for upstream trainer refactor
* bump modal version
* guard for iterable datasets
* mllama model arch layout changed in latest transformers
* fix batch sampler with drop_last
* fix: address upstream vlm changes for lora
* fix: update references to old lora target path
* fix: remove mllama fa2 patch due to upstream fix
* fix: lora kernel patch path for multimodal models
* fix: removed mllama from quarto
* run test for came optim on 2.6.0+
* fix fsdp2 patch and remove deprecated patch
* make sure to set sequence_parallel_degree for grpo
* Add SP test for GRPO
* add sp to grpo config for trainer
* use reward_funcs as kwarg to grpo trainer
* fix the comprehension for reward funcs
* reward funcs already passed in as args
* init sp_group right before training
* fix check for adding models to SP context
* make sure to pass args to super
* upgrade deepspeed
* use updated trl and add reasoning flags for vllm
* patch the worker
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* chore: update pre-commit hooks
* trigger linter when pre commit hooks are updated
* fix type checks from upgraded pre-commit
---------
Co-authored-by: djsaunde <1245942+djsaunde@users.noreply.github.com>
Co-authored-by: Wing Lian <wing@axolotl.ai>
* fix: increase log level for root and axolotl loggers
* fix: BasePlugin using wrong logger
* fix: update logger to take name from module
* feat: change logger class to AxolotlLogger to filter out non-axolotl logs at INFO level or below
* fix: change behavior to not disable existing loggers
* fix: update logging to respect correct env
* chore: fix comment
* fix: suppress accelerate log to LOG_LEVEL if not set
---------
Co-authored-by: salman <salman.mohammadi@outlook.com>
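
Note on the logging change above: a minimal sketch of how a custom logger class could drop INFO-and-below messages from loggers outside the `axolotl` namespace. Only the class name `AxolotlLogger` comes from the log; the `isEnabledFor` override, the name-prefix check, and the `ListHandler` helper are assumptions for illustration, not the project's actual implementation.

```python
import logging


class AxolotlLogger(logging.Logger):
    """Hypothetical sketch: silence INFO and below for non-axolotl loggers."""

    def isEnabledFor(self, level: int) -> bool:
        # Assumption: axolotl loggers are identified by their dotted name.
        is_axolotl = self.name == "axolotl" or self.name.startswith("axolotl.")
        if not is_axolotl and level <= logging.INFO:
            return False
        return super().isEnabledFor(level)


class ListHandler(logging.Handler):
    """Illustrative handler that collects records in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append((record.name, record.levelname, record.getMessage()))


# Loggers created after this call use AxolotlLogger; existing ones are untouched.
logging.setLoggerClass(AxolotlLogger)

handler = ListHandler()
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.DEBUG)

logging.getLogger("axolotl.demo").info("kept")
logging.getLogger("thirdparty.demo").info("dropped")
logging.getLogger("thirdparty.demo").warning("kept too")
```

In this sketch warnings and errors from third-party loggers still pass through, and existing loggers are not disabled, which matches the intent stated in the entries above.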
* feat: add num_proc and load from cache for rl mapping
* fix: refactor sft and rl trainer to set same base args
* feat: add report_to to set run name
* fix: consolidate handling of fp16, bf16, tf32 kwarg
* chore: consolidate eval_strat, loraplus, lr sched, max_length
* fix: deprecate old types
* fix: adding missing Any
* fix: max_steps incorrectly set
* fix: remove unnecessary datacollator kwarg insert and pop
* fix: update default max_steps
* fix: add missing weight_decay handling
* fix: ignore max_length for grpo
* feat: update CI on trainer_builder
* fix: comments
* improve handling of warmup/logging steps
* use transformers default for logging steps, not None
* fix: remove redundant override
* fix: lint
* feat: allow custom optim for rl methods
* fix: duplicate optim setting
* fix(test): set sequence_parallel_degree default in base cfg
* feat: add handling for seed and SP/ring-attn config
* chore: add back return typing from rebase
* fix(test): use RLType directly to skip needing to validate
* feat: split training builder into sub modules
* fix: remove deprecated clause
* chore: add missing config to doc
* fix: update quarto autodoc
* fix: import path for trainer builder and submodules
* fix: remove redundant configs from rebase mistake
* chore: simplify dynamo check
* fix: optimizer_cls_and_kwargs to be passed into trainer_kwargs
* fix: add missing rex from rebase
* fix: move pop optimizer_cls_and_kwargs
* fix: pop optimizer cls in rl too
* fix: leftover bug from rebase
* fix: update handling of trainer_cls in RL
* fix: address pr feedback
* feat: call hook_pre_create_trainer for rl
* chore: lint
* fix: return notimplemented for ppo
* feat: moved torch compile to base and refactor collator setting
* chore: remove unused importlib.util import
* fix: optimizer cls not being popped
* feat: move epoch setting to base
* fix: catch unhandled custom optimizer
* fix: remove duplicate lora plus setting
* chore: refactor if condition
* chore: refactor set_base_training_args into smaller modules
* fix: change TrainerBuilderBase class variables to instance variables
* fix: add handling for beta3 and epsilon2
* fix: change to pass dict via arg instead of updating dict
* chore: simplify if condition
* fix: force access to lr and weight decay so missing values raise an error early
* fix: remove log sweep
* chore: refactor if condition
* fix: address renamed cfg
* fix: improve handling of cosine hyp
* fix: remove unused params
* chore: refactor
* chore: clarify doc safetensors
* fix: update import path to be unified following comments
* fix: duplicate kwargs passed
* feat: return separate trainer_kwargs
* chore: refactor
* chore: refactor based on comments
* chore: refactor based on comments
* fix: move gpustats callback to base
* chore: create trainer_cls_args first based on comments
* fix: ipo label smoothing passed incorrectly
* feat: add optimizer parity for RL methods with test
* feat: add parity for optimizer in RM/PRM and add test
* fix: remove redundant function override for orpo/cpo batch metrics
* fix: improve handling of dpo_label_smoothing and merge issue
* fix: test fixture returning wrong field
* fix: avoid directly modifying fixture
* chore: minor refactor
* Revert "chore: refactor"
This reverts commit 99c8859eb0.
* feat: rename trainer_builder to builders
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
* feat(doc): add info on how to use dapo / dr grpo
* chore: add missing config to docs
* fix: missing comment
* fix: add missing scheduler from schema
* chore: refactor lr scheduler docs
* fix: remove log_sweep
* add two checks to handle legacy format interleaved ds
* fix: add warning about multiple images using legacy format
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* don't set peft_config on grpo to prevent double peft wrap
* remove overrides needed to support bug
* fix grpo tests
* require more CPU for multigpu to help with torch compile for vllm
* make setting `adam_beta3` and `adam_epsilon2` work correctly
* update config docs so users know args are specific to CAME optim
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
* offload activations to disk instead of CPU RAM
* add prefetch
* Disco :dance:
* include offload_disk in e2e test for AC
* document and make sure to cleanup
* fix annotation to match docs
* fix docs build
* address PR feedback