* fix: force train split for json,csv,txt for test_datasets
* feat(doc): add info on mixing datasets for VLM
* feat(doc): max memory
* fix(doc): clarify lr groups
* fix: add info on vision not being dropped
* feat: add qwen3-vl to multimodal docs
* fix: add moe blocks to arch list
* feat(doc): improve mistral docs
* chore: add helpful link [skip-e2e]
* fix: add vram usage for mistral small
* Update link in docs/faq.qmd
Co-authored-by: salman <salman.mohammadi@outlook.com>
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
Co-authored-by: salman <salman.mohammadi@outlook.com>
* feat: add hunyuan cce support
* feat: update cce docs
* feat: add multipack support for granite and hunyuan
* feat: add hunyuan docs and example config
* feat: update readme instructions to include CCE installation
* fix: chat template log appearing despite tokenizer already having a template
* feat: add vram usage
* fix: remove duplicate cce install
* fix: use latest commit of PR in case rebased/pushed
* Revert "fix: use latest commit of PR in case rebased/pushed"
This reverts commit 8b60aa00de.
* feat: update doc as upstream merged
* feat: add center_rewards_coefficient for reward modeling
- Add center_rewards_coefficient parameter to Pydantic schema with paper reference
- Pass parameter through base builder and causal builder to training args
- Add documentation section with usage examples and theoretical background
- Enable parameter in reward modeling example configs with recommended value
- Enables reward centering for improved training stability in RLHF workflows
Implements auxiliary loss from Eisenstein et al. 2023 (https://huggingface.co/papers/2312.09244)
to incentivize mean-zero reward outputs without post-training normalization.
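The centering term described above can be sketched as a plain-Python pairwise reward loss. This is a minimal illustration of the idea, not Axolotl's or TRL's actual implementation; the exact penalty form and the function name are assumptions:

```python
import math

def pairwise_reward_loss(r_chosen, r_rejected, center_coef=0.0):
    """Bradley-Terry pairwise loss with an optional reward-centering penalty.

    The auxiliary term (after Eisenstein et al. 2023) pushes rewards toward
    a mean of zero, so no post-training normalization is needed.
    """
    # Standard preference loss: -log sigmoid(r_chosen - r_rejected)
    base = -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))
    # Centering penalty: grows when the pair's rewards drift away from zero mean.
    center = center_coef * (r_chosen + r_rejected) ** 2
    return base + center
```

With `center_coef=0.0` this reduces to the usual reward-model loss; a small positive coefficient discourages the overall reward scale from drifting without changing the chosen/rejected ordering.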
* Update description
* test: add unit tests for center_rewards_coefficient integration
* Update src/axolotl/core/builders/base.py
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* Update docs/reward_modelling.qmd
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* Update docs/reward_modelling.qmd
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* add reference to TRL documentation
* add new reward model configuration for qwen3 with comprehensive parameters
* Verified center_rewards_coefficient is correctly passed through the trainer builder to training arguments.
* Refactor reward modeling documentation to consolidate information on center_rewards_coefficient
* Remove unit tests for center_rewards_coefficient integration as part of codebase cleanup.
* linting
* nit
* Apply suggestions from code review
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* lint
---------
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
Co-authored-by: Salman Mohammadi <salman.mohammadi@outlook.com>
* improve fsdp shard merging
* improve logging
* update information on merging and running inference with GPT-OSS
* cleanup readme
* automate cleanup of FSDP prefix
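The FSDP prefix cleanup above can be illustrated with a small helper that strips the wrapper prefix from checkpoint keys. This is a sketch only: `_fsdp_wrapped_module.` is the prefix FSDP's wrapper historically inserts into state-dict keys, and the helper name is hypothetical:

```python
FSDP_PREFIX = "_fsdp_wrapped_module."

def strip_fsdp_prefix(state_dict):
    # Remove the FSDP wrapper prefix wherever it appears in a key, so the
    # merged checkpoint's keys match the unwrapped model's parameter names.
    return {key.replace(FSDP_PREFIX, ""): value for key, value in state_dict.items()}
```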
* import GRPO only if necessary
* only modify config.json on rank0
* merge final checkpoint at end of training
* prevent circular import
* Fix saving for sharded state dict
* devx: move merged checkpoint to output dir
* move import back to top
* Fix stuck merge
* fix conditionals from pr feedback and add test
* fix to not use batch feature indexing
* more vlm fixes
* use AutoModelForImageTextToText
* add example yaml; num2words is needed for the chat template
* improve handling of adding image tokens to conversation
* add lfm2-vl support
* update the lfm readme
* fix markdown and add rtol for loss checks
* feat: add smolvlm2 processing strat
* fix: check for causal-conv1d in lfm models
* feat: add docs for lfm2
* feat: add new models and tips to docs
* feat: add smolvlm2 docs and remove extra dep
* chore: update docs
* feat: add video instructions
* chore: cleanup
* chore: comments
* fix: typo
* feat: add usage stats
* chore: refactor
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* feat(doc): add links to new features on README
* fix merge error
* remove blurb about older FSDP2 integration
* update blog link
* chore: update cce commit
* feat: update model support section in readme
* Update README.md
Co-authored-by: salman <salman.mohammadi@outlook.com>
* chore: lint num spaces
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
Co-authored-by: salman <salman.mohammadi@outlook.com>
* add slurm example and make preprocess play nicely
* start slurm if its init file exists
* remove incorrect comment
* feat: add slurm docs
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
* fix for parallelism config from trainer
* fix handling of parallelism_config w accelerate
* add todo for removal
* update to latest axolotl-contribs-mit for optimizer fix too
* synchronize training after checkpoint save
* dir spelling
* use latest accelerate main
* fix to not use partial state parallelism_config
* more fixes
* use most recent accelerate fix
* fix cpu_ram_efficient_loading to use meta devices with weights loaded from rank 0, preventing CPU RAM OOM
* improve handling of broadcasting fsdp2 state dict
* support for openai chat template with thinking key as the reasoning trace
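The OpenAI-style chat format referenced above can carry the reasoning trace in a `thinking` key alongside the assistant's `content`. A minimal illustration; the exact keys Axolotl consumes and the helper below are assumptions for demonstration:

```python
# A hypothetical OpenAI-style conversation where the assistant turn carries
# its reasoning trace in a "thinking" key next to the final "content".
conversation = {
    "messages": [
        {"role": "user", "content": "What is 17 * 3?"},
        {
            "role": "assistant",
            "thinking": "17 * 3 = 17 + 17 + 17 = 51",  # reasoning trace
            "content": "51",
        },
    ]
}

def extract_reasoning(convo):
    # Collect reasoning traces from assistant turns, when present.
    return [
        m["thinking"]
        for m in convo["messages"]
        if m["role"] == "assistant" and "thinking" in m
    ]
```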
* address PR feedback
* refactor to remove dependency on PartialState for parallelism config
* bump accelerate, gptoss fixes
* limit meta fixes to fsdp2 for now
* fixes for gpt oss
* fixup examples, don't use cpu-ram-efficient-loading for now
* remove problematic barrier
* patch parallelism config
* reorder comparison
* device mesh fixes
* make pure CP work
* lint