- Fix _loss_function attribute not found on base model with PEFT
- Fix mismatched attribute name (loss_function vs _loss_function)
- Set _loss_function on unwrapped base model for PEFT
- Enable previously skipped test_llama_lora_kd test
- Add test config fixes for LoRA kernel compatibility
Fixes https://github.com/axolotl-ai-cloud/axolotl/issues/3206
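
The fix described above can be sketched as follows. This is a minimal illustration, not the actual patch: `BaseModel`, `PeftLikeWrapper`, `set_loss_function` and `dummy_loss` are hypothetical stand-ins, and the only names taken from the commit are `_loss_function` and the idea of setting it on the unwrapped base model. The sketch assumes the wrapper exposes a `get_base_model()` method, as PEFT's `PeftModel` does.

```python
class BaseModel:
    """Hypothetical stand-in for the underlying (unwrapped) model."""


class PeftLikeWrapper:
    """Hypothetical stand-in for a PEFT wrapper around a base model."""

    def __init__(self, base):
        self._base = base

    def get_base_model(self):
        # Assumption: mirrors the accessor that PEFT's PeftModel provides.
        return self._base


def set_loss_function(model, loss_fn):
    """Attach loss_fn as `_loss_function` on the unwrapped base model.

    If the model has no `get_base_model()` it is treated as already unwrapped.
    """
    get_base = getattr(model, "get_base_model", None)
    target = get_base() if callable(get_base) else model
    # Use the underscore-prefixed name consistently; the commit notes that
    # mixing `loss_function` and `_loss_function` was part of the bug.
    target._loss_function = loss_fn
    return target


def dummy_loss(logits, labels):
    """Placeholder loss used only for this illustration."""
    return 0.0


base = BaseModel()
wrapped = PeftLikeWrapper(base)
target = set_loss_function(wrapped, dummy_loss)
```

Setting the attribute on the base model rather than on the wrapper means code that reads `_loss_function` from the inner model finds it whether or not an adapter is attached.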
* make sure to use ray prepare for dataloader fixes
* ray tests use 2.7.0+
* don't call init_distributed with ray and deepspeed
* handle dict deepspeed config
* better handling of dict deepspeed config
* use json.dumps
* guard to_dict
* wrap import for optional ray
* upgrade transformers to 4.57.0
* remove deprecated autoawq and use latest peft
* remove autoawq from setuptools script
* fix imports
* make sure torchvision is installed
* remove support for BetterTransformer
* skip fsdp_qlora_prequant test
* more robust error reporting
* pass max_prompt_len to training args as max_prompt_length
* Update rl.py
* refactor
* format
* fix: default for max_prompt_length
* fix: defaults for trainer
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
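
The dict DeepSpeed config items above ("handle dict deepspeed config", "use json.dumps", "guard to_dict") could look roughly like the sketch below. The actual change is not shown in this log, so everything here is an assumption made for illustration: the helper name `normalize_deepspeed_config`, the decision to pass path strings through unchanged, and the choice to raise `TypeError` on unsupported input are all hypothetical.

```python
import json
from typing import Any


def normalize_deepspeed_config(config: Any) -> str:
    """Return a string form of a DeepSpeed config (hypothetical helper).

    Accepts a path string, a plain dict, or an object exposing `to_dict()`.
    """
    if isinstance(config, str):
        # Assumption: a string is a path to a JSON file and is left untouched.
        return config

    # Guard the to_dict call: only use it when it exists and is callable.
    to_dict = getattr(config, "to_dict", None)
    if callable(to_dict):
        config = to_dict()

    if isinstance(config, dict):
        # Serialize with json.dumps so the result is valid JSON, unlike
        # str(dict), which emits single quotes and Python literals.
        return json.dumps(config)

    raise TypeError(f"Unsupported DeepSpeed config type: {type(config).__name__}")


class ConfigObject:
    """Hypothetical config wrapper that exposes to_dict()."""

    def __init__(self, data):
        self._data = data

    def to_dict(self):
        return dict(self._data)


as_json = normalize_deepspeed_config({"zero_optimization": {"stage": 2}})
from_object = normalize_deepspeed_config(ConfigObject({"bf16": {"enabled": True}}))
```

Using `json.dumps` instead of `str()` matters because booleans such as `True` would otherwise be written in Python syntax and fail to parse as JSON.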
* feat: add hunyuan cce support
* feat: update cce docs
* feat: add multipack support for granite and hunyuan
* feat: add hunyuan docs and example config
* feat: update readme instructions to include CCE installation
* fix: chat template log appearing despite tokenizer already having template
* feat: add vram usage
* fix: remove duplicate cce install
* fix: use latest commit of PR in case rebased/pushed
* Revert "fix: use latest commit of PR in case rebased/pushed"
This reverts commit 8b60aa00de.
* feat: update doc as upstream merged
* default true
* force e2e
* causal trainer only
* fix eval logging [skip-ci]
* revert setup.py
* force tests
* guarding
* guarding
* fix test case
* use evaluate [skip-e2e]
* use evaluate [skip-e2e]
* kick off ci
* fixing
* reverting
* feat: upgrade transformers to v4.56
* fix handling of CP/SP now that position_ids are default even for unpacked sequences
* feat: monkeypatch list_repo_templates
* fix: apply patch for tests only
* see if updated main works at least
* fix: update to patch release and remove monkeypatch
* remove fsdp2 eval patch
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
* feat: add center_rewards_coefficient for reward modeling
- Add center_rewards_coefficient parameter to Pydantic schema with paper reference
- Pass parameter through base builder and causal builder to training args
- Add documentation section with usage examples and theoretical background
- Enable parameter in reward modeling example configs with recommended value
- Enables reward centering for improved training stability in RLHF workflows
Implements auxiliary loss from Eisenstein et al. 2023 (https://huggingface.co/papers/2312.09244)
to incentivize mean-zero reward outputs without post-training normalization.
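
A minimal sketch of the auxiliary loss described above, written in plain Python rather than as the trainer's actual code. It assumes the usual pairwise reward-model loss, `-log(sigmoid(r_chosen - r_rejected))`, plus a centering penalty on the squared sum of the two rewards scaled by `center_rewards_coefficient`. The function name `reward_loss` and the sample values are hypothetical.

```python
import math


def reward_loss(chosen, rejected, center_rewards_coefficient=None):
    """Mean pairwise reward loss with an optional centering penalty.

    chosen / rejected: equal-length lists of scalar rewards per pair.
    """
    n = len(chosen)
    # -log(sigmoid(x)) == log(1 + exp(-x)), with x = r_chosen - r_rejected.
    pairwise = sum(
        math.log1p(math.exp(-(c - r))) for c, r in zip(chosen, rejected)
    ) / n
    if center_rewards_coefficient is None:
        return pairwise
    # Penalize (r_chosen + r_rejected)^2 so rewards are pushed toward mean zero.
    centering = sum((c + r) ** 2 for c, r in zip(chosen, rejected)) / n
    return pairwise + center_rewards_coefficient * centering


# Rewards that already sum to zero incur no penalty.
centered = reward_loss([1.0], [-1.0], center_rewards_coefficient=0.01)
uncentered_base = reward_loss([2.0], [1.0])
uncentered_penalized = reward_loss([2.0], [1.0], center_rewards_coefficient=0.01)
```

The pairwise term depends only on the difference between the two rewards, so without the penalty a constant offset on every reward goes unpenalized. The centering term is what removes that freedom.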
* Update description
* test: add unit tests for center_rewards_coefficient integration
* Update src/axolotl/core/builders/base.py
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* Update docs/reward_modelling.qmd
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* Update docs/reward_modelling.qmd
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* add reference to TRL documentation
* add new reward model configuration for qwen3 with comprehensive parameters
* Verified center_rewards_coefficient is correctly passed through the trainer builder to training arguments.
* Refactor reward modeling documentation to consolidate information on center_rewards_coefficient
* Remove unit tests for center_rewards_coefficient integration as part of codebase cleanup.
* linting
* nit
* Apply suggestions from code review
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* lint
---------
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
Co-authored-by: Salman Mohammadi <salman.mohammadi@outlook.com>