axolotl

Files

Wing Lian 80a97f192b validate batch shape against num_generations at config time

Surfaces a class of GRPO config errors at axolotl-train startup instead
of letting them bubble out of GRPOTrainer.__init__ after the model loads.
Three checks under RLValidationMixin.check_grpo_batch_size_divisibility:

  - effective generation_batch_size (or the micro_batch_size *
    gradient_accumulation_steps fallback) must be divisible by
    trl.num_generations, with a hint pointing at the smallest
    gradient_accumulation_steps bump that fixes the violation
  - num_generations >= 2 (group-relative advantage needs variance; with
    num_generations=1 every advantage is zero and the policy never
    updates)
  - when world_size > 1, effective generation_batch_size must be
    >= num_generations * world_size

11 unit tests cover the table: divisible/non-divisible, explicit and
implicit gbs, multi-rank constraint, GRPO-disabled passthrough, and
unset num_generations.
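
A minimal sketch of the three checks listed above, written as a standalone function for illustration. It is not the code from RLValidationMixin: the name check_grpo_batch_shape, its signature, and the error wording are hypothetical, and the fallback is assumed to be micro_batch_size * gradient_accumulation_steps as the message describes.

```python
from math import gcd
from typing import Optional


def check_grpo_batch_shape(
    micro_batch_size: int,
    gradient_accumulation_steps: int,
    num_generations: Optional[int],
    generation_batch_size: Optional[int] = None,
    world_size: int = 1,
) -> None:
    """Raise ValueError if the batch cannot hold whole generation groups.

    Hypothetical helper: parameters mirror the config keys named in the
    commit message, but this is not the library's actual validator.
    """
    if num_generations is None:
        return  # unset num_generations: nothing to validate

    # A group of one completion has zero advantage, so nothing is learned.
    if num_generations < 2:
        raise ValueError("num_generations must be >= 2 for GRPO")

    implicit = generation_batch_size is None
    effective = (
        micro_batch_size * gradient_accumulation_steps
        if implicit
        else generation_batch_size
    )

    if effective % num_generations != 0:
        msg = (
            f"effective generation batch size {effective} is not divisible "
            f"by num_generations={num_generations}"
        )
        if implicit:
            # Smallest GA >= current GA with micro_batch_size * GA divisible
            # by num_generations: GA must be a multiple of `step`.
            step = num_generations // gcd(micro_batch_size, num_generations)
            suggested = -(-gradient_accumulation_steps // step) * step
            msg += f"; try gradient_accumulation_steps={suggested}"
        raise ValueError(msg)

    # Assumed reading of the multi-rank rule: the batch must be able to
    # give every rank at least one full group.
    if world_size > 1 and effective < num_generations * world_size:
        raise ValueError(
            f"effective generation batch size {effective} is smaller than "
            f"num_generations * world_size = {num_generations * world_size}"
        )
```

For example, micro_batch_size=2 with gradient_accumulation_steps=3 and num_generations=4 gives an effective size of 6, which fails the divisibility check; the hint suggests gradient_accumulation_steps=4, which yields 8.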

2026-04-15 13:27:30 +00:00

callbacks

Skip redundant evaluation when resuming from checkpoint (#3575) [skip ci]

2026-04-12 20:50:15 -04:00

data

feat: support excess_length_strategy for RL trainers (#3578) [skip ci]

2026-04-12 20:51:10 -04:00

lora

better handling of DoRA merge on Conv layers in Qwen 3.5 (#3599)

2026-04-12 10:57:45 -04:00

schemas/validation

validate batch shape against num_generations at config time