NanoCode012 | 2e2f42918d | fix: remove duplicate lora plus setting | 2025-05-22 18:24:41 +07:00
NanoCode012 | e8eb3bfdf3 | fix: catch unhandled custom optimizer | 2025-05-22 18:14:44 +07:00
NanoCode012 | cd31394e70 | feat: move epoch setting to base | 2025-05-22 18:11:17 +07:00
NanoCode012 | 66d4319d80 | fix: optimizer cls not being popped | 2025-05-22 18:07:07 +07:00
NanoCode012 | c6e730df64 | chore: remove unused importlib.util import | 2025-05-22 18:00:00 +07:00
NanoCode012 | e55d64f709 | feat: moved torch compile to base and refactor collator setting | 2025-05-22 17:56:58 +07:00
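For context on `e55d64f709` above: moving torch.compile "to base" means the SFT and RL paths share one place that turns the compile config into `TrainingArguments` kwargs. A minimal sketch of that pattern, assuming hypothetical `cfg` field names (the `torch_compile*` kwargs are real `transformers.TrainingArguments` fields):

```python
# Sketch of centralizing torch.compile wiring in a shared base-args helper.
# `cfg` and its field names are hypothetical stand-ins for the parsed config;
# torch_compile, torch_compile_backend and torch_compile_mode are real
# transformers.TrainingArguments fields.
from types import SimpleNamespace

from transformers import TrainingArguments

cfg = SimpleNamespace(
    torch_compile=True, torch_compile_backend="inductor", torch_compile_mode=None
)

def torch_compile_kwargs(cfg) -> dict:
    """Build the compile-related kwargs once, for SFT and RL trainers alike."""
    kwargs = {}
    if getattr(cfg, "torch_compile", False):
        kwargs["torch_compile"] = True
        if getattr(cfg, "torch_compile_backend", None):
            kwargs["torch_compile_backend"] = cfg.torch_compile_backend
        if getattr(cfg, "torch_compile_mode", None):
            kwargs["torch_compile_mode"] = cfg.torch_compile_mode
    return kwargs

args = TrainingArguments(output_dir="./out", **torch_compile_kwargs(cfg))
```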
NanoCode012 | 0fc6499461 | fix: return notimplemented for ppo | 2025-05-22 17:48:08 +07:00
NanoCode012 | 24b61c1b67 | chore: lint | 2025-05-22 17:44:30 +07:00
NanoCode012 | b87850e11b | feat: call hook_pre_create_trainer for rl | 2025-05-22 17:42:46 +07:00
NanoCode012 | 49888eccb9 | fix: address pr feedback | 2025-05-16 14:36:38 +07:00
NanoCode012 | 00bfdb6b2b | fix: update handling of trainer_cls in RL | 2025-05-16 14:23:28 +07:00
NanoCode012 | 0b40f2aaf6 | fix: leftover bug from rebase | 2025-05-16 14:13:50 +07:00
NanoCode012 | 5c40896d19 | fix: pop optimizer cls in rl too | 2025-05-16 13:39:41 +07:00
NanoCode012 | 336c5f9db9 | fix: move pop optimizer_cls_and_kwargs | 2025-05-16 13:38:11 +07:00
NanoCode012 | ad229ffa91 | fix: add missing rex from rebase | 2025-05-16 13:36:11 +07:00
NanoCode012 | 7898f44e9b | fix: optimizer_cls_and_kwargs to be passed into trainer_kwargs | 2025-05-15 20:42:12 +07:00
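For context on `7898f44e9b` and the surrounding pop/move fixes: recent `transformers` releases (4.47+) let `Trainer` accept an uninstantiated optimizer via `optimizer_cls_and_kwargs`, so a builder can carry it inside a `trainer_kwargs` dict and pop it only for trainers that can't accept it. A hedged sketch of that hand-off (model construction elided):

```python
# Sketch of passing a custom optimizer through trainer_kwargs, per the
# commits above. optimizer_cls_and_kwargs is a real Trainer parameter
# (transformers >= 4.47); everything around it is illustrative.
import torch
from transformers import Trainer, TrainingArguments

trainer_kwargs = {
    # (optimizer class, init kwargs): Trainer instantiates the optimizer
    # itself, so the builder never has to construct it.
    "optimizer_cls_and_kwargs": (torch.optim.AdamW, {"lr": 2e-5}),
}

# Trainers that don't take the tuple would pop it out first:
#   trainer_kwargs.pop("optimizer_cls_and_kwargs", None)
trainer = Trainer(
    model=model,  # assumed to be built elsewhere
    args=TrainingArguments(output_dir="./out"),
    **trainer_kwargs,
)
```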
NanoCode012 | 7fd05c19f7 | chore: simplify dynamo check | 2025-05-15 20:28:13 +07:00
NanoCode012 | 64a57ebb62 | fix: remove redundant configs from rebase mistake | 2025-05-15 16:07:37 +07:00
NanoCode012 | 555190868a | fix: import path for trainer builder and submodules | 2025-05-15 15:49:37 +07:00
NanoCode012 | a1832953c4 | fix: update quarto autodoc | 2025-05-15 15:39:47 +07:00
NanoCode012 | 930472b7c7 | chore: add missing config to doc | 2025-05-14 16:56:34 +07:00
NanoCode012 | 8a336a2c33 | fix: remove deprecated clause | 2025-05-14 16:55:34 +07:00
NanoCode012 | 316b450a87 | feat: split training builder into sub modules | 2025-05-14 16:53:50 +07:00
NanoCode012 | c281c6e519 | fix(test): use RLType directly to skip needing to validate | 2025-05-14 16:17:34 +07:00
NanoCode012 | 06fae0d34e | chore: add back return typing from rebase | 2025-05-14 10:39:46 +07:00
NanoCode012 | 67b1df21aa | feat: add handling for seed and SP/ring-attn config | 2025-05-14 09:49:46 +07:00
NanoCode012 | 9af4bffd5d | fix(test): set sequence_parallel_degree default in base cfg | 2025-05-14 09:36:20 +07:00
NanoCode012 | 7c91cbddd3 | fix: duplicate optim setting | 2025-05-14 09:36:20 +07:00
NanoCode012 | 427e612d5a | feat: allow custom optim for rl methods | 2025-05-14 09:36:20 +07:00
NanoCode012 | b8025b34b9 | fix: lint | 2025-05-14 09:33:49 +07:00
NanoCode012 | 51c2adf3b1 | fix: remove redundant override | 2025-05-14 09:33:49 +07:00
Wing Lian | cbcb7b081b | use transformers default for logging steps, not None | 2025-05-14 09:33:49 +07:00
Wing Lian | 675561e745 | improve handling of warmup/logging steps | 2025-05-14 09:33:49 +07:00
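On the two Wing Lian commits above: the safer pattern is to omit a kwarg entirely when the user left it unset, so `TrainingArguments` keeps its own default (e.g. `logging_steps=500`) rather than receiving an explicit `None`. A small sketch, with hypothetical config field names:

```python
# Sketch of "use transformers default, not None": only forward keys the user
# actually set. The `cfg` fields are assumptions; logging_steps, warmup_steps
# and warmup_ratio are real TrainingArguments fields.
def steps_kwargs(cfg) -> dict:
    kwargs = {}
    if getattr(cfg, "logging_steps", None) is not None:
        kwargs["logging_steps"] = cfg.logging_steps  # else the default (500) applies
    if getattr(cfg, "warmup_steps", None) is not None:
        kwargs["warmup_steps"] = cfg.warmup_steps
    elif getattr(cfg, "warmup_ratio", None) is not None:
        kwargs["warmup_ratio"] = cfg.warmup_ratio  # explicit steps take precedence
    return kwargs
```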
NanoCode012 | a6ce7d7522 | fix: comments | 2025-05-14 09:33:49 +07:00
NanoCode012 | 1ea6ce73ed | feat: update CI on trainer_builder | 2025-05-14 09:33:49 +07:00
NanoCode012 | 8aa722a140 | fix: ignore max_length for grpo | 2025-05-14 09:33:49 +07:00
NanoCode012 | edaec9fe98 | fix: add missing weight_decay handling | 2025-05-14 09:33:28 +07:00
NanoCode012 | 8b6db0c72d | fix: update default max_steps | 2025-05-14 09:33:28 +07:00
NanoCode012 | 43f5373c79 | fix: remove unnecessary datacollator kwarg insert and pop | 2025-05-14 09:33:28 +07:00
NanoCode012 | 698268bc63 | fix: max_steps incorrectly set | 2025-05-14 09:33:28 +07:00
NanoCode012 | 9028eb2758 | fix: adding missing Any | 2025-05-14 09:33:28 +07:00
NanoCode012 | 077a54d2b1 | fix: deprecate old types | 2025-05-14 09:33:28 +07:00
NanoCode012 | 053e5fd7d1 | chore: consolidate eval_strat, loraplus, lr sched, max_length | 2025-05-14 09:33:28 +07:00
NanoCode012 | fd271b2547 | fix: consolidate handling of fp16, bf16, tf32 kwarg | 2025-05-14 09:33:28 +07:00
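On `fd271b2547`: consolidating the mixed-precision flags means one helper decides the `fp16`/`bf16`/`tf32` kwargs for every trainer instead of each path setting them ad hoc. A sketch under assumed config field names:

```python
# Sketch of consolidated mixed-precision kwarg handling. fp16/bf16/tf32 are
# real TrainingArguments flags; the `cfg` fields are assumptions. fp16 and
# bf16 are mutually exclusive, and tf32 only matters on Ampere+ NVIDIA GPUs.
def precision_kwargs(cfg) -> dict:
    kwargs = {}
    if getattr(cfg, "bf16", False):
        kwargs["bf16"] = True
    elif getattr(cfg, "fp16", False):
        kwargs["fp16"] = True
    if getattr(cfg, "tf32", None) is not None:
        kwargs["tf32"] = cfg.tf32  # toggles TF32 matmul/cudnn kernels
    return kwargs
```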
NanoCode012 | c268a0157a | feat: add report_to to set run name | 2025-05-14 09:33:28 +07:00
NanoCode012 | 6317945b67 | fix: refactor sft and rl trainer to set same base args | 2025-05-14 09:32:46 +07:00
NanoCode012 | 86ba574698 | feat: add num_proc and load from cache for rl mapping | 2025-05-14 09:32:04 +07:00
Wing Lian | 7fa1089cea | Atropos support (#2666) [skip ci] | 2025-05-13 08:30:58 -04:00
  * allow peft+liger+grpo and custom vllm serve for atropos support
  * set trainer class for RL
Dan Saunders | 80304c26a7 | SP GRPO support + batch SP fixes (#2643) | 2025-05-12 17:52:40 -04:00
  * ctx manager for SP
  * updates
  * update
  * further simplifying
  * simplifying
  * simplifying
  * reorg
  * batch api HF adapter for ring-flash-attn; cleanup and improvements
  * update
  * adding all batch ring-flash-attn methods via single adapter
  * fix
  * fixes for batch API funcs, simplify
  * fix
  * grpo sp support
  * progress
  * stronger subclassing of TRL GRPO trainer; custom distributed sampler
  * subclassing constructor
  * progress
  * finalizing SP + GRPO trainer
  * minimize diffs to GRPO trainer
  * remove (most of) the custom GRPO trainer logic
  * debug
  * debug
  * update
  * update
  * update
  * progress
  * cleanup
  * cleanup
  * minor changes
  * update
  * update
  * update
  * small changes
  * updates
  * cleanup; torch.compile ring_flash_attn functions to prevent numerical instability; lint
  * spacing
  * cleanup; log in pydantic model config only on main process
  * remove comment
  * fix sp sampler, update to latest upstream code, doc
  * add docs
  * update quartodoc autodoc contents
  * fix, simplifications
  * fixes + simplifications
  * review comments
  * lint
  * removing main process only logs in favor of #2608
  * fixes, additional smoke test
  * updates
  * more tests
  * update
  * fix grad accum bug (sort of)
  * lint, tests
  * todo
NanoCode012 | 67c4ea9c7c | fix: disable auto lora kernel if dropout nonzero (#2655) [skip ci] | 2025-05-12 16:23:53 -04:00
  * fix: disable auto lora kernel if dropout nonzero
  * Add comment from PR feedback
  Co-authored-by: Wing Lian <wing@axolotl.ai>