Commit Graph

1172 Commits

Author · SHA1 · Message · Date
NanoCode012
a65348dce4 fix: improve handling of cosine hyp 2025-05-23 16:26:31 +07:00
NanoCode012
29dbf98ed2 chore: refactor if condition 2025-05-23 12:47:13 +07:00
NanoCode012
36234dc746 Merge branch 'main' into fix/orpo_feature_parity 2025-05-23 12:42:10 +07:00
NanoCode012
2c5659020f fix: remove log sweep 2025-05-23 12:35:09 +07:00
NanoCode012
7496db524b fix: force access to lr & weight decay to error early when not provided 2025-05-23 12:31:02 +07:00
NanoCode012
255acd3da2 chore: simplify if condition 2025-05-23 12:28:34 +07:00
Dan Saunders
5f8f817200 SP context manager update (#2699)
* utilize accelerate prepare_data_loader with patching

* lint

* cleanup, fix

* update to support DPO quirk

* coderabbit commits, cleanup, remove dead code

* fix

* move ring attn patching to sp ctx manager

* lint

* lint

* test fix

* test fix
2025-05-22 11:18:32 -04:00
NanoCode012
aa0492c366 feat: do not find turn indices if turn is not trainable (#2696)
* feat: do not find turn indices if turn is not trainable

* fix: handle edge case where train on eos/eot is all

* fix: improve warning message
2025-05-22 19:19:59 +07:00
NanoCode012
798b5f5cfd fix(RL): address plugin rl overwriting trainer_cls (#2697) [skip ci]
* fix: plugin rl overwrite trainer_cls

* feat(test): add test to catch trainer_cls is not None
2025-05-22 19:19:12 +07:00
NanoCode012
152d0b67d2 Merge branch 'main' into fix/orpo_feature_parity 2025-05-22 19:11:45 +07:00
NanoCode012
8010376db9 fix: change to pass dict via arg instead of updating dict 2025-05-22 18:53:21 +07:00
NanoCode012
bc53e80387 fix: add handling for beta3 and epsilon2 2025-05-22 18:48:36 +07:00
NanoCode012
5346b99a88 fix: convert TrainerBuilderBase class variables to instance variables 2025-05-22 18:46:03 +07:00
NanoCode012
79472241e8 chore: refactor set_base_training_args into smaller modules 2025-05-22 18:39:33 +07:00
NanoCode012
58842ded9c chore: refactor if condition 2025-05-22 18:25:45 +07:00
NanoCode012
2e2f42918d fix: remove duplicate lora plus setting 2025-05-22 18:24:41 +07:00
NanoCode012
e8eb3bfdf3 fix: catch unhandled custom optimizer 2025-05-22 18:14:44 +07:00
NanoCode012
cd31394e70 feat: move epoch setting to base 2025-05-22 18:11:17 +07:00
NanoCode012
66d4319d80 fix: optimizer cls not being popped 2025-05-22 18:07:07 +07:00
NanoCode012
c6e730df64 chore: remove unused importlib.util import 2025-05-22 18:00:00 +07:00
NanoCode012
e55d64f709 feat: move torch compile to base and refactor collator setting 2025-05-22 17:56:58 +07:00
NanoCode012
0fc6499461 fix: return NotImplemented for PPO 2025-05-22 17:48:08 +07:00
NanoCode012
24b61c1b67 chore: lint 2025-05-22 17:44:30 +07:00
NanoCode012
b87850e11b feat: call hook_pre_create_trainer for rl 2025-05-22 17:42:46 +07:00
Dan Saunders
6aa41740df SP dataloader patching + removing custom sampler / dataloader logic (#2686)
* utilize accelerate prepare_data_loader with patching

* lint

* cleanup, fix

* update to support DPO quirk

* small change

* coderabbit commits, cleanup, remove dead code

* quarto fix

* patch fix

* review comments

* moving monkeypatch up one level

* fix
2025-05-21 11:20:20 -04:00
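
The SP dataloader PR above (#2686, followed up in #2699) swaps axolotl's custom sampler/dataloader logic for accelerate's prepare_data_loader. A minimal sketch of the idea, with illustrative process counts and SP-group wiring rather than the actual axolotl code path:

    # Shard one dataloader across data-parallel groups with accelerate's
    # prepare_data_loader; ranks inside the same sequence-parallel (SP)
    # group must see identical batches, so sharding is per DP group.
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from accelerate.data_loader import prepare_data_loader

    dataset = TensorDataset(torch.arange(64).unsqueeze(1))
    base_loader = DataLoader(dataset, batch_size=4, shuffle=False)

    sp_degree = 2                      # assumed SP world size
    world_size, global_rank = 4, 0     # assumed totals for this sketch
    num_dp_groups = world_size // sp_degree
    dp_group_index = global_rank // sp_degree

    sharded = prepare_data_loader(
        base_loader,
        num_processes=num_dp_groups,   # shard over DP groups, not ranks
        process_index=dp_group_index,
    )

    for batch in sharded:
        pass  # every SP rank in a group iterates the same shard
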
Wing Lian
a27b909c5c GRPO fixes (peft) (#2676)
* don't set peft_config on grpo to prevent double peft wrap

* remove overrides needed to support bug

* fix grpo tests

* require more CPU for multigpu to help with torch compile for vllm
2025-05-16 15:47:03 -04:00
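
On the peft fix above: the bug class is wrapping a model with PEFT twice, once by the caller and once more by the trainer when peft_config is also passed. A generic illustration, assuming plain peft/transformers usage (the GRPOTrainer calls are sketched in comments and the config values are placeholders):

    # Double-wrap pitfall: apply LoRA once yourself OR pass peft_config
    # to the trainer -- doing both wraps the PeftModel a second time.
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    lora_cfg = LoraConfig(task_type="CAUSAL_LM", r=8, target_modules=["c_attn"])
    model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model

    model = get_peft_model(model, lora_cfg)  # first (intended) wrap

    # Buggy: the trainer wraps the already-wrapped model a second time.
    # trainer = GRPOTrainer(model=model, peft_config=lora_cfg, ...)

    # Per the commit: hand over the wrapped model and omit peft_config.
    # trainer = GRPOTrainer(model=model, ...)
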
xzuyn
6cb07b9d12 Fix for setting adam_beta3 and adam_epsilon2 for CAME Optimizer (#2654) [skip ci]
* make setting `adam_beta3` and `adam_epsilon2` work correctly

* update config docs so users know args are specific to CAME optim

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-16 15:46:50 -04:00
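
For context on the CAME fix: CAME extends Adam with a third beta and a second epsilon, which is where adam_beta3 and adam_epsilon2 land. A minimal sketch assuming the came_pytorch package's interface, with illustrative hyperparameter values:

    # CAME takes a 3-tuple of betas and a 2-tuple of eps, unlike Adam's
    # two betas and single eps; the extra slots are what the config's
    # adam_beta3 / adam_epsilon2 feed. Values below are illustrative.
    import torch
    from came_pytorch import CAME

    model = torch.nn.Linear(16, 16)
    optimizer = CAME(
        model.parameters(),
        lr=2e-4,
        betas=(0.9, 0.999, 0.9999),  # third entry <- adam_beta3
        eps=(1e-30, 1e-16),          # second entry <- adam_epsilon2
        weight_decay=0.01,
    )
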
C080
288653adb6 Fix: Make MLflow config artifact logging respect hf_mlflow_log_artifacts (#2675) [skip ci]
* Fix: Make MLflow config artifact logging respect hf_mlflow_log_artifacts setting

* cleanup and lint

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-16 15:46:31 -04:00
xzuyn
f661858fc4 Print dataset name (#2668) [skip ci] 2025-05-16 13:06:58 -04:00
Eric Meier
c837c4a424 Add missing init file to liger plugin (#2670) [skip ci] 2025-05-16 13:06:46 -04:00
michelyang
c9797de6bb Add num_proc to fix slow dataset processing issue (#2681) [skip ci] 2025-05-16 13:06:20 -04:00
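
The num_proc change above targets slow preprocessing; in plain datasets terms it parallelizes map over worker processes. A small sketch where the dataset, mapping function, and worker count are all illustrative:

    # datasets.map runs single-process by default; num_proc spawns
    # workers so tokenization-style preprocessing stops bottlenecking.
    from datasets import Dataset

    ds = Dataset.from_dict({"text": ["hello world"] * 10_000})

    def to_upper(example):
        return {"text": example["text"].upper()}

    ds = ds.map(to_upper, num_proc=8)  # assumed worker count
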
NanoCode012
86472715da fix: remove doc string imports in monkeypatches (#2671) [skip ci] 2025-05-16 13:05:55 -04:00
NanoCode012
49888eccb9 fix: address pr feedback 2025-05-16 14:36:38 +07:00
NanoCode012
00bfdb6b2b fix: update handling of trainer_cls in RL 2025-05-16 14:23:28 +07:00
NanoCode012
0b40f2aaf6 fix: leftover bug from rebase 2025-05-16 14:13:50 +07:00
NanoCode012
5c40896d19 fix: pop optimizer cls in rl too 2025-05-16 13:39:41 +07:00
NanoCode012
336c5f9db9 fix: move pop optimizer_cls_and_kwargs 2025-05-16 13:38:11 +07:00
NanoCode012
ad229ffa91 fix: add missing rex from rebase 2025-05-16 13:36:11 +07:00
NanoCode012
7898f44e9b fix: optimizer_cls_and_kwargs to be passed into trainer_kwargs 2025-05-15 20:42:12 +07:00
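
On optimizer_cls_and_kwargs: recent transformers releases let Trainer accept an (optimizer class, kwargs) tuple directly, which is presumably what this commit threads through trainer_kwargs. A hedged sketch with a placeholder model and arguments:

    # Pass a custom optimizer class to transformers' Trainer via
    # optimizer_cls_and_kwargs (supported in recent transformers
    # releases) instead of instantiating the optimizer by hand.
    import torch
    from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

    model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder
    args = TrainingArguments(output_dir="out", max_steps=10)

    trainer = Trainer(
        model=model,
        args=args,
        optimizer_cls_and_kwargs=(torch.optim.AdamW, {"lr": 1e-4}),
    )
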
NanoCode012
7fd05c19f7 chore: simplify dynamo check 2025-05-15 20:28:13 +07:00
NanoCode012
64a57ebb62 fix: remove redundant configs from rebase mistake 2025-05-15 16:07:37 +07:00
NanoCode012
555190868a fix: import path for trainer builder and submodules 2025-05-15 15:49:37 +07:00
NanoCode012
8a336a2c33 fix: remove deprecated clause 2025-05-14 16:55:34 +07:00
NanoCode012
316b450a87 feat: split training builder into submodules 2025-05-14 16:53:50 +07:00
NanoCode012
06fae0d34e chore: add back return typing from rebase 2025-05-14 10:39:46 +07:00
NanoCode012
67b1df21aa feat: add handling for seed and SP/ring-attn config 2025-05-14 09:49:46 +07:00
NanoCode012
7c91cbddd3 fix: duplicate optim setting 2025-05-14 09:36:20 +07:00
NanoCode012
427e612d5a feat: allow custom optim for rl methods 2025-05-14 09:36:20 +07:00
NanoCode012
b8025b34b9 fix: lint 2025-05-14 09:33:49 +07:00
NanoCode012
51c2adf3b1 fix: remove redundant override 2025-05-14 09:33:49 +07:00