NanoCode012
a65348dce4
fix: improve handling of cosine hyp
2025-05-23 16:26:31 +07:00
NanoCode012
29dbf98ed2
chore: refactor if condition
2025-05-23 12:47:13 +07:00
NanoCode012
36234dc746
Merge branch 'main' into fix/orpo_feature_parity
2025-05-23 12:42:10 +07:00
NanoCode012
2c5659020f
fix: remove log sweep
2025-05-23 12:35:09 +07:00
NanoCode012
7496db524b
fix: force access to lr & weight decay to raise an early error when not provided
2025-05-23 12:31:02 +07:00
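A minimal sketch of the early-error pattern this commit describes, with hypothetical config keys (`learning_rate`, `weight_decay`); the point is to fail at build time rather than deep inside training:

```python
# Hypothetical sketch: read required hyperparameters eagerly so a missing
# value raises immediately, instead of surfacing mid-training.
def build_base_args(cfg: dict) -> dict:
    try:
        lr = cfg["learning_rate"]          # KeyError here = early, clear failure
        weight_decay = cfg["weight_decay"]
    except KeyError as err:
        raise ValueError(f"required hyperparameter missing: {err}") from err
    return {"learning_rate": lr, "weight_decay": weight_decay}
```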
NanoCode012
255acd3da2
chore: simplify if condition
2025-05-23 12:28:34 +07:00
Dan Saunders
5f8f817200
SP context manager update ( #2699 )
* utilize accelerate prepare_data_loader with patching
* lint
* cleanup, fix
* update to support DPO quirk
* coderabbit commits, cleanup, remove dead code
* fix
* move ring attn patching to sp ctx manager
* lint
* lint
* test fix
* test fix
2025-05-22 11:18:32 -04:00
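The shape such a context manager takes, sketched with generic names (the real patch targets and helpers in axolotl differ): patch on enter, always restore on exit.

```python
from contextlib import contextmanager

@contextmanager
def sp_patched(obj, attr, replacement):
    # Hypothetical sketch: swap in a ring-attn-aware callable for the
    # duration of the block, restoring the original even on error.
    original = getattr(obj, attr)
    setattr(obj, attr, replacement)
    try:
        yield
    finally:
        setattr(obj, attr, original)
```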
NanoCode012
aa0492c366
feat: do not find turn indices if turn is not trainable ( #2696 )
* feat: do not find turn indices if turn is not trainable
* fix: handle edge case where train_on_eos/eot is set to "all"
* fix: improve warning message
2025-05-22 19:19:59 +07:00
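A sketch of the short-circuit (field names and the locator are assumptions): a turn that contributes no trainable labels never needs its token span located, so the expensive search is skipped entirely.

```python
def turn_token_spans(turns, locate):
    # `locate` stands in for the costly index search; skip it for turns
    # whose tokens would all be masked out of the loss anyway.
    return [locate(turn) if turn.get("trainable") else None for turn in turns]
```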
NanoCode012
798b5f5cfd
fix(RL): address plugin rl overwriting trainer_cls ( #2697 ) [skip ci]
* fix: plugin rl overwriting trainer_cls
* feat(test): add test to catch trainer_cls is not None
2025-05-22 19:19:12 +07:00
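The guard amounts to something like this sketch (names are assumptions): a plugin-supplied trainer class must not clobber one that has already been resolved.

```python
def resolve_trainer_cls(current_cls, plugin_cls):
    # Keep an already-resolved trainer; only fall back to the plugin's.
    return current_cls if current_cls is not None else plugin_cls
```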
NanoCode012
152d0b67d2
Merge branch 'main' into fix/orpo_feature_parity
2025-05-22 19:11:45 +07:00
NanoCode012
8010376db9
fix: change to pass dict via arg instead of updating dict
2025-05-22 18:53:21 +07:00
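A sketch of the shape change (hypothetical names): build and return a new dict rather than mutating the caller's dict in place, which keeps the data flow explicit and avoids aliasing surprises.

```python
def with_scheduler_args(base_args: dict, cfg: dict) -> dict:
    # No in-place mutation of base_args; the caller gets a fresh dict.
    return {**base_args, "lr_scheduler_type": cfg.get("lr_scheduler", "cosine")}
```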
NanoCode012
bc53e80387
fix: add handling for beta3 and epsilon2
2025-05-22 18:48:36 +07:00
NanoCode012
5346b99a88
fix: convert TrainerBuilderBase class variables to instance variables
2025-05-22 18:46:03 +07:00
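Why this matters, in a contrived illustration: a mutable class variable is shared by every instance, so one builder's state leaks into the next.

```python
class BadBuilder:
    args = {}                  # one dict shared across ALL instances

class GoodBuilder:
    def __init__(self):
        self.args = {}         # each instance owns its own dict

a, b = BadBuilder(), BadBuilder()
a.args["lr"] = 1e-4
assert b.args == {"lr": 1e-4}  # b sees a's write: the bug class being fixed

c, d = GoodBuilder(), GoodBuilder()
c.args["lr"] = 1e-4
assert d.args == {}            # isolated, as intended
```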
NanoCode012
79472241e8
chore: refactor set_base_training_args into smaller modules
2025-05-22 18:39:33 +07:00
NanoCode012
58842ded9c
chore: refactor if condition
2025-05-22 18:25:45 +07:00
NanoCode012
2e2f42918d
fix: remove duplicate lora plus setting
2025-05-22 18:24:41 +07:00
NanoCode012
e8eb3bfdf3
fix: catch unhandled custom optimizer
2025-05-22 18:14:44 +07:00
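A sketch of the catch-all (the optimizer names are illustrative): resolution should fail loudly on an unknown name instead of silently falling through.

```python
def resolve_optimizer(name: str) -> str:
    known = {"adamw_torch", "came_pytorch", "muon"}  # illustrative set
    if name not in known:
        raise ValueError(f"unhandled custom optimizer: {name!r}")
    return name
```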
NanoCode012
cd31394e70
feat: move epoch setting to base
2025-05-22 18:11:17 +07:00
NanoCode012
66d4319d80
fix: optimizer cls not being popped
2025-05-22 18:07:07 +07:00
NanoCode012
c6e730df64
chore: remove unused importlib.util import
2025-05-22 18:00:00 +07:00
NanoCode012
e55d64f709
feat: move torch compile to base and refactor collator setting
2025-05-22 17:56:58 +07:00
NanoCode012
0fc6499461
fix: return NotImplemented for PPO
2025-05-22 17:48:08 +07:00
NanoCode012
24b61c1b67
chore: lint
2025-05-22 17:44:30 +07:00
NanoCode012
b87850e11b
feat: call hook_pre_create_trainer for rl
2025-05-22 17:42:46 +07:00
Dan Saunders
6aa41740df
SP dataloader patching + removing custom sampler / dataloader logic ( #2686 )
* utilize accelerate prepare_data_loader with patching
* lint
* cleanup, fix
* update to support DPO quirk
* small change
* coderabbit commits, cleanup, remove dead code
* quarto fix
* patch fix
* review comments
* moving monkeypatch up one level
* fix
2025-05-21 11:20:20 -04:00
Wing Lian
a27b909c5c
GRPO fixes (peft) ( #2676 )
* don't set peft_config on grpo to prevent double peft wrap
* remove overrides needed to support bug
* fix grpo tests
* require more CPU for multigpu to help with torch compile for vllm
2025-05-16 15:47:03 -04:00
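The double-wrap guard, sketched (the kwargs shape here is an assumption): if the model arrives already PEFT-wrapped, don't also hand `peft_config` to the trainer, or it wraps a second time.

```python
from peft import PeftModel

def grpo_trainer_kwargs(model, peft_config):
    if isinstance(model, PeftModel):
        peft_config = None   # already wrapped upstream; avoid wrapping twice
    return {"model": model, "peft_config": peft_config}
```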
xzuyn
6cb07b9d12
Fix for setting adam_beta3 and adam_epsilon2 for CAME Optimizer ( #2654 ) [skip ci]
* make setting `adam_beta3` and `adam_epsilon2` work correctly
* update config docs so users know args are specific to CAME optim
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-16 15:46:50 -04:00
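A sketch of the mapping, assuming came_pytorch's signature (betas as a 3-tuple, eps as a 2-tuple): the extra config keys must be folded into those tuples rather than passed as standalone kwargs.

```python
def came_optimizer_kwargs(cfg: dict) -> dict:
    # Defaults mirror came_pytorch's documented defaults; treat them as
    # assumptions if the pinned version differs.
    return {
        "betas": (
            cfg.get("adam_beta1", 0.9),
            cfg.get("adam_beta2", 0.999),
            cfg.get("adam_beta3", 0.9999),
        ),
        "eps": (cfg.get("adam_epsilon", 1e-30), cfg.get("adam_epsilon2", 1e-16)),
    }
```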
C080
288653adb6
Fix: Make MLflow config artifact logging respect hf_mlflow_log_artifacts ( #2675 ) [skip ci]
* Fix: Make MLflow config artifact logging respect hf_mlflow_log_artifacts setting
* cleanup and lint
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-05-16 15:46:31 -04:00
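The gate reduces to something like this sketch (the env var is the one transformers' MLflowCallback reads; the logging helper is hypothetical):

```python
import os

def maybe_log_config_artifact(log_artifact, config_path: str) -> None:
    # Upload the config only when the user has opted in via the env var.
    if os.environ.get("HF_MLFLOW_LOG_ARTIFACTS", "FALSE").upper() in ("TRUE", "1"):
        log_artifact(config_path)
```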
xzuyn
f661858fc4
Print dataset name ( #2668 ) [skip ci]
2025-05-16 13:06:58 -04:00
Eric Meier
c837c4a424
Add missing init file to liger plugin ( #2670 ) [skip ci]
2025-05-16 13:06:46 -04:00
michelyang
c9797de6bb
Add num_proc to fix slow dataset processing issue ( #2681 ) [skip ci]
2025-05-16 13:06:20 -04:00
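Illustrative usage of the knob being threaded through (file name and process count are placeholders): `datasets.Dataset.map` accepts `num_proc` to fan preprocessing out across worker processes.

```python
from datasets import load_dataset

ds = load_dataset("json", data_files="train.jsonl", split="train")
# num_proc parallelizes the map across processes instead of one.
ds = ds.map(lambda example: example, num_proc=8)
```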
NanoCode012
86472715da
fix: remove doc string imports in monkeypatches ( #2671 ) [skip ci]
2025-05-16 13:05:55 -04:00
NanoCode012
49888eccb9
fix: address PR feedback
2025-05-16 14:36:38 +07:00
NanoCode012
00bfdb6b2b
fix: update handling of trainer_cls in RL
2025-05-16 14:23:28 +07:00
NanoCode012
0b40f2aaf6
fix: leftover bug from rebase
2025-05-16 14:13:50 +07:00
NanoCode012
5c40896d19
fix: pop optimizer cls in rl too
2025-05-16 13:39:41 +07:00
NanoCode012
336c5f9db9
fix: move pop optimizer_cls_and_kwargs
2025-05-16 13:38:11 +07:00
NanoCode012
ad229ffa91
fix: add missing rex from rebase
2025-05-16 13:36:11 +07:00
NanoCode012
7898f44e9b
fix: pass optimizer_cls_and_kwargs into trainer_kwargs
2025-05-15 20:42:12 +07:00
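A sketch of where the pair now travels (assumes a transformers version whose `Trainer` accepts `optimizer_cls_and_kwargs`):

```python
import torch

# The (class, kwargs) pair rides inside trainer_kwargs and is forwarded
# to Trainer(**trainer_kwargs) alongside model and args.
trainer_kwargs = {
    "optimizer_cls_and_kwargs": (torch.optim.AdamW, {"lr": 1e-4}),
}
```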
NanoCode012
7fd05c19f7
chore: simplify dynamo check
2025-05-15 20:28:13 +07:00
NanoCode012
64a57ebb62
fix: remove redundant configs from rebase mistake
2025-05-15 16:07:37 +07:00
NanoCode012
555190868a
fix: import path for trainer builder and submodules
2025-05-15 15:49:37 +07:00
NanoCode012
8a336a2c33
fix: remove deprecated clause
2025-05-14 16:55:34 +07:00
NanoCode012
316b450a87
feat: split trainer builder into submodules
2025-05-14 16:53:50 +07:00
NanoCode012
06fae0d34e
chore: add back return typing from rebase
2025-05-14 10:39:46 +07:00
NanoCode012
67b1df21aa
feat: add handling for seed and SP/ring-attn config
2025-05-14 09:49:46 +07:00
NanoCode012
7c91cbddd3
fix: duplicate optim setting
2025-05-14 09:36:20 +07:00
NanoCode012
427e612d5a
feat: allow custom optim for rl methods
2025-05-14 09:36:20 +07:00
NanoCode012
b8025b34b9
fix: lint
2025-05-14 09:33:49 +07:00
NanoCode012
51c2adf3b1
fix: remove redundant override
2025-05-14 09:33:49 +07:00