Commit Graph

2136 Commits

Author SHA1 Message Date
NanoCode012  79472241e8  chore: refactor set_base_training_args into smaller modules  2025-05-22 18:39:33 +07:00
NanoCode012  58842ded9c  chore: refactor if condition  2025-05-22 18:25:45 +07:00
NanoCode012  2e2f42918d  fix: remove duplicate lora plus setting  2025-05-22 18:24:41 +07:00
NanoCode012  e8eb3bfdf3  fix: catch unhandled custom optimizer  2025-05-22 18:14:44 +07:00
NanoCode012  cd31394e70  feat: move epoch setting to base  2025-05-22 18:11:17 +07:00
NanoCode012  66d4319d80  fix: optimizer cls not being popped  2025-05-22 18:07:07 +07:00
NanoCode012  c6e730df64  chore: remove unused importlib.util import  2025-05-22 18:00:00 +07:00
NanoCode012  e55d64f709  feat: moved torch compile to base and refactor collator setting  2025-05-22 17:56:58 +07:00
NanoCode012  0fc6499461  fix: return notimplemented for ppo  2025-05-22 17:48:08 +07:00
NanoCode012  24b61c1b67  chore: lint  2025-05-22 17:44:30 +07:00
NanoCode012  b87850e11b  feat: call hook_pre_create_trainer for rl  2025-05-22 17:42:46 +07:00
NanoCode012  49888eccb9  fix: address pr feedback  2025-05-16 14:36:38 +07:00
NanoCode012  00bfdb6b2b  fix: update handling of trainer_cls in RL  2025-05-16 14:23:28 +07:00
NanoCode012  0b40f2aaf6  fix: leftover bug from rebase  2025-05-16 14:13:50 +07:00
NanoCode012  5c40896d19  fix: pop optimizer cls in rl too  2025-05-16 13:39:41 +07:00
NanoCode012  336c5f9db9  fix: move pop optimizer_cls_and_kwargs  2025-05-16 13:38:11 +07:00
NanoCode012  ad229ffa91  fix: add missing rex from rebase  2025-05-16 13:36:11 +07:00
NanoCode012  7898f44e9b  fix: optimizer_cls_and_kwargs to be passed into trainer_kwargs  2025-05-15 20:42:12 +07:00
NanoCode012  7fd05c19f7  chore: simplify dynamo check  2025-05-15 20:28:13 +07:00
NanoCode012  64a57ebb62  fix: remove redundant configs from rebase mistake  2025-05-15 16:07:37 +07:00
NanoCode012  555190868a  fix: import path for trainer builder and submodules  2025-05-15 15:49:37 +07:00
NanoCode012  a1832953c4  fix: update quarto autodoc  2025-05-15 15:39:47 +07:00
NanoCode012  930472b7c7  chore: add missing config to doc  2025-05-14 16:56:34 +07:00
NanoCode012  8a336a2c33  fix: remove deprecated clause  2025-05-14 16:55:34 +07:00
NanoCode012  316b450a87  feat: split training builder into sub modules  2025-05-14 16:53:50 +07:00
NanoCode012  c281c6e519  fix(test): use RLType directly to skip needing to validate  2025-05-14 16:17:34 +07:00
NanoCode012  06fae0d34e  chore: add back return typing from rebase  2025-05-14 10:39:46 +07:00
NanoCode012  67b1df21aa  feat: add handling for seed and SP/ring-attn config  2025-05-14 09:49:46 +07:00
NanoCode012  9af4bffd5d  fix(test): set sequence_parallel_degree default in base cfg  2025-05-14 09:36:20 +07:00
NanoCode012  7c91cbddd3  fix: duplicate optim setting  2025-05-14 09:36:20 +07:00
NanoCode012  427e612d5a  feat: allow custom optim for rl methods  2025-05-14 09:36:20 +07:00
NanoCode012  b8025b34b9  fix: lint  2025-05-14 09:33:49 +07:00
NanoCode012  51c2adf3b1  fix: remove redundant override  2025-05-14 09:33:49 +07:00
Wing Lian  cbcb7b081b  use transformers default for logging steps, not None  2025-05-14 09:33:49 +07:00
Wing Lian  675561e745  improve handling of warmup/logging steps  2025-05-14 09:33:49 +07:00
NanoCode012  a6ce7d7522  fix: comments  2025-05-14 09:33:49 +07:00
NanoCode012  1ea6ce73ed  feat: update CI on trainer_builder  2025-05-14 09:33:49 +07:00
NanoCode012  8aa722a140  fix: ignore max_length for grpo  2025-05-14 09:33:49 +07:00
NanoCode012  edaec9fe98  fix: add missing weight_decay handling  2025-05-14 09:33:28 +07:00
NanoCode012  8b6db0c72d  fix: update default max_steps  2025-05-14 09:33:28 +07:00
NanoCode012  43f5373c79  fix: remove unnecessary datacollator kwarg insert and pop  2025-05-14 09:33:28 +07:00
NanoCode012  698268bc63  fix: max_steps incorrectly set  2025-05-14 09:33:28 +07:00
NanoCode012  9028eb2758  fix: adding missing Any  2025-05-14 09:33:28 +07:00
NanoCode012  077a54d2b1  fix: deprecate old types  2025-05-14 09:33:28 +07:00
NanoCode012  053e5fd7d1  chore: consolidate eval_strat, loraplus, lr sched, max_length  2025-05-14 09:33:28 +07:00
NanoCode012  fd271b2547  fix: consolidate handling of fp16, bf16, tf32 kwarg  2025-05-14 09:33:28 +07:00
NanoCode012  c268a0157a  feat: add report_to to set run name  2025-05-14 09:33:28 +07:00
NanoCode012  6317945b67  fix: refactor sft and rl trainer to set same base args  2025-05-14 09:32:46 +07:00
NanoCode012  86ba574698  feat: add num_proc and load from cache for rl mapping  2025-05-14 09:32:04 +07:00
Wing Lian  7fa1089cea  Atropos support (#2666) [skip ci]  2025-05-13 08:30:58 -04:00
    * allow peft+liger+grpo and custom vllm serve for atropos support
    * set trainer class for RL
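Several commits in this range (7898f44e9b, 336c5f9db9, 5c40896d19, 66d4319d80) concern removing a custom optimizer spec from the shared trainer kwargs before handing them to a trainer class that does not accept that argument. A minimal sketch of that pop pattern — the function name and dict contents here are illustrative, not Axolotl's actual API:

```python
# Sketch of the "pop optimizer_cls_and_kwargs" pattern referenced in the
# commit messages above: the custom optimizer spec is removed from the
# shared kwargs so trainer classes that don't accept the key never see it.

def pop_optimizer_spec(trainer_kwargs: dict):
    """Remove and return the (optimizer_cls, kwargs) spec, or None if absent."""
    return trainer_kwargs.pop("optimizer_cls_and_kwargs", None)

trainer_kwargs = {
    "max_steps": 100,
    "optimizer_cls_and_kwargs": ("AdamW", {"lr": 1e-4}),
}
spec = pop_optimizer_spec(trainer_kwargs)
print(spec)                                           # ('AdamW', {'lr': 0.0001})
print("optimizer_cls_and_kwargs" in trainer_kwargs)   # False
```

Popping (rather than reading) the key matters because `trainer_kwargs` is subsequently unpacked into the trainer constructor, where an unexpected keyword would raise a `TypeError`.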