feat(doc): add info on how to use dapo / dr grpo and misc doc fixes (#2673) [skip ci]

* feat(doc): add info on how to use dapo / dr grpo

* chore: add missing config to docs

* fix: missing comment

* fix: add missing scheduler from schema

* chore: refactor lr scheduler docs

* fix: remove log_sweep
This commit is contained in:
NanoCode012
2025-05-28 15:51:04 +07:00
committed by GitHub
parent add2025253
commit 6b6370f4e3
3 changed files with 50 additions and 9 deletions

View File

@@ -242,16 +242,12 @@
# early_stopping_patience: 3
# # Specify a scheduler and kwargs to use with the optimizer
# lr_scheduler: # 'one_cycle' | 'log_sweep' | empty for cosine
# lr_scheduler: # 'one_cycle' | empty for cosine
# lr_scheduler_kwargs:
# # For one_cycle optim
# lr_div_factor: # Learning rate div factor
# # For log_sweep optim
# log_sweep_min_lr:
# log_sweep_max_lr:
# # Specify optimizer
# # Valid values are driven by the Transformers OptimizerNames class, see:
# # https://github.com/huggingface/transformers/blob/95b374952dc27d8511541d6f5a4e22c9ec11fb24/src/transformers/training_args.py#L134