Merge branch 'main' into 775-option-to-drop-vs-truncate-on-rows-longer-than-context-length
@@ -635,7 +635,9 @@ weight_decay:
# adamw hyperparams
adam_beta1:
adam_beta2:
adam_beta3: # only used for CAME Optimizer
adam_epsilon:
adam_epsilon2: # only used for CAME Optimizer
# Gradient clipping max norm
max_grad_norm: