* use warmup_ratio as a better default than warmup_steps, since the right number of warmup steps is data dependent
* replace remainder of warmup_steps
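
To illustrate why a ratio makes a safer default, here is a minimal sketch of how a warmup ratio could be turned into a step count. The helper `warmup_steps_from_ratio` and the step totals are hypothetical and are not taken from this change. Rounding up with `math.ceil` is an assumption, and the actual trainer may round differently.

```python
import math


def warmup_steps_from_ratio(total_steps: int, warmup_ratio: float) -> int:
    """Hypothetical helper: derive a warmup step count from a ratio.

    Rounding up is an assumption made for this sketch.
    """
    if not 0.0 <= warmup_ratio <= 1.0:
        raise ValueError("warmup_ratio must be in [0, 1]")
    return math.ceil(total_steps * warmup_ratio)


# The same ratio scales with the length of the run, whereas a fixed
# warmup_steps value would be too long for one run or too short for the other.
small_run = warmup_steps_from_ratio(total_steps=200, warmup_ratio=0.1)
large_run = warmup_steps_from_ratio(total_steps=20_000, warmup_ratio=0.1)
print(small_run, large_run)
```

With a fixed `warmup_steps` of, say, 100, the smaller run would spend half of its training in warmup, while the ratio keeps the warmup share constant across dataset sizes.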