* fix deprecated deepspeed stage3_gather_16bit_weights_on_model_save arg
* replace the rest of the migrated deepspeed params
gradient_clipping
auto
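
A minimal sketch of the kind of key migration these commits describe, written against a plain config dict. It assumes the deprecated spelling was `stage3_gather_fp16_weights_on_model_save` (not stated in this log), and `migrate_ds_config` and `RENAMED_ZERO_KEYS` are hypothetical names used only for illustration, not part of DeepSpeed or any other library.

```python
import copy

# Assumption: the old (deprecated) key name on the left is not taken from this
# commit log; only the new name on the right appears in it.
RENAMED_ZERO_KEYS = {
    "stage3_gather_fp16_weights_on_model_save": "stage3_gather_16bit_weights_on_model_save",
}


def migrate_ds_config(config):
    """Return a copy of `config` with deprecated ZeRO keys renamed.

    Hypothetical helper: it only touches the `zero_optimization` section and
    leaves every other entry (e.g. `gradient_clipping`) unchanged.
    """
    migrated = copy.deepcopy(config)
    zero = migrated.get("zero_optimization", {})
    for old_key, new_key in RENAMED_ZERO_KEYS.items():
        if old_key in zero:
            value = zero.pop(old_key)
            # Keep an explicitly set new-style value if both spellings are present.
            zero.setdefault(new_key, value)
    return migrated


old_config = {
    "gradient_clipping": "auto",
    "zero_optimization": {
        "stage": 3,
        "stage3_gather_fp16_weights_on_model_save": True,
    },
}

new_config = migrate_ds_config(old_config)
```

The input dict is deep-copied, so a caller can compare the old and new configs side by side before writing anything back to disk.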