* add callback to checkpoint model on first step
* remove debug
* add test cases; update existing tests not to save on first step
* move test out of solo
* delete
* default to False
* fix typo
* make gemma3 work with packing
* add multi-gpu e2e test for ci
* update gemma3 model namespace to use mirror
* add gradient checkpointing to multi-gpu e2e ci
* update gemma3 examples for use_reentrant and fix ddp find unused params
* fix tests for gemma3
* fix import for test utils
* set correct train loss for gemma3 e2e
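
For reviewers unfamiliar with the first-step checkpoint behaviour, here is a minimal sketch of the idea, not the code in this change. It assumes a trainer that calls `on_step_end` after each optimizer step and saves when a `should_save` flag is set. `StepState`, `StepControl`, `SaveOnFirstStepCallback`, and the `enabled` flag are hypothetical names for illustration. Tying the `False` default to the "default to False" entry is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class StepState:
    """Hypothetical stand-in for a trainer's progress state."""
    global_step: int = 0


@dataclass
class StepControl:
    """Hypothetical stand-in for a trainer's control flags."""
    should_save: bool = False


class SaveOnFirstStepCallback:
    """Requests a checkpoint once the first optimizer step has completed."""

    def __init__(self, enabled: bool = False):
        # Off by default (assumed to be what "default to False" refers to).
        self.enabled = enabled

    def on_step_end(self, state: StepState, control: StepControl) -> StepControl:
        # Only the first step triggers a save; later steps leave the flag alone.
        if self.enabled and state.global_step == 1:
            control.should_save = True
        return control
```

Saving this early is mainly useful as a smoke test: a broken checkpoint path fails after one step instead of at the first scheduled save. Tests that count checkpoints would need the flag off, which fits the "update existing tests not to save on first step" entry.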