* support for custom lr groups for non-embedding modules
invert name check for group modules
include lr_groups in training args
additional conditional for creating optimizer
fix regular params as w weight decay
fix lookup and add docs
* address pr feedback