* Adds targeting of shared expert and attention modules in each layer
* Update VRAM usage

---------

Co-authored-by: Mike Tung <mike@diffbot.com>
1.3 KiB