axolotl/examples
miketung 33975ce4bc feat(qwen3-next): Adds targeting of shared expert and attention modules (#3183)
* Adds targeting of shared expert and attention modules in each layer

* Update VRAM usage

---------

Co-authored-by: Mike Tung <mike@diffbot.com>
2025-09-25 17:06:16 +07:00