Mixtral multipack (#928)
* mixtral multipack
* use mixtral model
* sample yml
* calculate cu_seqlens properly
* use updated flash attention setting
* attn var checks
* force use of flash attention 2 for packing
* lint
* disable future fix for now
* update support table