Llama4 linearized (#2502)
* llama4 support for linearized experts
* clean up fsdp2 sharding to prevent hang
* add yaml config
* cleanup example [skip ci]
@@ -4,3 +4,5 @@ mypy
types-requests
quartodoc
jupyter
blobfile
tiktoken