Phi2 rewrite (#1058)
* restore to current phi modeling code from phi-2 * enable gradient checkpointing * don't cast everything to float32 all the time * gradient checkpointing for phi2 ParallelBlock module too * fix enabling flash attn for phi2 * add comment about import * fix phi2 example * fix model type check for tokenizer * revert float32 -> bf16 casting changes * support fused dense flash attn * fix the repo for flash-attn * add package name for subdir pkg * fix the data collator when not using sample packing * install packaging for pytests in ci * also fix setup to not install flash attn fused dense subdir if not extras * split out the fused-dense-lib in extra requires * don't train w group_by_length for phi * update integration test to use phi2 * set max steps and save steps for phi e2e tests * try to workaround ssave issue in ci * skip phi2 e2e test for now
This commit is contained in:
4
setup.py
4
setup.py
@@ -17,6 +17,7 @@ def parse_requirements():
|
||||
_dependency_links.append(url)
|
||||
elif (
|
||||
"flash-attn" not in line
|
||||
and "flash-attention" not in line
|
||||
and "deepspeed" not in line
|
||||
and line
|
||||
and line[0] != "#"
|
||||
@@ -51,6 +52,9 @@ setup(
|
||||
"flash-attn": [
|
||||
"flash-attn==2.3.3",
|
||||
],
|
||||
"fused-dense-lib": [
|
||||
"fused-dense-lib @ git+https://github.com/Dao-AILab/flash-attention@v2.3.3#subdirectory=csrc/fused_dense_lib",
|
||||
],
|
||||
"deepspeed": [
|
||||
"deepspeed",
|
||||
],
|
||||
|
||||
Reference in New Issue
Block a user