Phi2 rewrite (#1058)

* restore to current phi modeling code from phi-2
* enable gradient checkpointing
* don't cast everything to float32 all the time
* gradient checkpointing for phi2 ParallelBlock module too
* fix enabling flash attn for phi2
* add comment about import
* fix phi2 example
* fix model type check for tokenizer
* revert float32 -> bf16 casting changes
* support fused dense flash attn
* fix the repo for flash-attn
* add package name for subdir pkg
* fix the data collator when not using sample packing
* install packaging for pytests in ci
* also fix setup to not install flash attn fused dense subdir if not extras
* split out the fused-dense-lib in extra requires
* don't train w group_by_length for phi
* update integration test to use phi2
* set max steps and save steps for phi e2e tests
* try to work around save issue in ci
* skip phi2 e2e test for now
2024-01-08 14:04:22 -05:00
parent 9ca358b671
commit 732851f105
7 changed files with 230 additions and 99 deletions
--- a/setup.py
+++ b/setup.py
@@ -17,6 +17,7 @@ def parse_requirements():
                _dependency_links.append(url)
            elif (
                "flash-attn" not in line
+                and "flash-attention" not in line
                and "deepspeed" not in line
                and line
                and line[0] != "#"
@@ -51,6 +52,9 @@ setup(
        "flash-attn": [
            "flash-attn==2.3.3",
        ],
+        "fused-dense-lib": [
+            "fused-dense-lib  @ git+https://github.com/Dao-AILab/flash-attention@v2.3.3#subdirectory=csrc/fused_dense_lib",
+        ],
        "deepspeed": [
            "deepspeed",
        ],
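
A minimal sketch of why the extra `"flash-attention" not in line` check in the first hunk matters. The `fused-dense-lib` requirement points at the `flash-attention` repository, and the substring `flash-attn` does not occur in `flash-attention`, so the older condition alone would not have excluded that line from the base install requirements. The helper name `filter_install_requires` and the sample requirement lines are assumptions made for illustration; only the filter condition itself is taken from the diff.

```python
def filter_install_requires(lines):
    """Hypothetical helper that applies the filter condition shown in the diff."""
    kept = []
    for raw in lines:
        line = raw.strip()
        if (
            "flash-attn" not in line
            and "flash-attention" not in line  # check added in this commit
            and "deepspeed" not in line
            and line  # skip empty lines
            and line[0] != "#"  # skip comment lines
        ):
            kept.append(line)
    return kept


# Illustrative input only; not the project's actual requirements file.
sample = [
    "# comment",
    "",
    "torch",
    "flash-attn==2.3.3",
    "fused-dense-lib  @ git+https://github.com/Dao-AILab/flash-attention@v2.3.3#subdirectory=csrc/fused_dense_lib",
    "deepspeed",
    "packaging",
]

print(filter_install_requires(sample))  # ['torch', 'packaging']
```

With the second hunk, the optional package would then be requested explicitly through its extra, for example `pip install -e ".[fused-dense-lib]"`, assuming an editable install from the repository root.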