* restore to current phi modeling code from phi-2 * enable gradient checkpointing * don't cast everything to float32 all the time * gradient checkpointing for phi2 ParallelBlock module too * fix enabling flash attn for phi2 * add comment about import * fix phi2 example * fix model type check for tokenizer * revert float32 -> bf16 casting changes * support fused dense flash attn * fix the repo for flash-attn * add package name for subdir pkg * fix the data collator when not using sample packing * install packaging for pytests in ci * also fix setup to not install flash attn fused dense subdir if not extras * split out the fused-dense-lib in extra requires * don't train w group_by_length for phi * update integration test to use phi2 * set max steps and save steps for phi e2e tests * try to workaround ssave issue in ci * skip phi2 e2e test for now
Phi
Due to some nuances with the phi code, please use deepspeed when training phi for full finetune.
accelerate launch -m axolotl.cli.train examples/phi/phi-ft.yml --deepspeed deepspeed/zero1.json
# OR
python -m axolotl.cli.train examples/phi/phi-qlora.yml