Files

Wing Lian 732851f105 Phi2 rewrite (#1058)

* restore to current phi modeling code from phi-2

* enable gradient checkpointing

* don't cast everything to float32 all the time

* gradient checkpointing for phi2 ParallelBlock module too

* fix enabling flash attn for phi2

* add comment about import

* fix phi2 example

* fix model type check for tokenizer

* revert float32 -> bf16 casting changes

* support fused dense flash attn

* fix the repo for flash-attn

* add package name for subdir pkg

* fix the data collator when not using sample packing

* install packaging for pytests in ci

* also fix setup to not install flash attn fused dense subdir if not extras

* split out the fused-dense-lib in extra requires

* don't train w group_by_length for phi

* update integration test to use phi2

* set max steps and save steps for phi e2e tests

* try to work around save issue in ci

* skip phi2 e2e test for now

2024-01-08 14:04:22 -05:00

phi2-ft.yml

Phi2 rewrite (#1058)

2024-01-08 14:04:22 -05:00

phi-ft.yml

new evals_per_epoch and saves_per_epoch to make things cleaner (#944)

2023-12-12 15:35:23 -05:00

phi-qlora.yml

new evals_per_epoch and saves_per_epoch to make things cleaner (#944)

2023-12-12 15:35:23 -05:00

README.md

make phi training work with Loras (#588)

2023-09-15 20:51:55 -04:00

README.md

Phi

Due to some nuances in the phi modeling code, use DeepSpeed when running a full fine-tune of phi. The QLoRA config can be launched without it.

accelerate launch -m axolotl.cli.train examples/phi/phi-ft.yml --deepspeed deepspeed/zero1.json

# OR

python -m axolotl.cli.train examples/phi/phi-qlora.yml
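
The choice between the two commands above can be sketched as a small launcher helper. This is only an illustration: `build_train_command` and `DEEPSPEED_CONFIG` are hypothetical names, not part of axolotl, and the helper does nothing beyond assembling the two command lines shown in this README. It assumes the paths are relative to the repository root, as in the commands above.

```python
import shlex

# Hypothetical helper, not part of axolotl: it only assembles the two
# command lines shown in the README above.
DEEPSPEED_CONFIG = "deepspeed/zero1.json"  # path copied from the README command


def build_train_command(config_path: str, full_finetune: bool) -> list[str]:
    """Return the argv list for launching training on a phi config."""
    if full_finetune:
        # Full fine-tune: launch through accelerate and pass a deepspeed config.
        return [
            "accelerate", "launch", "-m", "axolotl.cli.train",
            config_path, "--deepspeed", DEEPSPEED_CONFIG,
        ]
    # QLoRA: the README runs the module directly, without deepspeed.
    return ["python", "-m", "axolotl.cli.train", config_path]


if __name__ == "__main__":
    # Print both command lines so they can be inspected or pasted into a shell.
    print(shlex.join(build_train_command("examples/phi/phi-ft.yml", True)))
    print(shlex.join(build_train_command("examples/phi/phi-qlora.yml", False)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` if the launch should happen from Python rather than from a shell.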