* Add s2_attn to hijack flash code
* Refactor code to account for s2_attn
* Add test for models utils
* Add ``s2_attention`` option to llama configs
* Add ``s2_attention`` option to README config
* Format code to appease linter
* chore: lint
* Remove xpos and llama-landmark [bad merge]
* add e2e smoke tests for shifted sparse attention
* remove stray patch from merge
* update yml with link to paper for s2_attention/longlora
* fix assertion check for full fine tune
* increase sequence len for tests and PR feedback updates
* reduce context len to 16k for tests
* reduce batch size for larger context len and update test to check message
* fix test for message

---------

Co-authored-by: joecummings <jrcummings@devvm050.nha0.facebook.com>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
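The ``s2_attention`` flag added above enables LongLoRA-style shifted sparse attention. As a minimal sketch of how it might be set in a llama-family YAML config, with every key other than `s2_attention` an illustrative assumption rather than the actual example config:

```yaml
# Sketch only: enabling shifted sparse attention (S^2-Attn) in a llama config.
# All keys besides `s2_attention` are assumptions, not the shipped config.
base_model: openlm-research/open_llama_3b_v2  # assumed base model id
sequence_len: 16384       # S^2-Attn targets long contexts (16k is used in tests)
flash_attention: true     # assumption: per the commits, s2_attn patches the flash path
s2_attention: true        # the option introduced by this change
```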
# openllama-3b

Basic full tune

```shell
accelerate launch scripts/finetune.py examples/openllama-3b/config.yml
```

LoRA

```shell
accelerate launch scripts/finetune.py examples/openllama-3b/lora.yml
```

QLoRA

```shell
accelerate launch scripts/finetune.py examples/openllama-3b/qlora.yml
```
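The three commands differ only in the config file they load. As a rough, hypothetical illustration of the kind of keys that distinguish them, not the actual file contents, the adapter configs add a LoRA/QLoRA section on top of the base model, while `config.yml` leaves the adapter unset for a full fine-tune:

```yaml
# Illustrative sketch of keys that typically distinguish lora.yml from
# config.yml; every value here is an assumption, not the real file.
base_model: openlm-research/open_llama_3b_v2  # assumed base model id
adapter: lora        # `qlora` in qlora.yml; omitted in config.yml (full tune)
load_in_8bit: true   # qlora.yml would load the base weights in 4-bit instead
lora_r: 8            # assumed adapter rank
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj
```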