Maxime
5d931cc042
Only run tests when a change to python files is made ( #614 )
...
* Update tests.yml
* Update .github/workflows/tests.yml
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-09-20 22:02:04 -04:00
Javier
ec0958f4f8
Update requirements.txt ( #610 )
2023-09-20 08:40:49 -04:00
Wing Lian
faecff9798
support to disable exllama for gptq ( #604 )
...
* support to disable exllama for gptq
* update property instead of item
* fix config key
2023-09-19 17:51:08 -04:00
bofeng huang
aa656e04bd
Delete duplicate lines ( #606 )
2023-09-19 16:40:05 -04:00
Wing Lian
b53e77775b
update dockerfile to not build evoformer since it fails the build ( #607 )
2023-09-19 16:28:29 -04:00
Wing Lian
674c57692d
more sane defaults for openllama 3b used for quickstarts ( #602 )
...
* more sane defaults for openllama 3b used for quickstarts
* don't use bf16 for quickstart to simplify gpu compatibility
* use the updated openlm-research/open_llama_3b_v2 models
2023-09-19 09:15:10 -04:00
Wing Lian
1eebbd09c3
improve handling for empty text on the tokenization step ( #502 )
2023-09-19 08:09:56 -04:00
Wing Lian
62a774140b
Fix for check with cfg and merge_lora ( #600 )
2023-09-18 21:14:32 -04:00
Wing Lian
31b9e0c6e8
minor tweaks to simplify ( #597 )
2023-09-18 11:45:44 -04:00
Wing Lian
6b9b229356
btlm and falcon monkey patches for flash attn ( #566 )
2023-09-17 13:49:18 -04:00
Wing Lian
131afdbd89
add bf16 check ( #587 )
2023-09-17 13:49:03 -04:00
NanoCode012
00dce35fb2
Feat(data): Allow loading local csv and text ( #594 )
...
* Feat(data): Allow loading local csv and text
* chore: update readme for loading data
2023-09-17 11:32:27 -04:00
Wing Lian
b15b19eb8d
gather/broadcast the max value of the packing efficiency automatically ( #463 )
2023-09-17 11:08:18 -04:00
Wing Lian
ab534d75ba
don't add position_ids for evals ( #591 )
2023-09-16 16:11:57 -04:00
Wing Lian
21ec195c9f
optionally configure sample packing for evals ( #589 )
2023-09-16 00:09:48 -04:00
Wing Lian
62eaee7649
make phi training work with Loras ( #588 )
...
* validation for phi loras
* fix model config class check
* update readme for phi training
2023-09-15 20:51:55 -04:00
Jan Philipp Harries
be75668400
set fsdp state dict ( #584 )
...
Co-authored-by: Jan Philipp Harries <jphme@users.noreply.github.com>
2023-09-15 17:47:36 -04:00
Wing Lian
aeec7c4688
pop block_cls since it's not an actual kwarg
2023-09-15 15:54:06 -04:00
Wing Lian
360788296a
don't resize embeddings if it's already large enough ( #577 )
...
* don't resize embeddings if it's already large enough
* make sure to tie weights, even if we aren't resizing
2023-09-15 15:47:09 -04:00
Wing Lian
12a2dbbc2c
Support Sample packing for phi arch ( #586 )
...
* phi sequence packing
* sample packing fixes
* fix linting
* fix inference and phi e2e tests
* update phi example now that sample packing works
* wandb import keeps getting moved around
2023-09-15 15:46:54 -04:00
NanoCode012
3a2edc85c3
Feat(doc): Add features to doc ( #583 )
2023-09-16 01:14:15 +09:00
Wing Lian
f7a22632d7
support custom field for completion from yml ( #580 )
...
* support custom field for completion from yml
* remove legacy completion check and add doc
* update README docs
2023-09-15 07:48:21 -04:00
Doan Minh Phuong
1aa400721e
Fix Codellama examples ( #582 )
...
* Fix seq_len
* Update lora.yml
* Update qlora.yml
* Update lora.yml
* Update lora.yml
* Update qlora.yml
2023-09-15 04:19:13 -04:00
Wing Lian
8dcd40ac78
prevent cli functions from getting fired on import ( #581 )
2023-09-15 04:03:32 -04:00
Wing Lian
a5a625f47e
update support matrix with btlm and phi ( #579 )
2023-09-15 02:46:15 -04:00
Wing Lian
861cecac2a
refactor scripts/finetune.py into new cli modules ( #550 )
...
* refactor scripts/finetune.py into new cli modules
* continue to support scripts/finetune.py
* update readme with updated cli commands
* Update scripts/finetune.py
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
---------
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-09-15 01:43:52 -04:00
Wing Lian
1078d3eae7
E2e passing tests ( #576 )
...
* run e2e tests after all other checks have passed
* tweak tests so they get run on PRs or push to main
* change dependent action for checking
* one test workflow to rule them all
* no need for custom action, just use needs
* whoops, python version should be a string
* e2e tests can run on any available gpu
2023-09-15 01:03:49 -04:00
Wing Lian
24146733db
E2e device cuda ( #575 )
...
* use torch.cuda.current_device() instead of local_rank
* ignore NVML errors for gpu stats
* llama lora packing e2e tests
2023-09-14 22:49:27 -04:00
Wing Lian
9218ebecd2
e2e testing ( #574 )
2023-09-14 21:56:11 -04:00
Wing Lian
228420972e
Phi examples ( #569 )
...
* add phi full ft example
* Add readme to point out that deepspeed should be used
* zero1 is better than zero2 for phi
2023-09-14 11:17:47 -04:00
Wing Lian
c6d870b91d
mypy wandb ignore ( #572 )
...
* mypy wandb ignore
* fix isort for wandb
2023-09-14 11:17:30 -04:00
Wing Lian
115795079d
remove columns after tokenizing for pretraining ( #571 )
2023-09-14 11:08:22 -04:00
Wing Lian
3b18c963cc
set auto for other params that hf trainer sets for ds. include zero1 json ( #570 )
2023-09-14 11:04:37 -04:00
Wing Lian
3fbde762ab
fix save_steps so it doesn't get duplicated ( #567 )
2023-09-13 20:40:33 -04:00
Wing Lian
f6060a664e
Model parallel ( #538 )
...
* model-parallel for single process
* fix device/device_map
* fix handling for device
2023-09-13 11:45:30 -04:00
Wing Lian
a4e1bb6606
let hf trainer handle torch compile ( #516 )
...
* let hf trainer handle torch compile
* remove torch compile checks, include option for backend
* suppress torch errors to get further
* require min torch version of 2.1.0 for torch compile to work
---------
Co-authored-by: Aman Karmani <aman@tmm1.net>
2023-09-13 11:42:12 -04:00
Wing Lian
36e53c7442
improve how we setup eval/save strategies and steps ( #547 )
...
* set up save and eval strategies to be consistent with trainer logic
* add comments
* better eval handling
2023-09-13 11:37:23 -04:00
Wing Lian
e7aa7b1a1e
gracefully handle length feature used for group by ( #565 )
2023-09-13 11:23:30 -04:00
Wing Lian
e5bb22a56b
add optimization for group-by-len ( #563 )
2023-09-13 10:57:12 -04:00
Wing Lian
fdb777bc06
check for the existence of the default accelerate config that can create headaches ( #561 )
2023-09-13 10:38:28 -04:00
Wing Lian
bf0804447c
fix wandb so mypy doesn't complain ( #562 )
...
* fix wandb so mypy doesn't complain
* fix wandb so mypy doesn't complain
* no need for mypy override anymore
2023-09-13 10:36:16 -04:00
Glavin Wiechert
5b67ea98a6
Add training callback to send predictions to WandB table ( #521 )
...
* WIP Add training callback to send predictions to WandB table
* WIP improve wandb table reporting callback
* WIP improve wandb table reporting callback (cont)
* Add VSCode launching for debugging
* Add tiny llama example
* WIP attempt to improve post-eval prediction generation for table
* WIP attempt to improve post-eval prediction generation for table - part 2
* WIP batch generation
* WIP attempt to handle sample_packing using position_ids for wandb prediction table
* WIP add code for debugging
* Fix sample_packing support for wandb prediction table
* Clean up code for PR review
* Add eval_table_size, eval_table_max_new_tokens configs & clean up code
* Clean up PR, delete VSCode config, add tiny-llama example
* Add eval_table_size, eval_table_max_new_tokens documentation. Fix linting/formatting
2023-09-13 09:51:08 -04:00
Jan Philipp Harries
2f586d18db
Fix pretraining with iterable/streaming Dataset ( #556 )
...
* return without packing prep/len
* fix remove columns
* fix encode arguments
* add error when max steps not set
* fix test
---------
Co-authored-by: Jan Philipp Harries <jphme@users.noreply.github.com>
2023-09-13 00:16:40 -04:00
Wing Lian
9845c5e12d
document that packaging needs to be installed before flash-attn ( #559 )
2023-09-12 12:18:30 -04:00
Wing Lian
772cd870d4
fix the sed command to replace the version w the tag
v0.3.0
2023-09-11 13:44:19 -04:00
Wing Lian
6c5fbe6223
add long_description for pypi push ( #555 )
2023-09-11 13:34:29 -04:00
Wing Lian
bcbc9597e9
replace tags, build dist for pypi publish ( #553 )
...
* replace tags, build dist for pypi publish
* missing trailing comma
2023-09-11 13:25:41 -04:00
The Objective Dad
6d57f2f0f0
ergonomic update to optimizer config doc ( #548 )
2023-09-11 12:35:45 -04:00
Wing Lian
20ed4c1f9e
pypi on tag push ( #552 )
2023-09-11 10:33:42 -04:00
Wing Lian
c5dedb17ad
remove with section, doesn't seem to work ( #551 )
2023-09-11 10:27:17 -04:00