Commit Graph

5 Commits

Author SHA1 Message Date
Sunny Liu
f62e23737b memoize dataset length for eval sample packing (#1974)
* wip on multimodal sample packing support

* wip on multimodal packing support

* llama-1b-yml

* setup logging for test

* yml

* yml

* yml

* fix for __len__ for eval sample packing

* reverted irrelavant changes

* reformatted, reverted log message

* reverted unnecessary changes

* added e2e multigpu testing for eval sample packing

* formatting

* fixed e2e test_eval params

* fix test_eval e2e multigpu

* fix test_eval e2e multigpu

* Update tests/e2e/multigpu/test_eval.py

Co-authored-by: Wing Lian <wing.lian@gmail.com>

* Update tests/e2e/multigpu/test_eval.py

Co-authored-by: Wing Lian <wing.lian@gmail.com>

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-10-17 15:15:29 -04:00
Wing Lian
ec4272c3a0 add ds zero3 to multigpu biweekly tests (#1900)
* add ds zero3 to multigpu biweekly tests

* fix for upstream api change

* use updated accelerate and fix deepspeed tests

* stringify the Path, and run multigpu tests if the multigpu tests change for a PR

* use correct json rather than yaml

* revert accelerate for deepspeed
2024-10-13 17:34:37 -04:00
Wing Lian
3853ab7ae9 bump accelerate to 0.34.2 (#1901)
* bump accelerate

* add fixture to predownload the test model

* change fixture
2024-09-07 14:39:31 -04:00
Wing Lian
5b0b774e38 ensure that the bias is also in the correct dtype (#1848) [skip ci]
* ensure that the bias is also in the correct dtype

* add nightly for dpo-qlora-fsdp
2024-08-22 11:45:00 -04:00
Wing Lian
54392ac8a6 Attempt to run multigpu in PR CI for now to ensure it works (#1815) [skip ci]
* Attempt to run multigpu in PR CI for now to ensure it works

* fix yaml file

* forgot to include multigpu tests

* fix call to cicd.multigpu

* dump dictdefault to dict for yaml conversion

* use to_dict instead of casting

* 16bit-lora w flash attention, 8bit lora seems problematic

* add llama fsdp test

* more tests

* Add test for qlora + fsdp with prequant

* limit accelerate to 2 processes and disable broken qlora+fsdp+bnb test

* move multigpu tests to biweekly
2024-08-09 11:50:13 -04:00