Commit Graph

98 Commits

Author SHA1 Message Date
NanoCode012
fe582df7d3 Fix BNB OOM by pinning version 2023-05-09 02:10:31 +09:00
Wing Lian
7576d85c73 fix to cd to path in docker 2023-05-08 03:43:46 -04:00
Wing Lian
3b4b476828 use existing state of repo to build, not the checkout 2023-05-08 03:29:48 -04:00
Wing Lian
b5fe063687 fix base for dockerfile 2023-05-08 03:27:10 -04:00
Wing Lian
a12fb0a8da Jeopardy bot! (#17)
* support for jeopardy dataset

* commit the final config for jeopardy bot
2023-05-08 03:21:40 -04:00
Wing Lian
a4329b1068 fix #16 load best model setting when using 8bit 2023-05-07 18:30:48 -04:00
Wing Lian
550502b321 use micro batch size for eval size if not specified 2023-05-07 18:26:05 -04:00
Wing Lian
fae36c7111 blah, wrong base tag 2023-05-07 17:54:26 -04:00
Wing Lian
a31746baa2 whoops, build from base image 2023-05-07 17:47:54 -04:00
Wing Lian
17345c8a4b hanging slash typo 2023-05-07 17:38:56 -04:00
Wing Lian
9cd5d3fcfc build on self hosted GPU runners 2023-05-07 17:25:31 -04:00
Wing Lian
990bec63e6 docker layer caching, build w axolotl from base build 2023-05-07 17:16:05 -04:00
Wing Lian
0c46806ae2 typo in git repo for pip 2023-05-07 16:00:21 -04:00
Wing Lian
66fa751c18 add huggingface packages and awscli 2023-05-07 11:51:57 -04:00
Wing Lian
21b74397de fix typo and add apex 2023-05-07 11:48:47 -04:00
Wing Lian
3f11b47488 needs libaio-dev from apt 2023-05-07 11:23:43 -04:00
Wing Lian
ece46b2504 pip install packaging dep 2023-05-07 11:09:03 -04:00
Wing Lian
92d800a394 build dependencies and aws-cli 2023-05-07 11:02:26 -04:00
Wing Lian
2734e3f1a2 build base separately
fix arg order for image
fix dockerfile var escaping
move args around
2023-05-07 10:56:12 -04:00
Wing Lian
14ebd2e007 build base too 2023-05-07 09:48:41 -04:00
Wing Lian
4a79dabff0 fix push to docker hub 2023-05-07 08:52:49 -04:00
Wing Lian
47ad3890bc fix whitespace and instruction on inference 2023-05-07 08:28:15 -04:00
Wing Lian
76b24bca2e push to docker hub
set docker image name
2023-05-07 08:06:50 -04:00
Wing Lian
73450d9de7 TORCH_CUDA_ARCH_LIST should be an ARG 2023-05-07 07:28:57 -04:00
Wing Lian
97cf77891e run this on self hosted runner for now
fix typo
fixes to docker build
need pip wheel
don't duplicate pip install
2023-05-07 07:21:25 -04:00
Wing Lian
e2599edab9 runs on larger git runner? 2023-05-07 04:12:47 -04:00
Wing Lian
75bc8561c0 don't push the image 2023-05-07 03:39:05 -04:00
Wing Lian
15bdbae805 run on git commit 2023-05-07 03:37:59 -04:00
Wing Lian
6603b3744e try docker build on gitlab
require docker in gitlab
use kaniko to build docker in gitlab
2023-05-07 03:21:08 -04:00
Wing Lian
2634689774 build dockerfile in gha 2023-05-07 02:58:21 -04:00
Wing Lian
4818380fa6 update stablelm config 2023-05-07 01:58:23 -04:00
Wing Lian
247825bd57 refactor inference, warn if model is frozen 2023-05-07 01:54:15 -04:00
Wing Lian
cb9a887047 Merge pull request #13 from winglian/dev
merge dev branch for various fixes
2023-05-07 01:48:02 -04:00
Wing Lian
a15d823b29 Merge pull request #12 from NanoCode012/feat/eval_config
Add eval_batch_size for evaluation
2023-05-07 01:46:53 -04:00
NanoCode012
0e74b6402e Add eval_batch_size for evaluation 2023-05-06 22:21:24 +09:00
Wing Lian
a10a8265ef fix log sweep lr 2023-05-03 15:06:03 -04:00
Wing Lian
9105935b00 support for multi line inference input, log sweep over learning rates 2023-05-03 13:48:54 -04:00
Wing Lian
7748f3d6da fix adam bnb optimizer grouped parameters, fix peft model 8bit conversion logic, black formatting 2023-05-01 16:31:46 -04:00
Wing Lian
fe9c29d73e install peft from main branch 2023-05-01 12:24:04 -04:00
Wing Lian
2255bb7f4f support llama-adapter zero init attention 2023-05-01 10:42:21 -04:00
Wing Lian
55baef0e03 use prebuilt wheels for flash-attn and deepspeed 2023-05-01 09:52:03 -04:00
Wing Lian
ad2b48c0fa fsdp config dict fix, todo list, add torchdistx support 2023-04-30 13:32:07 -04:00
Wing Lian
9190ada23a 8bit and deepspeed changes 2023-04-30 06:50:35 -04:00
Wing Lian
4dbef0941f update ds_config 2023-04-30 04:24:58 -04:00
Wing Lian
6dfdd2dec0 don't load models in 8bit unless they are using an adapter, also fix tokenizer load in exceptional case 2023-04-30 03:19:56 -04:00
Wing Lian
29936bba7f fix fsdp training args 2023-04-30 00:56:28 -04:00
Wing Lian
78821815de fix for zero value warmup steps 2023-04-30 00:34:12 -04:00
Wing Lian
5159d00a86 fix sharegpt tokenization, refactor tokenization debugging 2023-04-30 00:23:53 -04:00
Wing Lian
c0f50d9c61 wire up gradient checkpointing for 4bit 2023-04-28 22:28:41 -04:00
Wing Lian
4e705eda6d Merge pull request #9 from winglian/dev
feature dump into main
2023-04-24 21:56:17 -04:00