Commit Graph

74 Commits

Author SHA1 Message Date
Wing Lian
97cf77891e run this on self hosted runner for now
fix typo
fixes to docker build
need pip wheel
don't duplicate pip install
2023-05-07 07:21:25 -04:00
Wing Lian
e2599edab9 runs on larger git runner? 2023-05-07 04:12:47 -04:00
Wing Lian
75bc8561c0 don't push the image 2023-05-07 03:39:05 -04:00
Wing Lian
15bdbae805 run on git commit 2023-05-07 03:37:59 -04:00
Wing Lian
6603b3744e try docker build on gitlab
require docker in gitlab
use kaniko to build docker in gitlab
2023-05-07 03:21:08 -04:00
Wing Lian
2634689774 build dockerfile in gha 2023-05-07 02:58:21 -04:00
Wing Lian
4818380fa6 update stablelm config 2023-05-07 01:58:23 -04:00
Wing Lian
247825bd57 refactor inference, warn if model is frozen 2023-05-07 01:54:15 -04:00
Wing Lian
cb9a887047 Merge pull request #13 from winglian/dev
merge dev branch for various fixes
2023-05-07 01:48:02 -04:00
Wing Lian
a15d823b29 Merge pull request #12 from NanoCode012/feat/eval_config
Add eval_batch_size for evaluation
2023-05-07 01:46:53 -04:00
NanoCode012
0e74b6402e Add eval_batch_size for evaluation 2023-05-06 22:21:24 +09:00
Wing Lian
a10a8265ef fix log sweep lr 2023-05-03 15:06:03 -04:00
Wing Lian
9105935b00 support for multi line inference input, log sweep over learning rates 2023-05-03 13:48:54 -04:00
Wing Lian
7748f3d6da fix adam bnb optimizer grouped parameters, fix peft model 8bit conversion logic, black formatting 2023-05-01 16:31:46 -04:00
Wing Lian
fe9c29d73e install peft from main branch 2023-05-01 12:24:04 -04:00
Wing Lian
2255bb7f4f support llama-adapter zero init attention 2023-05-01 10:42:21 -04:00
Wing Lian
55baef0e03 use prebuilt wheels for flash-attn and deepspeed 2023-05-01 09:52:03 -04:00
Wing Lian
ad2b48c0fa fsdp config dict fix, todo list, add torchdistx support 2023-04-30 13:32:07 -04:00
Wing Lian
9190ada23a 8bit and deepspeed changes 2023-04-30 06:50:35 -04:00
Wing Lian
4dbef0941f update ds_config 2023-04-30 04:24:58 -04:00
Wing Lian
6dfdd2dec0 don't load models in 8bit unless they are using an adapter, also fix tokenizer load in exceptional case 2023-04-30 03:19:56 -04:00
Wing Lian
29936bba7f fix fsdp training args 2023-04-30 00:56:28 -04:00
Wing Lian
78821815de fix for zero value warmup steps 2023-04-30 00:34:12 -04:00
Wing Lian
5159d00a86 fix sharegpt tokenization, refactor tokenization debugging 2023-04-30 00:23:53 -04:00
Wing Lian
c0f50d9c61 wire up gradient checkpointing for 4bit 2023-04-28 22:28:41 -04:00
Wing Lian
4e705eda6d Merge pull request #9 from winglian/dev
feature dump into main
2023-04-24 21:56:17 -04:00
Wing Lian
4a17a4c9a1 fix dataset handling, support galactica 2023-04-24 10:54:45 -04:00
Wing Lian
097d367af6 tweaks to data loading, 8 bit adam, accelerate and deepspeed 2023-04-24 09:41:35 -04:00
Wing Lian
4f2584f2dc shuffle and split dataset after save/load 2023-04-24 09:41:35 -04:00
Wing Lian
8d437853c8 fix sharegpt handling from hf, don't worry about loading llama if using earlier transformers release 2023-04-24 09:41:35 -04:00
Wing Lian
8e2a5609b3 stablelm support 2023-04-24 09:41:34 -04:00
Wing Lian
94f5e415a3 various bugfixes 2023-04-24 09:41:34 -04:00
Eric Hartford
2624bc2f11 ignore config, add python 3.9 (#8) 2023-04-24 07:23:19 -04:00
Wing Lian
bb991fd870 fix bug when model_type not explicitly passed 2023-04-19 13:15:33 -04:00
Wing Lian
d65385912e improve inference 2023-04-19 12:57:27 -04:00
Wing Lian
5749eb0a1c fix runpod script 2023-04-19 08:39:54 -04:00
Wing Lian
7753cdee57 cleanup empty lines, tweak env for runpod setup 2023-04-19 08:24:58 -04:00
Wing Lian
f50de1b1cb handle empty lines 2023-04-19 08:03:34 -04:00
Wing Lian
0a472e1e08 quickstart instructions for starting from runpod (#5) 2023-04-18 19:22:25 -04:00
Wing Lian
5cb7ea49a6 update readme w compat matrix 2023-04-18 14:42:37 -04:00
Wing Lian
8746b701fe attempt xformers hijack attention 2023-04-18 14:03:50 -04:00
Wing Lian
6045345d6b WIP large refactor to make finetune script a little more manageable (#3) 2023-04-18 14:01:38 -04:00
Wing Lian
81de0efc18 add support for alpaca reflect training (#2) 2023-04-18 08:34:05 -04:00
Wing Lian
34af1b465f update readme 2023-04-18 01:58:32 -04:00
Wing Lian
87d7825435 Tokenization open assistant (#1)
* refactor prompt tokenization to more easily support open assistant

* add open assistant handling, more logging, black formatting
2023-04-18 01:45:49 -04:00
Wing Lian
eb808903e5 fix llama check 2023-04-18 01:19:53 -04:00
Wing Lian
3f3f561c06 update readme 2023-04-18 00:45:25 -04:00
Wing Lian
8f36f3cd5a fix conditional check to prevent always using 4bit 2023-04-18 00:36:03 -04:00
Wing Lian
69164da079 improve llama check and fix safetensors file check 2023-04-17 23:49:21 -04:00
Wing Lian
e1076430ff support for alpaca-like instruction datasets without inputs 2023-04-17 23:32:57 -04:00