-
68237ea90a
Add extra note to Readme
NanoCode012
2023-05-21 23:09:45 +09:00
-
4ee79f2641
Fix typo
NanoCode012
2023-05-21 23:03:09 +09:00
-
2b436680a0
Add new config options to Readme
NanoCode012
2023-05-21 23:01:42 +09:00
-
04d281312c
Feat: Rewrite Readme
NanoCode012
2023-05-21 22:48:37 +09:00
-
-
7e81ca720b
Update requirements.txt
Wing Lian
2023-05-24 15:44:48 -04:00
-
3960936bf7
Merge pull request #37 from Thytu/main
Wing Lian
2023-05-24 15:42:41 -04:00
-
-
88ad05df54
fix: handles AutoTokenizer from untrusted source
Valentin De Matos
2023-05-24 20:57:10 +02:00
-
-
e8aacfbd7c
more qlora support
Wing Lian
2023-05-24 14:33:18 -04:00
-
b9d07aa95a
prepare does all this already for qlora?
Wing Lian
2023-05-23 11:33:41 -04:00
-
3b4d055edd
integrate qlora? maybe?
Wing Lian
2023-05-22 23:13:33 -04:00
-
-
2ae936fbc4
fix missing fp16 kwarg
Wing Lian
2023-05-23 20:44:24 -04:00
-
fb100a9ee1
fix enum pass as value
Wing Lian
2023-05-23 11:34:03 -04:00
-
3a503770e4
Add qa style data for alpaca instructions, fix one_cycle scheduler
Wing Lian
2023-05-22 22:58:10 -04:00
-
b029a11e65
Merge pull request #34 from OpenAccess-AI-Collective/dev-unstable
Wing Lian
2023-05-22 12:14:56 -04:00
-
-
e3df3a9f5d
cuda/pytorch matrix builds
Wing Lian
2023-05-22 12:14:21 -04:00
-
f950a881e1
cuda, pytorch matrix for base builds
Wing Lian
2023-05-21 08:38:00 -04:00
-
de6da13e19
don't need to set here
Wing Lian
2023-05-22 12:12:01 -04:00
-
9493b1b137
be able to use adam bnb 8bit and one cycle scheduler w fsdp
Wing Lian
2023-05-22 09:00:49 -04:00
-
1b3e401241
Update src/axolotl/utils/models.py for info msg
Wing Lian
2023-05-21 23:01:35 -04:00
-
3457810988
Update scripts/finetune.py
Wing Lian
2023-05-21 23:00:28 -04:00
-
ae1719d30c
Update scripts/finetune.py for logging
Wing Lian
2023-05-21 23:00:23 -04:00
-
98a6781f18
Update src/axolotl/utils/data.py for spelling
Wing Lian
2023-05-21 23:00:13 -04:00
-
607a4d33f2
make sure to use train split if loading from hf
Wing Lian
2023-05-21 22:04:39 -04:00
-
99383f14a3
make one cycle lr div factor configurable
Wing Lian
2023-05-21 20:25:06 -04:00
-
0f74464652
fix new dataset prompt tokenizers
Wing Lian
2023-05-21 18:57:09 -04:00
-
e0602a9e54
add missing __init__
Wing Lian
2023-05-21 16:36:41 -04:00
-
2809f3f21b
pygmalion dataset prompts format, cached tokenized datasets should be hashed on the tokenizer too
Wing Lian
2023-05-21 16:16:09 -04:00
-
4ea9a66dbd
tokenization fixes
Wing Lian
2023-05-21 08:33:06 -04:00
-
ed37b2268d
Merge pull request #32 from NanoCode012/patch-2
Wing Lian
2023-05-20 18:21:02 -04:00
-
-
1d5ab84486
optionally be able to specify alpaca or chat style prompts
Wing Lian
2023-05-20 18:16:22 -04:00
-
-
641f8012f9
Set half using cfg.fp16 for 4bit
NanoCode012
2023-05-20 02:29:31 +09:00
-
-
fa8bd14be4
update entrypoint and force min accelerate
Wing Lian
2023-05-18 06:25:34 -04:00
-
13650732f8
concise multiple choice and tldr summarize
Wing Lian
2023-05-17 11:29:17 -04:00
-
8c2f3cb0f8
support for replit lm
Wing Lian
2023-05-17 08:49:03 -04:00
-
b46bc02f0a
add alpaca multiple choice instruct dataset support
Wing Lian
2023-05-16 21:45:34 -04:00
-
e553c9080b
Merge pull request #29 from NanoCode012/patch-1
Wing Lian
2023-05-16 07:12:06 -04:00
-
-
2c73c81348
Add lora_modules_to_save
NanoCode012
2023-05-16 19:22:00 +09:00
-
-
f98e173b59
reorder options so debug can happen in the same prepare step
Wing Lian
2023-05-15 22:26:30 -04:00
-
5e37144754
fix prompters, especially the sharegpt prompter
Wing Lian
2023-05-15 22:15:36 -04:00
-
bdbca8fa6c
more fixes
Wing Lian
2023-05-15 14:07:17 -04:00
-
-
42410c783c
more fixes
Wing Lian
2023-05-14 09:16:41 -04:00
-
aef00b6c13
fix torch_dtype for model load
Wing Lian
2023-05-14 08:44:22 -04:00
-
0d28df0fd2
move filter to before saving so it doesn't happen everytime, update runpod manual script
Wing Lian
2023-05-13 21:51:41 -04:00
-
84c7bc4b68
whoops, gt vs lt
Wing Lian
2023-05-12 14:03:25 -04:00
-
aa3c3f97ae
optimize dataloading to use cache, fix model token embedding sizes
Wing Lian
2023-05-12 13:53:27 -04:00
-
f6d1fa4a85
Merge pull request #25 from NanoCode012/patch-2
Wing Lian
2023-05-11 09:20:15 -04:00
-
-
89b7f26b9d
Merge branch 'main' into patch-2
NanoCode012
2023-05-11 21:18:38 +09:00
-
-
-
-
165da584b3
fix config for parity with previous change
Wing Lian
2023-05-11 08:13:09 -04:00
-
4cc7ed8898
Merge pull request #27 from NanoCode012/patch-1
Wing Lian
2023-05-11 07:27:31 -04:00
-
-
52aada7174
Fix typo
NanoCode012
2023-05-11 20:22:30 +09:00
-
-
688c73a81e
Merge pull request #26 from OpenAccess-AI-Collective/mpt-triton
Wing Lian
2023-05-10 16:02:05 -04:00
-
-
2bc1a5bde1
black formatting
Wing Lian
2023-05-10 16:01:08 -04:00
-
7a490a4646
various fixes
Wing Lian
2023-05-10 16:00:09 -04:00
-
813aab378f
Fix Trainer() got multiple values for keyword argument 'callbacks'
NanoCode012
2023-05-10 18:28:28 +09:00
-
-
-
e2e68c3965
testing mpt triton
Wing Lian
2023-05-09 20:57:40 -04:00
-
-
a27d594788
fix conditional so alpaca doesn't choke
Wing Lian
2023-05-09 20:57:07 -04:00
-
1fb0376150
Merge pull request #23 from NanoCode012/patch-1
Wing Lian
2023-05-09 15:05:58 -04:00
-
-
915c56cd97
Update finetune.py
Wing Lian
2023-05-09 15:05:39 -04:00
-
df9c5085b5
not everyone has bf16 available
Wing Lian
2023-05-09 14:47:48 -04:00
-
7967cd1039
add 4bit lora 7b
Wing Lian
2023-05-09 14:38:32 -04:00
-
cd2395987e
Don't save full model for lora
NanoCode012
2023-05-10 03:18:38 +09:00
-
71a1f7f38c
Save adapter for lora
NanoCode012
2023-05-10 01:08:22 +09:00
-
-
02c59832a3
push up redpajama 3b example
Wing Lian
2023-05-08 19:18:33 -04:00
-
3f9c9530ea
Merge pull request #15 from NanoCode012/feat/completion
Wing Lian
2023-05-08 19:04:54 -04:00
-
-
174b74ddc9
Rename variable to use same convention
NanoCode012
2023-05-09 00:54:46 +09:00
-
cf681537ec
Add CompletionPrompt type
NanoCode012
2023-05-09 00:30:36 +09:00
-
-
bd3c5a5cb3
Merge pull request #21 from NanoCode012/patch-1
Wing Lian
2023-05-08 13:34:44 -04:00
-
-
bcbc99e655
Merge pull request #19 from NanoCode012/feat/callback-save-lora
Wing Lian
2023-05-08 13:34:07 -04:00
-
-
b0d2594de9
Merge pull request #22 from NanoCode012/patch-2
Wing Lian
2023-05-08 13:33:52 -04:00
-
-
fe582df7d3
Fix BNB OOM by pinning version
NanoCode012
2023-05-09 02:10:31 +09:00
-
36aaea02b9
Update trainer.py
NanoCode012
2023-05-09 02:01:08 +09:00
-
5b6690ac25
Fix condition scheduler
NanoCode012
2023-05-09 01:44:12 +09:00
-
-
-
a125693122
add support for trust_remote_code for mpt models
Wing Lian
2023-05-08 12:07:27 -04:00
-
709be5af81
use printf instead of echo in dockerfile for portability
Wing Lian
2023-05-08 11:45:38 -04:00
-
cc77bab526
Add callbacks to Trainer
NanoCode012
2023-05-09 00:41:19 +09:00
-
0d6708bfe4
Add callback save peft_model on_save
NanoCode012
2023-05-09 00:38:27 +09:00
-
-
807cca81c0
fix path name to sorkspace
Wing Lian
2023-05-08 11:20:03 -04:00
-
79deb35c68
setup runpod images
Wing Lian
2023-05-08 10:19:51 -04:00
-
-
7576d85c73
fix to cd to path in docker
Wing Lian
2023-05-08 03:43:46 -04:00
-
3b4b476828
use existing state of repo to build, not the checkout
Wing Lian
2023-05-08 03:29:48 -04:00
-
b5fe063687
fix base for dockerfile
Wing Lian
2023-05-08 03:25:17 -04:00
-
a12fb0a8da
Jeopardy bot! (#17)
Wing Lian
2023-05-08 03:21:40 -04:00
-
a4329b1068
fix #16 load best model setting when using 8bit
Wing Lian
2023-05-07 18:30:48 -04:00
-
550502b321
use micro batch size for eval size if not specified
Wing Lian
2023-05-07 18:26:05 -04:00
-
fae36c7111
blah, wrong base tag
Wing Lian
2023-05-07 17:54:26 -04:00
-
a31746baa2
whoops, build from base image
Wing Lian
2023-05-07 17:47:54 -04:00
-
17345c8a4b
hanging slash typo
Wing Lian
2023-05-07 17:38:56 -04:00
-
9cd5d3fcfc
build on self hosted GPU runners
Wing Lian
2023-05-07 17:25:31 -04:00
-
990bec63e6
docker layer caching, build w axolotl from base build
Wing Lian
2023-05-07 17:16:05 -04:00
-
0c46806ae2
typo in git repo for pip
Wing Lian
2023-05-07 16:00:21 -04:00
-
66fa751c18
add huggingface packages and awscli
Wing Lian
2023-05-07 11:51:57 -04:00
-
21b74397de
fix typo and add apex
Wing Lian
2023-05-07 11:48:47 -04:00
-
3f11b47488
needs libaio-dev from apt
Wing Lian
2023-05-07 11:23:43 -04:00
-
ece46b2504
pip install packaging dep
Wing Lian
2023-05-07 11:09:03 -04:00
-
92d800a394
build dependencies and aws-cli
Wing Lian
2023-05-07 11:02:26 -04:00
-
2734e3f1a2
build base separately
Wing Lian
2023-05-07 10:26:29 -04:00
-
14ebd2e007
build base too
Wing Lian
2023-05-07 09:48:41 -04:00
-
4a79dabff0
fix push to docker hub
Wing Lian
2023-05-07 08:52:49 -04:00
-
47ad3890bc
fix whitespace and instruction on inference
Wing Lian
2023-05-07 08:28:15 -04:00
-
76b24bca2e
push to docker hub
Wing Lian
2023-05-07 07:57:36 -04:00