Wing Lian
dcdec44347
Merge pull request #306 from ethanhs/xgen
...
Add XGen info to README and example config
2023-07-22 04:10:18 -04:00
Wing Lian
1066751358
don't resize embeddings to multiples of 32x by default
2023-07-22 01:52:38 -04:00
Ethan Smith
38811434e6
Add XGen info to README and example config
2023-07-21 00:44:50 -07:00
NanoCode012
165907fddb
Fix(readme): Improve wording for push model
2023-07-21 11:28:35 +09:00
NanoCode012
b64f411849
fix(readme): remove accelerate config
2023-07-18 01:31:02 +09:00
Wing Lian
469c08c9ba
Merge pull request #279 from NanoCode012/feat/multi-gpu-readme
...
Feat(readme): improve docs on multi-gpu
2023-07-16 16:08:37 -04:00
Charles Goddard
3cdd8e4122
Add dataset name to all yaml options in README
2023-07-15 13:17:37 -07:00
NanoCode012
cf5ae6b649
Feat(readme): improve docs on multi-gpu
2023-07-16 01:07:27 +09:00
Charles Goddard
46032a1a1f
Fix formatting mistake
2023-07-14 20:57:27 -07:00
Charles Goddard
8bba64258e
Add example of dataset with configuration name to README
2023-07-14 20:46:21 -07:00
NanoCode012
231031a0e1
Merge pull request #275 from NanoCode012/feat/safetensors
...
Feat: Add save_safetensors
2023-07-14 23:07:26 +09:00
NanoCode012
5491278a79
Feat: Add save_safetensors
2023-07-14 13:21:47 +09:00
NanoCode012
896c1aebcf
Feat(docs): Add model_revision arg
2023-07-14 12:56:07 +09:00
NanoCode012
41da98b982
Fix for linter
2023-07-06 23:20:11 +09:00
NanoCode012
9e64f42e0f
Fix local path loading and custom strategy type
2023-07-06 23:08:09 +09:00
NanoCode012
e79c8e617e
Fix future deprecation push_to_hub_model_id
2023-07-03 12:44:29 +09:00
Wing Lian
78a1e1fa12
open orca support
2023-07-01 00:19:41 -04:00
NanoCode012
c146880a75
Update README.md
2023-06-30 11:33:53 +09:00
Wing Lian
47d601fa23
optionally define whether to use_fast tokenizer
2023-06-25 10:19:49 -04:00
Wing Lian
c969f0a9dc
add docs
2023-06-15 08:43:20 -04:00
Wing Lian
d7635b7148
hint to what AMP means
2023-06-15 02:06:27 -04:00
Wing Lian
88e17ffc50
add float16 docs and tweak typehints
2023-06-15 02:05:31 -04:00
Wing Lian
16bb6276a5
Merge pull request #92 from OpenAccess-AI-Collective/flash-optimum
...
add support for optimum bettertransformers
2023-06-14 07:50:15 -04:00
NanoCode012
3513885f43
Fix sharegpt type
2023-06-14 01:10:58 +09:00
PocketDoc Labs
5ff547dc70
Update README.md to include a community showcase
2023-06-12 22:38:10 -07:00
mhenrichsen
34ae69989f
fix inference
2023-06-12 21:39:19 +02:00
Wing Lian
fd2c9814c9
Merge branch 'main' into flash-optimum
2023-06-12 13:12:15 -04:00
Wing Lian
74ef5cc083
Merge pull request #192 from OpenAccess-AI-Collective/sharegpt-custom-prompt
...
misc fixes
2023-06-12 08:26:38 -04:00
NanoCode012
52cde69288
Fix config path after config moved
2023-06-12 17:06:15 +09:00
Wing Lian
aac4b7691e
add new sharegpt, refactor prompt so it can be customized later, add exception if no data is processed
2023-06-11 19:42:25 -04:00
NanoCode012
4cd1deeef2
Add save_steps and eval_steps to Readme
2023-06-12 02:44:46 +09:00
Wing Lian
336aa3fd48
gptq lora llama is obviously good
2023-06-11 11:05:29 -04:00
Wing Lian
d0d7eaa4f3
update openllama and clean up paths
2023-06-11 11:03:31 -04:00
Wing Lian
a6ebf57e82
fix table formatting
2023-06-11 10:55:32 -04:00
Wing Lian
280832cec2
more matrix updates
2023-06-11 10:52:36 -04:00
Wing Lian
a43bae9ff0
update the support matrix
2023-06-11 10:44:03 -04:00
Wing Lian
c4e4f8115c
pass a prompt in from stdin for inference
2023-06-10 15:07:40 -04:00
Wing Lian
eea2731a5e
add streaming dataset support for pretraining datasets
2023-06-10 14:23:56 -04:00
Wing Lian
5878bb1f3a
add option to readme
2023-06-10 11:57:41 -04:00
PocketDocLabs
16f9e28048
Update README.md to reflect current gradient checkpointing support
...
Previously, the readme stated that gradient checkpointing was incompatible with 4-bit LoRA in the current implementation; however, this is no longer the case. I have replaced the warning with a link to the Hugging Face documentation on gradient checkpointing.
2023-06-09 16:10:58 -07:00
NanoCode012
b5aa8d854c
Merge pull request #169 from NanoCode012/feat/landmark
...
Feat: Add landmark attention
2023-06-10 07:26:06 +09:00
NanoCode012
b242b69e10
Fix falcon support lora
2023-06-09 17:50:16 +09:00
NanoCode012
2e13ceff37
Improve lambda labs instruction
2023-06-09 15:03:08 +09:00
NanoCode012
55b8542de8
Feat: Add landmark attention
2023-06-09 12:54:08 +09:00
NanoCode012
c8242de725
Merge pull request #132 from utensil/falcon-7b-qlora
...
Axolotl supports falcon + qlora
2023-06-09 01:14:03 +09:00
NanoCode012
f8d379883d
Merge pull request #162 from NanoCode012/fix/custom-prompt-readme
...
Fix: Move custom prompts out of hidden
2023-06-08 23:05:17 +09:00
NanoCode012
2097a09d2d
Move custom prompts out of hidden
2023-06-08 22:53:56 +09:00
NanoCode012
cfff94b123
Add peft install for quickstart
2023-06-08 22:50:20 +09:00
NanoCode012
2b222de5b6
Update peft and gptq instruction
2023-06-08 22:48:26 +09:00
Wing Lian
5b33e295bd
update docs
2023-06-05 22:48:16 -04:00