Commit Graph

403 Commits

Author SHA1 Message Date
Wing Lian
c4cf567b55 Merge branch 'main' into quadratic-warmup 2023-07-10 12:42:12 -04:00
Wing Lian
c49729d2bc better configuration for quadratic warmup 2023-07-10 11:52:59 -04:00
Wing Lian
19cf0bda99 params are adam_*, not adamw_* 2023-07-08 12:13:39 -04:00
Wing Lian
d69da99c2c skip explicit model type too if using trust_remote_code 2023-07-07 21:33:11 -04:00
Wing Lian
66afb76a15 don't use llama if trust_remote_code is set since that needs to use AutoModel path 2023-07-07 21:31:02 -04:00
Wing Lian
b9b7d4ce92 Merge pull request #221 from utensil/local_dataset
[WIP] Support loading data files from a local directory
2023-07-03 09:10:13 -04:00
NanoCode012
e79c8e617e Fix future deprecation push_to_hub_model_id 2023-07-03 12:44:29 +09:00
Wing Lian
1e5014acec Merge pull request #255 from OpenAccess-AI-Collective/open-orca-prompts
open orca support
2023-07-01 01:11:23 -04:00
Wing Lian
4066c78631 Merge pull request #246 from OpenAccess-AI-Collective/sys-prompts-instruct
add option for instruct w sys prompts
2023-07-01 00:27:29 -04:00
Wing Lian
78a1e1fa12 open orca support 2023-07-01 00:19:41 -04:00
NanoCode012
77bdb7d144 Fix typing list 2023-06-29 14:29:55 +09:00
Wing Lian
924bbfddec add option for instruct w sys prompts 2023-06-28 22:27:17 -04:00
Wing Lian
f150c027e3 Merge pull request #224 from OpenAccess-AI-Collective/system-prompt-data
System prompt data
2023-06-27 17:57:43 -04:00
Wing Lian
612aabd8c4 push intermediate model checkpoints to hub 2023-06-27 15:40:25 -04:00
Wing Lian
05ab9092e3 skip the system prompt 2023-06-25 22:40:50 -04:00
Wing Lian
7b57ed7618 pylint for duplicated code for system prompts 2023-06-25 22:28:07 -04:00
Wing Lian
3a38271276 add tests and support for loader for sys prompt data 2023-06-25 22:28:07 -04:00
Wing Lian
8d20e0a3d3 initial wip to get sys prompt from dataset 2023-06-25 22:28:07 -04:00
Wing Lian
47d601fa23 optionally define whether to use_fast tokenizer 2023-06-25 10:19:49 -04:00
Utensil
9bdd30cdfd Support loading data files from a local directory
ref: https://huggingface.co/docs/datasets/v2.13.0/en/package_reference/loading_methods#datasets.load_dataset.path
2023-06-21 08:00:58 +00:00
Wing Lian
cb9d3af5c0 add validation and tests for adamw hyperparam 2023-06-15 09:39:42 -04:00
Wing Lian
6d0ee4ba34 support adamw and grad norm hyperparams 2023-06-15 08:40:41 -04:00
Wing Lian
a81f52d575 Merge pull request #212 from OpenAccess-AI-Collective/doc-20230615-v1
add float16 docs and tweak typehints
2023-06-15 08:28:57 -04:00
Wing Lian
1925eaf1e6 Merge pull request #214 from OpenAccess-AI-Collective/fix-tokenizing-labels
Fix tokenizing labels
2023-06-15 08:13:43 -04:00
Wing Lian
88e17ffc50 add float16 docs and tweak typehints 2023-06-15 02:05:31 -04:00
Wing Lian
7925ddce86 bugfix for potential off by one 2023-06-15 01:59:33 -04:00
maciej.karasek
136522f9c9 style correction 2023-06-14 20:02:09 +02:00
maciej.karasek
556fe408b3 issue #205 bugfix 2023-06-14 16:59:57 +02:00
Wing Lian
4b43a66a0b update alpaca_chat prompts for instructions to explain the conversation 2023-06-12 18:38:38 -04:00
Wing Lian
7dc580b837 add axolotl trainer and quadratic warmup 2023-06-12 13:16:40 -04:00
Wing Lian
fd2c9814c9 Merge branch 'main' into flash-optimum 2023-06-12 13:12:15 -04:00
Wing Lian
93dacba228 Merge pull request #187 from OpenAccess-AI-Collective/strip-peft-device-map
peft no longer needs device_map
2023-06-12 09:10:49 -04:00
Wing Lian
8002ffb41f Merge pull request #177 from NanoCode012/fix/landmark-patch
Fix landmark attention patch
2023-06-12 08:27:12 -04:00
Wing Lian
74ef5cc083 Merge pull request #192 from OpenAccess-AI-Collective/sharegpt-custom-prompt
misc fixes
2023-06-12 08:26:38 -04:00
Wing Lian
5e616d91c0 Merge branch 'main' into strip-peft-device-map 2023-06-12 08:25:54 -04:00
NanoCode012
8e568bbdae Merge pull request #159 from AngainorDev/patch-1
Fix training over existing lora
2023-06-12 20:27:11 +09:00
Wing Lian
c7dee56b87 add typehints 2023-06-11 19:52:34 -04:00
Wing Lian
aac4b7691e add new sharegpt, refactor prompt so it can be customized later, add exception if no data is processed 2023-06-11 19:42:25 -04:00
Wing Lian
c9a149f9e8 add check for attr 2023-06-11 10:11:17 -04:00
Wing Lian
14668fa54e new validation for mpt w grad checkpoints 2023-06-11 09:26:10 -04:00
AngainorDev
b565ecf0a1 Fix strict and Lint 2023-06-11 15:23:38 +02:00
Wing Lian
fe0b76854e match up gradient checkpointing when using lora w config 2023-06-11 09:20:40 -04:00
NanoCode012
974dc00a7d Fix set mem_id for inference and refactor 2023-06-11 14:00:54 +09:00
NanoCode012
a6190c8094 Clean up landmark patching 2023-06-11 11:59:03 +09:00
NanoCode012
563b6d89e6 Fix undefined LlamaForCausalLM and del try except 2023-06-11 11:58:31 +09:00
Wing Lian
cd0a6f6027 peft no longer needs device_map 2023-06-10 22:50:09 -04:00
NanoCode012
e285e24f7f Address PR suggestion
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-06-11 10:52:12 +09:00
NanoCode012
919727b4d7 Refactor landmark attention patch 2023-06-11 10:51:05 +09:00
Wing Lian
958da70376 fix formatting 2023-06-10 15:28:08 -04:00
Angainor Development
a808bf913f Fix missing cfg. 2023-06-10 20:28:49 +02:00