Wing Lian
|
c4cf567b55
|
Merge branch 'main' into quadratic-warmup
|
2023-07-10 12:42:12 -04:00 |
|
Wing Lian
|
c49729d2bc
|
better configuration for quadratic warmup
|
2023-07-10 11:52:59 -04:00 |
|
Wing Lian
|
19cf0bda99
|
params are adam_*, not adamw_*
|
2023-07-08 12:13:39 -04:00 |
|
Wing Lian
|
d69da99c2c
|
skip explicit model type too if using trust_remote_code
|
2023-07-07 21:33:11 -04:00 |
|
Wing Lian
|
66afb76a15
|
don't use llama if trust_remote_code is set since that needs to use AutoModel path
|
2023-07-07 21:31:02 -04:00 |
|
Wing Lian
|
b9b7d4ce92
|
Merge pull request #221 from utensil/local_dataset
[WIP] Support loading data files from a local directory
|
2023-07-03 09:10:13 -04:00 |
|
NanoCode012
|
e79c8e617e
|
Fix future deprecation push_to_hub_model_id
|
2023-07-03 12:44:29 +09:00 |
|
Wing Lian
|
1e5014acec
|
Merge pull request #255 from OpenAccess-AI-Collective/open-orca-prompts
open orca support
|
2023-07-01 01:11:23 -04:00 |
|
Wing Lian
|
4066c78631
|
Merge pull request #246 from OpenAccess-AI-Collective/sys-prompts-instruct
add option for instruct w sys prompts
|
2023-07-01 00:27:29 -04:00 |
|
Wing Lian
|
78a1e1fa12
|
open orca support
|
2023-07-01 00:19:41 -04:00 |
|
NanoCode012
|
77bdb7d144
|
Fix typing list
|
2023-06-29 14:29:55 +09:00 |
|
Wing Lian
|
924bbfddec
|
add option for instruct w sys prompts
|
2023-06-28 22:27:17 -04:00 |
|
Wing Lian
|
f150c027e3
|
Merge pull request #224 from OpenAccess-AI-Collective/system-prompt-data
System prompt data
|
2023-06-27 17:57:43 -04:00 |
|
Wing Lian
|
612aabd8c4
|
push intermediate model checkpoints to hub
|
2023-06-27 15:40:25 -04:00 |
|
Wing Lian
|
05ab9092e3
|
skip the system prompt
|
2023-06-25 22:40:50 -04:00 |
|
Wing Lian
|
7b57ed7618
|
pylint for duplicated code for system prompts
|
2023-06-25 22:28:07 -04:00 |
|
Wing Lian
|
3a38271276
|
add tests and support for loader for sys prompt data
|
2023-06-25 22:28:07 -04:00 |
|
Wing Lian
|
8d20e0a3d3
|
initial wip to get sys prompt from dataset
|
2023-06-25 22:28:07 -04:00 |
|
Wing Lian
|
47d601fa23
|
optionally define whether to use_fast tokenizer
|
2023-06-25 10:19:49 -04:00 |
|
Utensil
|
9bdd30cdfd
|
Support loading data files from a local directory
ref: https://huggingface.co/docs/datasets/v2.13.0/en/package_reference/loading_methods#datasets.load_dataset.path
|
2023-06-21 08:00:58 +00:00 |
|
Wing Lian
|
cb9d3af5c0
|
add validation and tests for adamw hyperparam
|
2023-06-15 09:39:42 -04:00 |
|
Wing Lian
|
6d0ee4ba34
|
support adamw and grad norm hyperparams
|
2023-06-15 08:40:41 -04:00 |
|
Wing Lian
|
a81f52d575
|
Merge pull request #212 from OpenAccess-AI-Collective/doc-20230615-v1
add float16 docs and tweak typehints
|
2023-06-15 08:28:57 -04:00 |
|
Wing Lian
|
1925eaf1e6
|
Merge pull request #214 from OpenAccess-AI-Collective/fix-tokenizing-labels
Fix tokenizing labels
|
2023-06-15 08:13:43 -04:00 |
|
Wing Lian
|
88e17ffc50
|
add float16 docs and tweak typehints
|
2023-06-15 02:05:31 -04:00 |
|
Wing Lian
|
7925ddce86
|
bugfix for potential off by one
|
2023-06-15 01:59:33 -04:00 |
|
maciej.karasek
|
136522f9c9
|
style correction
|
2023-06-14 20:02:09 +02:00 |
|
maciej.karasek
|
556fe408b3
|
issue #205 bugfix
|
2023-06-14 16:59:57 +02:00 |
|
Wing Lian
|
4b43a66a0b
|
update alpaca_chat prompts for instructions to explain the conversation
|
2023-06-12 18:38:38 -04:00 |
|
Wing Lian
|
7dc580b837
|
add axolotl trainer and quadratic warmup
|
2023-06-12 13:16:40 -04:00 |
|
Wing Lian
|
fd2c9814c9
|
Merge branch 'main' into flash-optimum
|
2023-06-12 13:12:15 -04:00 |
|
Wing Lian
|
93dacba228
|
Merge pull request #187 from OpenAccess-AI-Collective/strip-peft-device-map
peft no longer needs device_map
|
2023-06-12 09:10:49 -04:00 |
|
Wing Lian
|
8002ffb41f
|
Merge pull request #177 from NanoCode012/fix/landmark-patch
Fix landmark attention patch
|
2023-06-12 08:27:12 -04:00 |
|
Wing Lian
|
74ef5cc083
|
Merge pull request #192 from OpenAccess-AI-Collective/sharegpt-custom-prompt
misc fixes
|
2023-06-12 08:26:38 -04:00 |
|
Wing Lian
|
5e616d91c0
|
Merge branch 'main' into strip-peft-device-map
|
2023-06-12 08:25:54 -04:00 |
|
NanoCode012
|
8e568bbdae
|
Merge pull request #159 from AngainorDev/patch-1
Fix training over existing lora
|
2023-06-12 20:27:11 +09:00 |
|
Wing Lian
|
c7dee56b87
|
add typehints
|
2023-06-11 19:52:34 -04:00 |
|
Wing Lian
|
aac4b7691e
|
add new sharegpt, refactor prompt so it can be customized later, add exception if no data is processed
|
2023-06-11 19:42:25 -04:00 |
|
Wing Lian
|
c9a149f9e8
|
add check for attr
|
2023-06-11 10:11:17 -04:00 |
|
Wing Lian
|
14668fa54e
|
new validation for mpt w grad checkpoints
|
2023-06-11 09:26:10 -04:00 |
|
AngainorDev
|
b565ecf0a1
|
Fix strict and Lint
|
2023-06-11 15:23:38 +02:00 |
|
Wing Lian
|
fe0b76854e
|
match up gradient checkpointing when using lora w config
|
2023-06-11 09:20:40 -04:00 |
|
NanoCode012
|
974dc00a7d
|
Fix set mem_id for inference and refactor
|
2023-06-11 14:00:54 +09:00 |
|
NanoCode012
|
a6190c8094
|
Clean up landmark patching
|
2023-06-11 11:59:03 +09:00 |
|
NanoCode012
|
563b6d89e6
|
Fix undefined LlamaForCausalLM and del try except
|
2023-06-11 11:58:31 +09:00 |
|
Wing Lian
|
cd0a6f6027
|
peft no longer needs device_map
|
2023-06-10 22:50:09 -04:00 |
|
NanoCode012
|
e285e24f7f
|
Address PR suggestion
Co-authored-by: Wing Lian <wing.lian@gmail.com>
|
2023-06-11 10:52:12 +09:00 |
|
NanoCode012
|
919727b4d7
|
Refactor landmark attention patch
|
2023-06-11 10:51:05 +09:00 |
|
Wing Lian
|
958da70376
|
fix formatting
|
2023-06-10 15:28:08 -04:00 |
|
Angainor Development
|
a808bf913f
|
Fix missing cfg.
|
2023-06-10 20:28:49 +02:00 |
|