Wing Lian | 8d20e0a3d3 | initial wip to get sys prompt from dataset | 2023-06-25 22:28:07 -04:00
Wing Lian | 47d601fa23 | optionally define whether to use_fast tokenizer | 2023-06-25 10:19:49 -04:00
Wing Lian | cb9d3af5c0 | add validation and tests for adamw hyperparam | 2023-06-15 09:39:42 -04:00
Wing Lian | 6d0ee4ba34 | support adamw and grad norm hyperparams | 2023-06-15 08:40:41 -04:00
Wing Lian | a81f52d575 | Merge pull request #212 from OpenAccess-AI-Collective/doc-20230615-v1 (add float16 docs and tweak typehints) | 2023-06-15 08:28:57 -04:00
Wing Lian | 1925eaf1e6 | Merge pull request #214 from OpenAccess-AI-Collective/fix-tokenizing-labels (Fix tokenizing labels) | 2023-06-15 08:13:43 -04:00
Wing Lian | 88e17ffc50 | add float16 docs and tweak typehints | 2023-06-15 02:05:31 -04:00
Wing Lian | 7925ddce86 | bugfix for potential off by one | 2023-06-15 01:59:33 -04:00
maciej.karasek | 136522f9c9 | style correction | 2023-06-14 20:02:09 +02:00
maciej.karasek | 556fe408b3 | issue #205 bugfix | 2023-06-14 16:59:57 +02:00
Wing Lian | 4b43a66a0b | update alpaca_chat prompts for instructions to explain the conversation | 2023-06-12 18:38:38 -04:00
Wing Lian | fd2c9814c9 | Merge branch 'main' into flash-optimum | 2023-06-12 13:12:15 -04:00
Wing Lian | 93dacba228 | Merge pull request #187 from OpenAccess-AI-Collective/strip-peft-device-map (peft no longer needs device_map) | 2023-06-12 09:10:49 -04:00
Wing Lian | 8002ffb41f | Merge pull request #177 from NanoCode012/fix/landmark-patch (Fix landmark attention patch) | 2023-06-12 08:27:12 -04:00
Wing Lian | 74ef5cc083 | Merge pull request #192 from OpenAccess-AI-Collective/sharegpt-custom-prompt (misc fixes) | 2023-06-12 08:26:38 -04:00
Wing Lian | 5e616d91c0 | Merge branch 'main' into strip-peft-device-map | 2023-06-12 08:25:54 -04:00
NanoCode012 | 8e568bbdae | Merge pull request #159 from AngainorDev/patch-1 (Fix training over existing lora) | 2023-06-12 20:27:11 +09:00
Wing Lian | c7dee56b87 | add typehints | 2023-06-11 19:52:34 -04:00
Wing Lian | aac4b7691e | add new sharegpt, refactor prompt so it can be customized later, add exception if no data is processed | 2023-06-11 19:42:25 -04:00
Wing Lian | c9a149f9e8 | add check for attr | 2023-06-11 10:11:17 -04:00
Wing Lian | 14668fa54e | new validation for mpt w grad checkpoints | 2023-06-11 09:26:10 -04:00
AngainorDev | b565ecf0a1 | Fix strict and Lint | 2023-06-11 15:23:38 +02:00
Wing Lian | fe0b76854e | match up gradient checkpointing when using lora w config | 2023-06-11 09:20:40 -04:00
NanoCode012 | 974dc00a7d | Fix set mem_id for inference and refactor | 2023-06-11 14:00:54 +09:00
NanoCode012 | a6190c8094 | Clean up landmark patching | 2023-06-11 11:59:03 +09:00
NanoCode012 | 563b6d89e6 | Fix undefined LlamaForCausalLM and del try except | 2023-06-11 11:58:31 +09:00
Wing Lian | cd0a6f6027 | peft no longer needs device_map | 2023-06-10 22:50:09 -04:00
NanoCode012 | e285e24f7f | Address PR suggestion (Co-authored-by: Wing Lian <wing.lian@gmail.com>) | 2023-06-11 10:52:12 +09:00
NanoCode012 | 919727b4d7 | Refactor landmark attention patch | 2023-06-11 10:51:05 +09:00
Wing Lian | 958da70376 | fix formatting | 2023-06-10 15:28:08 -04:00
Angainor Development | a808bf913f | Fix missing cfg. | 2023-06-10 20:28:49 +02:00
Wing Lian | 01248253a3 | Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref (fix for local variable 'LlamaForCausalLM' referenced before assignment) | 2023-06-10 14:25:51 -04:00
Wing Lian | 0c6f928601 | address PR feedback | 2023-06-10 14:23:56 -04:00
Wing Lian | eea2731a5e | add streaming dataset support for pretraining datasets | 2023-06-10 14:23:56 -04:00
Wing Lian | ab5cd28acf | more gpt-neox long ctx fixes | 2023-06-10 14:23:55 -04:00
Wing Lian | 1a82082e91 | fix bettertransformers save, force it to skip after saving correctly in callback | 2023-06-10 14:23:55 -04:00
Wing Lian | 1210dc8fd5 | more tweaks to do pre-training with bettertransformers | 2023-06-10 14:23:55 -04:00
Wing Lian | 488a67d75a | experimental expansion of ctx len | 2023-06-10 14:23:53 -04:00
Wing Lian | 71a43f8479 | add validation/warning for bettertransformers and torch version | 2023-06-10 14:22:31 -04:00
Wing Lian | 1edc30c786 | add support for optimum bettertransformers | 2023-06-10 14:22:30 -04:00
Wing Lian | 14163c15d9 | fix for local variable 'LlamaForCausalLM' referenced before assignment | 2023-06-10 14:11:13 -04:00
Angainor Development | 79e2a6f140 | Merge branch 'main' into patch-1 | 2023-06-10 19:07:54 +02:00
Wing Lian | a03a7d7d8b | add support to extend context with xpos rope | 2023-06-10 10:29:46 -04:00
Wing Lian | 7f09106437 | fix for max sequence len across different model types | 2023-06-09 20:42:33 -04:00
NanoCode012 | aefb2fc681 | Fix backward compat for peft | 2023-06-10 07:46:36 +09:00
Angainor Development | 813cfa4c14 | WIP: Rely on cfg.inference | 2023-06-09 08:49:32 +02:00
NanoCode012 | 2a801b001a | Fix grad checkpoint and outputs param | 2023-06-09 14:28:44 +09:00
NanoCode012 | e44c9e0b3e | Fix patching via import instead of hijacking | 2023-06-09 14:27:24 +09:00
NanoCode012 | 55b8542de8 | Feat: Add landmark attention | 2023-06-09 12:54:08 +09:00
Bruno Cabral | f4df266842 | Disable Wandb | 2023-06-08 21:02:02 -03:00