Wing Lian
8002ffb41f
Merge pull request #177 from NanoCode012/fix/landmark-patch
...
Fix landmark attention patch
2023-06-12 08:27:12 -04:00
Wing Lian
74ef5cc083
Merge pull request #192 from OpenAccess-AI-Collective/sharegpt-custom-prompt
...
misc fixes
2023-06-12 08:26:38 -04:00
Wing Lian
5e616d91c0
Merge branch 'main' into strip-peft-device-map
2023-06-12 08:25:54 -04:00
NanoCode012
8e568bbdae
Merge pull request #159 from AngainorDev/patch-1
...
Fix training over existing lora
2023-06-12 20:27:11 +09:00
Wing Lian
c7dee56b87
add typehints
2023-06-11 19:52:34 -04:00
Wing Lian
aac4b7691e
add new sharegpt, refactor prompt so it can be customized later, add exception if no data is processed
2023-06-11 19:42:25 -04:00
Wing Lian
c9a149f9e8
add check for attr
2023-06-11 10:11:17 -04:00
Wing Lian
14668fa54e
new validation for mpt w grad checkpoints
2023-06-11 09:26:10 -04:00
AngainorDev
b565ecf0a1
Fix strict and Lint
2023-06-11 15:23:38 +02:00
Wing Lian
fe0b76854e
match up gradient checkpointing when using lora w config
2023-06-11 09:20:40 -04:00
NanoCode012
974dc00a7d
Fix set mem_id for inference and refactor
2023-06-11 14:00:54 +09:00
NanoCode012
a6190c8094
Clean up landmark patching
2023-06-11 11:59:03 +09:00
NanoCode012
563b6d89e6
Fix undefined LlamaForCausalLM and del try except
2023-06-11 11:58:31 +09:00
Wing Lian
cd0a6f6027
peft no longer needs device_map
2023-06-10 22:50:09 -04:00
NanoCode012
e285e24f7f
Address PR suggestion
...
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-06-11 10:52:12 +09:00
NanoCode012
919727b4d7
Refactor landmark attention patch
2023-06-11 10:51:05 +09:00
Wing Lian
958da70376
fix formatting
2023-06-10 15:28:08 -04:00
Angainor Development
a808bf913f
Fix missing cfg.
2023-06-10 20:28:49 +02:00
Wing Lian
01248253a3
Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref
...
fix for local variable 'LlamaForCausalLM' referenced before assignment
2023-06-10 14:25:51 -04:00
Wing Lian
0c6f928601
address PR feedback
2023-06-10 14:23:56 -04:00
Wing Lian
eea2731a5e
add streaming dataset support for pretraining datasets
2023-06-10 14:23:56 -04:00
Wing Lian
ab5cd28acf
more gpt-neox long ctx fixes
2023-06-10 14:23:55 -04:00
Wing Lian
1a82082e91
fix bettertransformers save, force it to skip after saving correctly in callback
2023-06-10 14:23:55 -04:00
Wing Lian
1210dc8fd5
more tweaks to do pre-training with bettertransformers
2023-06-10 14:23:55 -04:00
Wing Lian
488a67d75a
experimental expansion of ctx len
2023-06-10 14:23:53 -04:00
Wing Lian
71a43f8479
add validation/warning for bettertransformers and torch version
2023-06-10 14:22:31 -04:00
Wing Lian
1edc30c786
add support for optimum bettertransformers
2023-06-10 14:22:30 -04:00
Wing Lian
14163c15d9
fix for local variable 'LlamaForCausalLM' referenced before assignment
2023-06-10 14:11:13 -04:00
Angainor Development
79e2a6f140
Merge branch 'main' into patch-1
2023-06-10 19:07:54 +02:00
Wing Lian
a03a7d7d8b
add support to extend context with xpos rope
2023-06-10 10:29:46 -04:00
Wing Lian
7f09106437
fix for max sequence len across different model types
2023-06-09 20:42:33 -04:00
NanoCode012
aefb2fc681
Fix backward compat for peft
2023-06-10 07:46:36 +09:00
Angainor Development
813cfa4c14
WIP: Rely on cfg.inference
2023-06-09 08:49:32 +02:00
NanoCode012
2a801b001a
Fix grad checkpoint and outputs param
2023-06-09 14:28:44 +09:00
NanoCode012
e44c9e0b3e
Fix patching via import instead of hijacking
2023-06-09 14:27:24 +09:00
NanoCode012
55b8542de8
Feat: Add landmark attention
2023-06-09 12:54:08 +09:00
Bruno Cabral
f4df266842
Disable Wandb
2023-06-08 21:02:02 -03:00
NanoCode012
2ef4634d45
Refactor out unmodified save_steps and eval_steps
2023-06-09 01:23:13 +09:00
NanoCode012
2cfe9e9b16
Set to use cfg.seed or 42 for backward compat
2023-06-09 01:02:36 +09:00
NanoCode012
bfd27ba55e
Fix failing test
2023-06-09 00:35:03 +09:00
NanoCode012
babf0fdb71
Validate falcon with fsdp
2023-06-09 00:29:04 +09:00
NanoCode012
df9528f865
Fix future deprecation of prepare_model_for_int8_training
2023-06-08 21:42:10 +09:00
Angainor Development
193c73bce0
Fix training over existing lora
...
When training with LoRA and starting from existing LoRA weights, the current code produces a model with 0 trainable params, so training can't work.
Adding the "is_trainable" param allows the loaded PEFT model to be trained and fixes the bug.
2023-06-08 09:18:58 +02:00
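A minimal sketch of the fix described in this commit, assuming peft's PeftModel API; base_model and the adapter directory path are hypothetical placeholders, not values from the commit:

    # Load existing LoRA adapter weights for continued training.
    # Without is_trainable=True, peft loads the adapter in inference mode
    # and the resulting model reports 0 trainable parameters.
    from peft import PeftModel

    model = PeftModel.from_pretrained(
        base_model,                  # assumed: an already-loaded base transformers model
        "path/to/existing/lora",     # hypothetical adapter directory
        is_trainable=True,           # the fix: keep adapter params trainable
    )
    model.print_trainable_parameters()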
Wing Lian
59bb2197ed
fix camel ai, add guanaco/oasst mapping for sharegpt
2023-06-07 09:51:29 -04:00
Wing Lian
4ac9e251b7
new prompters, misc fixes for output dir missing using fsdp, and changing max seq len
2023-06-05 22:41:00 -04:00
NanoCode012
3c71c8debe
Update doc for grad_accu and add validation tests for batch size
2023-06-01 06:13:47 +09:00
Wing Lian
5a631b305b
fix batch size calculation
2023-05-31 14:11:32 -04:00
Wing Lian
9b8585dc70
fix packing so that concatenated sequences reset the attention
2023-05-31 11:38:52 -04:00
Wing Lian
2d0ba3b818
Merge pull request #124 from OpenAccess-AI-Collective/xformers-fix
...
copy xformers attn from ooba since we removed dep on alpaca_lora_4bit
2023-05-31 00:11:40 -04:00
Wing Lian
c7021e191f
Merge pull request #120 from OpenAccess-AI-Collective/model-from-path
...
split up llama model loading so config can be loaded from base config and models can be loaded from a path
2023-05-31 00:08:38 -04:00