Wing Lian
01248253a3
Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref
...
fix for local variable 'LlamaForCausalLM' referenced before assignment
2023-06-10 14:25:51 -04:00
Wing Lian
0c6f928601
address PR feedback
2023-06-10 14:23:56 -04:00
Wing Lian
eea2731a5e
add streaming dataset support for pretraining datasets
2023-06-10 14:23:56 -04:00
Wing Lian
ab5cd28acf
more gpt-neox long ctx fixes
2023-06-10 14:23:55 -04:00
Wing Lian
1a82082e91
fix bettertransformers save, force it to skip after saving correctly in callback
2023-06-10 14:23:55 -04:00
Wing Lian
1210dc8fd5
more tweaks to do pre-training with bettertransformers
2023-06-10 14:23:55 -04:00
Wing Lian
488a67d75a
experimental expansion of ctx len
2023-06-10 14:23:53 -04:00
Wing Lian
71a43f8479
add validation/warning for bettertransformers and torch version
2023-06-10 14:22:31 -04:00
Wing Lian
1edc30c786
add support for optimum bettertransformers
2023-06-10 14:22:30 -04:00
Wing Lian
14163c15d9
fix for local variable 'LlamaForCausalLM' referenced before assignment
2023-06-10 14:11:13 -04:00
Angainor Development
79e2a6f140
Merge branch 'main' into patch-1
2023-06-10 19:07:54 +02:00
Wing Lian
a03a7d7d8b
add support to extend context with xpos rope
2023-06-10 10:29:46 -04:00
Wing Lian
7f09106437
fix for max sequence len across different model types
2023-06-09 20:42:33 -04:00
NanoCode012
aefb2fc681
Fix backward compat for peft
2023-06-10 07:46:36 +09:00
Angainor Development
813cfa4c14
WIP: Rely on cfg.inference
2023-06-09 08:49:32 +02:00
NanoCode012
2a801b001a
Fix grad checkpoint and outputs param
2023-06-09 14:28:44 +09:00
NanoCode012
e44c9e0b3e
Fix patching via import instead of hijacking
2023-06-09 14:27:24 +09:00
NanoCode012
55b8542de8
Feat: Add landmark attention
2023-06-09 12:54:08 +09:00
Bruno Cabral
f4df266842
Disable Wandb
2023-06-08 21:02:02 -03:00
NanoCode012
2ef4634d45
Refactor out unmodified save_steps and eval_steps
2023-06-09 01:23:13 +09:00
NanoCode012
2cfe9e9b16
Set to use cfg.seed or 42 for backward compat
2023-06-09 01:02:36 +09:00
NanoCode012
bfd27ba55e
Fix failing test
2023-06-09 00:35:03 +09:00
NanoCode012
babf0fdb71
Validate falcon with fsdp
2023-06-09 00:29:04 +09:00
NanoCode012
df9528f865
Fix future deprecation of prepare_model_for_int8_training
2023-06-08 21:42:10 +09:00
Angainor Development
193c73bce0
Fix training over existing lora
...
When training with LoRA and starting from existing LoRA weights, the current code produces a model with 0 trainable params, so training can't work.
Adding the "is_trainable" param allows the loaded PEFT model to be trained and fixes the bug.
2023-06-08 09:18:58 +02:00
Wing Lian
59bb2197ed
fix camel ai, add guanaco/oasst mapping for sharegpt
2023-06-07 09:51:29 -04:00
Wing Lian
4ac9e251b7
new prompters, misc fixes for output dir missing using fsdp, and changing max seq len
2023-06-05 22:41:00 -04:00
NanoCode012
3c71c8debe
Update doc for grad_accu and add validation tests for batch size
2023-06-01 06:13:47 +09:00
Wing Lian
5a631b305b
fix batch size calculation
2023-05-31 14:11:32 -04:00
Wing Lian
9b8585dc70
fix packing so that concatenated sequences reset the attention
2023-05-31 11:38:52 -04:00
Wing Lian
2d0ba3b818
Merge pull request #124 from OpenAccess-AI-Collective/xformers-fix
...
copy xformers attn from ooba since we removed dep on alpaca_lora_4bit
2023-05-31 00:11:40 -04:00
Wing Lian
c7021e191f
Merge pull request #120 from OpenAccess-AI-Collective/model-from-path
...
split up llama model loading so config can be loaded from base config and models can be loaded from a path
2023-05-31 00:08:38 -04:00
Wing Lian
c56818b119
don't worry about dupes
2023-05-31 00:06:47 -04:00
Wing Lian
1076bcbbca
Update src/axolotl/monkeypatch/llama_attn_hijack_xformers.py
...
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-05-31 00:00:19 -04:00
Wing Lian
2daa6835f0
Update src/axolotl/monkeypatch/llama_attn_hijack_xformers.py
...
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-05-30 23:59:05 -04:00
Wing Lian
e3c494ca7b
remove unused import and update readme
2023-05-30 23:55:45 -04:00
Wing Lian
ad0ea6aaab
black formatting
...
ignore copied file
fix linting
2023-05-30 23:50:29 -04:00
Wing Lian
6cb2310592
copy xformers attn from ooba since we removed dep on alpaca_lora_4bit
2023-05-30 23:34:36 -04:00
Wing Lian
3aad5f3b3e
add support for gradient accumulation steps
2023-05-30 23:24:37 -04:00
Wing Lian
39a208c2bc
fix up tokenizer config, isort fix
2023-05-30 23:00:02 -04:00
Wing Lian
2520ecd6df
split up llama model loading so config can be loaded from base config and models can be loaded from a path
2023-05-30 22:32:44 -04:00
NanoCode012
594e72b6e8
Fix incorrect rebase
2023-05-31 02:58:50 +09:00
NanoCode012
25eeeeba0b
Fix sharegpt prompt
2023-05-31 02:55:21 +09:00
Wing Lian
cfcc549f6b
fix relative path for fixtures
2023-05-31 02:55:21 +09:00
NanoCode012
a1f9850b91
Fix security issue or ignore false positives
2023-05-31 02:53:53 +09:00
NanoCode012
c17dae6d07
Update src/axolotl/prompt_strategies/alpaca_instruct.py
...
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-05-31 02:53:53 +09:00
NanoCode012
37293dce07
Apply isort then black
2023-05-31 02:53:53 +09:00
NanoCode012
e9650d3ae4
Fix mypy typing
2023-05-31 02:53:53 +09:00
NanoCode012
be22551435
Fix unsupported operand type(s) for |
2023-05-31 02:53:53 +09:00
NanoCode012
b832a0ac62
Black formatting
2023-05-31 02:53:53 +09:00