2a801b001a  2023-06-09 14:28:44 +09:00  NanoCode012   Fix grad checkpoint and outputs param
e44c9e0b3e  2023-06-09 14:27:24 +09:00  NanoCode012   Fix patching via import instead of hijacking
55b8542de8  2023-06-09 12:54:08 +09:00  NanoCode012   Feat: Add landmark attention
f4df266842  2023-06-08 21:02:02 -03:00  Bruno Cabral  Disable Wandb
2ef4634d45  2023-06-09 01:23:13 +09:00  NanoCode012   Refactor out unmodified save_steps and eval_steps
2cfe9e9b16  2023-06-09 01:02:36 +09:00  NanoCode012   Set to use cfg.seed or 42 for backward compat
bfd27ba55e  2023-06-09 00:35:03 +09:00  NanoCode012   Fix failing test
babf0fdb71  2023-06-09 00:29:04 +09:00  NanoCode012   Validate falcon with fsdp
df9528f865  2023-06-08 21:42:10 +09:00  NanoCode012   Fix future deprecate prepare_model_for_int8_training
59bb2197ed  2023-06-07 09:51:29 -04:00  Wing Lian     fix camel ai, add guanaco/oasst mapping for sharegpt
4ac9e251b7  2023-06-05 22:41:00 -04:00  Wing Lian     new prompters, misc fixes for output dir missing using fsdp, and changing max seq len
3c71c8debe  2023-06-01 06:13:47 +09:00  NanoCode012   Update doc for grad_accu and add validation tests for batch size
5a631b305b  2023-05-31 14:11:32 -04:00  Wing Lian     fix batch size calculation
9b8585dc70  2023-05-31 11:38:52 -04:00  Wing Lian     fix packing so that concatenated sequences reset the attention
2d0ba3b818  2023-05-31 00:11:40 -04:00  Wing Lian     Merge pull request #124 from OpenAccess-AI-Collective/xformers-fix
                                                      copy xformers attn from ooba since we removed dep on alpaca_lora_4bit
c7021e191f  2023-05-31 00:08:38 -04:00  Wing Lian     Merge pull request #120 from OpenAccess-AI-Collective/model-from-path
                                                      split up llama model loading so config can be loaded from base config and models can be loaded from a path
c56818b119  2023-05-31 00:06:47 -04:00  Wing Lian     don't worry about dupes
1076bcbbca  2023-05-31 00:00:19 -04:00  Wing Lian     Update src/axolotl/monkeypatch/llama_attn_hijack_xformers.py
                                                      Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2daa6835f0  2023-05-30 23:59:05 -04:00  Wing Lian     Update src/axolotl/monkeypatch/llama_attn_hijack_xformers.py
                                                      Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
e3c494ca7b  2023-05-30 23:55:45 -04:00  Wing Lian     remove unused import and update readme
ad0ea6aaab  2023-05-30 23:50:29 -04:00  Wing Lian     black formatting
                                                      ignore copied file
                                                      fix linting
6cb2310592  2023-05-30 23:34:36 -04:00  Wing Lian     copy xformers attn from ooba since we removed dep on alpaca_lora_4bit
3aad5f3b3e  2023-05-30 23:24:37 -04:00  Wing Lian     add support for gradient accumulation steps
39a208c2bc  2023-05-30 23:00:02 -04:00  Wing Lian     fix up tokenizer config, isort fix
2520ecd6df  2023-05-30 22:32:44 -04:00  Wing Lian     split up llama model loading so config can be loaded from base config and models can be loaded from a path
594e72b6e8  2023-05-31 02:58:50 +09:00  NanoCode012   Fix incorrect rebase
25eeeeba0b  2023-05-31 02:55:21 +09:00  NanoCode012   Fix sharegpt prompt
cfcc549f6b  2023-05-31 02:55:21 +09:00  Wing Lian     fix relative path for fixtures
a1f9850b91  2023-05-31 02:53:53 +09:00  NanoCode012   Fix security issue or ignore false positives
c17dae6d07  2023-05-31 02:53:53 +09:00  NanoCode012   Update src/axolotl/prompt_strategies/alpaca_instruct.py
                                                      Co-authored-by: Wing Lian <wing.lian@gmail.com>
37293dce07  2023-05-31 02:53:53 +09:00  NanoCode012   Apply isort then black
e9650d3ae4  2023-05-31 02:53:53 +09:00  NanoCode012   Fix mypy typing
be22551435  2023-05-31 02:53:53 +09:00  NanoCode012   Fix unsupported operand type(s) for |
b832a0ac62  2023-05-31 02:53:53 +09:00  NanoCode012   Black formatting
8e46c0fb0d  2023-05-31 02:53:53 +09:00  NanoCode012   Refactor duplicate code between Prompter and Pygmalion
9c6750a075  2023-05-31 02:53:53 +09:00  NanoCode012   Lint wandb
c2dbf2c526  2023-05-31 02:53:53 +09:00  NanoCode012   Lint validation
e6b57decbd  2023-05-31 02:53:53 +09:00  NanoCode012   Lint tokenization
fe1f4c4e7d  2023-05-31 02:53:53 +09:00  NanoCode012   Lint schedulers
633ff2150f  2023-05-31 02:53:53 +09:00  NanoCode012   Lint dict
5d86137f70  2023-05-31 02:53:53 +09:00  NanoCode012   Lint prompt_tokenizers
01c8a333b3  2023-05-31 02:53:53 +09:00  NanoCode012   Lint pygmalion
1645a4ddd5  2023-05-31 02:53:53 +09:00  NanoCode012   Lint creative_acr
145b060cbe  2023-05-31 02:53:53 +09:00  NanoCode012   Lint alpaca_instruct
8cc0aadcb8  2023-05-31 02:53:53 +09:00  NanoCode012   Lint alpaca_chat
6abb7f6a16  2023-05-31 02:53:53 +09:00  NanoCode012   Lint datasets
de2406c488  2023-05-31 02:53:53 +09:00  NanoCode012   Lint convert.py
ddb86ea821  2023-05-31 02:53:53 +09:00  NanoCode012   Lint trainer.py
f4e5d86268  2023-05-31 02:53:23 +09:00  NanoCode012   Lint models.py
69722aeef4  2023-05-31 02:53:23 +09:00  NanoCode012   Remove fixme disable