NanoCode012
281dc3df59
Merge pull request #167 from NanoCode012/fix/redundant-save-eval-steps
...
Fix: Refactor out unmodified save_steps and eval_steps
2023-06-09 01:39:33 +09:00
NanoCode012
2ef4634d45
Refactor out unmodified save_steps and eval_steps
2023-06-09 01:23:13 +09:00
NanoCode012
7eae90333e
Merge pull request #166 from NanoCode012/fix/seed
...
Fix: Set to use cfg.seed or 42 for seed
2023-06-09 01:15:08 +09:00
NanoCode012
c8242de725
Merge pull request #132 from utensil/falcon-7b-qlora
...
Axolotl supports falcon + qlora
2023-06-09 01:14:03 +09:00
NanoCode012
2cfe9e9b16
Set to use cfg.seed or 42 for backward compat
2023-06-09 01:02:36 +09:00
Utensil
79a8f52181
Trim trailing whitespace
2023-06-08 23:48:57 +08:00
NanoCode012
afaa0d2c01
Merge pull request #164 from NanoCode012/fix/falcon-fsdp-validate
...
Fix: Validate falcon with fsdp
2023-06-09 00:44:12 +09:00
NanoCode012
bfd27ba55e
Fix failing test
2023-06-09 00:35:03 +09:00
NanoCode012
babf0fdb71
Validate falcon with fsdp
2023-06-09 00:29:04 +09:00
Utensil
a52f4816b0
Default wandb_project to empty as suggested
...
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-06-08 23:04:19 +08:00
NanoCode012
81911d112c
Merge pull request #163 from NanoCode012/feat/matmul-tf32
...
Feat: Set matmul tf32=True when tf32 passed
2023-06-09 00:01:31 +09:00
NanoCode012
52765ac588
Set matmul tf32
2023-06-08 23:41:12 +09:00
NanoCode012
73e9ea4069
Merge pull request #143 from NanoCode012/fix/deprecate-prepare-8bit-training
...
Fix future deprecate prepare_model_for_int8_training
2023-06-08 23:07:53 +09:00
NanoCode012
f8d379883d
Merge pull request #162 from NanoCode012/fix/custom-prompt-readme
...
Fix: Move custom prompts out of hidden
2023-06-08 23:05:17 +09:00
NanoCode012
04a1b77307
Merge pull request #161 from NanoCode012/fix/peft-setup
...
Fix: Update peft and gptq instruction
2023-06-08 23:01:53 +09:00
NanoCode012
2097a09d2d
Move custom prompts out of hidden
2023-06-08 22:53:56 +09:00
NanoCode012
cfff94b123
Add peft install for quickstart
2023-06-08 22:50:20 +09:00
NanoCode012
2b222de5b6
Update peft and gptq instruction
2023-06-08 22:48:26 +09:00
NanoCode012
df9528f865
Fix future deprecate prepare_model_for_int8_training
2023-06-08 21:42:10 +09:00
Angainor Development
193c73bce0
Fix training over existing lora
...
When training with LoRA and starting from existing LoRA weights, the current code produces a model with 0 trainable parameters, so training cannot work.
Adding the "is_trainable" parameter allows the loaded PEFT model to be trained and fixes the bug.
2023-06-08 09:18:58 +02:00
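The fix described in the commit above can be illustrated with a minimal sketch. It is not the commit's actual diff: the helper name `load_trainable_adapter` and its arguments are hypothetical, and the only library call assumed is `PeftModel.from_pretrained` from the `peft` package, which accepts an `is_trainable` keyword that defaults to `False` (adapter weights loaded frozen, for inference).

```python
def load_trainable_adapter(base_model, adapter_dir):
    """Load existing LoRA weights onto base_model so they can keep training.

    Hypothetical helper for illustration only. `base_model` is an already
    loaded transformers model and `adapter_dir` is a path to saved adapter
    weights; both names are assumptions, not taken from the commit.
    """
    # Imported lazily so this sketch can be read and imported without peft.
    from peft import PeftModel

    # With the default is_trainable=False the adapter parameters are frozen,
    # which matches the "0 trainable params" symptom in the commit message.
    return PeftModel.from_pretrained(base_model, adapter_dir, is_trainable=True)
```

After loading, calling `print_trainable_parameters()` on the returned model is one way to check that the adapter weights are no longer frozen.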
Wing Lian
6abfd87d44
Merge pull request #158 from OpenAccess-AI-Collective/prompter-fixes
...
fix camel ai, add guanaco/oasst mapping for sharegpt
2023-06-07 11:02:30 -04:00
Wing Lian
59bb2197ed
fix camel ai, add guanaco/oasst mapping for sharegpt
2023-06-07 09:51:29 -04:00
Wing Lian
9a02e7e1ff
Merge pull request #155 from OpenAccess-AI-Collective/misc-fixes
...
new prompters, misc fixes for output dir missing using fsdp, and changing max seq len
2023-06-06 16:52:39 -04:00
Wing Lian
5b33e295bd
update docs
2023-06-05 22:48:16 -04:00
Wing Lian
4ac9e251b7
new prompters, misc fixes for output dir missing using fsdp, and changing max seq len
2023-06-05 22:41:00 -04:00
Utensil
c9c050316f
Default micro_batch_size to 1 for a safer start
2023-06-03 17:26:33 +08:00
Utensil
ca11ae9689
Add comments/alternatives for falcon-qlora configs
2023-06-03 15:04:02 +08:00
Wing Lian
328c3bce96
Merge pull request #149 from OpenAccess-AI-Collective/docker-clone-axolotl
...
clone in docker
2023-06-02 15:15:30 -04:00
Wing Lian
5cd2126439
shallow clone
2023-06-02 14:54:28 -04:00
Wing Lian
12620f3089
clone in docker
2023-06-02 14:52:50 -04:00
Wing Lian
4ab0c8b201
Merge pull request #148 from OpenAccess-AI-Collective/fix-device-load
2023-06-02 14:37:17 -04:00
Wing Lian
74ebbf4371
fix device map
2023-06-02 14:29:08 -04:00
Wing Lian
76a70fd739
Merge pull request #147 from OpenAccess-AI-Collective/winglian-rocker-images
...
Update README.md for correct image tags
2023-06-02 14:10:40 -04:00
Wing Lian
618816d4df
Update README.md for correct image tags
2023-06-02 14:10:23 -04:00
Wing Lian
91992cb8f5
Merge pull request #146 from FarisHijazi/main
...
added docker-compose file
2023-06-02 13:58:23 -04:00
FarisHijazi
84169d15b3
added docker-compose file
2023-06-02 18:17:43 +03:00
Wing Lian
ecfe8d0a1a
Merge pull request #142 from NanoCode012/feat/custom-prompt-readme
...
Feat: Add custom prompt readme and add missing prompt strategies to Readme
2023-06-02 07:21:04 -04:00
Wing Lian
eee44a3b47
Merge pull request #141 from NanoCode012/feat/lambdalabs-readme
...
Feat: Add lambdalabs instruction
2023-06-02 07:20:12 -04:00
NanoCode012
078a43eef8
Remove redundant instruction
2023-06-02 12:30:11 +09:00
NanoCode012
33e1890086
Add pygmalion
2023-06-02 12:27:51 +09:00
NanoCode012
1c38253692
Add other prompt_strategies
2023-06-02 12:24:44 +09:00
NanoCode012
496b83f778
Add short instruction for custom prompts
2023-06-02 12:16:20 +09:00
NanoCode012
ff68a95781
Add lambdalabs instruction
2023-06-02 12:09:40 +09:00
Utensil
fb3d40f197
falcon + qlora + xformer mbs 40 gas 2 on A6000
2023-06-01 18:29:20 +08:00
NanoCode012
288fd62431
Merge pull request #135 from NanoCode012/fix/grad-accu-readme
...
Fix: Update doc for grad_accu and add validation tests for batch size
2023-06-01 06:33:05 +09:00
NanoCode012
3c71c8debe
Update doc for grad_accu and add validation tests for batch size
2023-06-01 06:13:47 +09:00
Wing Lian
a6f5e5eaec
Merge pull request #134 from OpenAccess-AI-Collective/gas-batch-fix
...
fix batch size calculation
2023-05-31 14:24:48 -04:00
Wing Lian
5a631b305b
fix batch size calculation
2023-05-31 14:11:32 -04:00
Wing Lian
f94dd626f0
Merge pull request #130 from OpenAccess-AI-Collective/gas
...
swap batch size for gradient accumulation steps to decouple from num gpu
2023-05-31 13:03:51 -04:00
Wing Lian
5079753b7a
Merge pull request #131 from OpenAccess-AI-Collective/fix-packing-mask
...
fix packing so that concatenated sequences reset the attention
2023-05-31 13:03:37 -04:00