Commit Graph

  • f31a338cbb Merge pull request #191 from OpenAccess-AI-Collective/NanoCode012-patch-1 NanoCode012 2023-06-12 02:55:37 +09:00
  • 4cd1deeef2 Add save_steps and eval_steps to Readme NanoCode012 2023-06-12 02:44:46 +09:00
  • 9ac16ed8d1 Merge pull request #190 from OpenAccess-AI-Collective/fixes-20230711-v2 Wing Lian 2023-06-11 13:27:08 -04:00
  • 6b3f509d9e forgot to add this file Wing Lian 2023-06-11 11:50:12 -04:00
  • 336aa3fd48 gptq lora llama is obviously good Wing Lian 2023-06-11 11:05:29 -04:00
  • d0d7eaa4f3 update openllama and clean up paths Wing Lian 2023-06-11 11:03:31 -04:00
  • a6ebf57e82 fix table formatting Wing Lian 2023-06-11 10:55:32 -04:00
  • 280832cec2 more matrix updates Wing Lian 2023-06-11 10:52:36 -04:00
  • a43bae9ff0 update the support matrix Wing Lian 2023-06-11 10:44:03 -04:00
  • effbbf6dd1 more pruning Wing Lian 2023-06-11 10:38:24 -04:00
  • c9a149f9e8 add check for attr Wing Lian 2023-06-11 10:11:17 -04:00
  • c530e4b9c8 more config pruning and migrating Wing Lian 2023-06-11 10:09:05 -04:00
  • f620706776 Merge pull request #189 from OpenAccess-AI-Collective/fixes-20230711 Wing Lian 2023-06-11 09:49:23 -04:00
  • 77762a5d6b get rid of some configs, formalize pythioa lora config Wing Lian 2023-06-11 09:41:41 -04:00
  • 14668fa54e new validation for mpt w grad checkpoints Wing Lian 2023-06-11 09:26:10 -04:00
  • b565ecf0a1 Fix strict and Lint AngainorDev 2023-06-11 15:23:38 +02:00
  • fe0b76854e match up gradient checkpointing when using lora w config Wing Lian 2023-06-11 09:20:40 -04:00
  • e944311442 Merge pull request #186 from akj2018/main NanoCode012 2023-06-11 19:45:06 +09:00
  • e3e7b52a5b Update FAQS.md Akshay Jain 2023-06-10 23:36:14 -07:00
  • 974dc00a7d Fix set mem_id for inference and refactor NanoCode012 2023-06-11 14:00:54 +09:00
  • 572d1141e6 Set mem cache args on inference NanoCode012 2023-06-11 12:05:37 +09:00
  • a6190c8094 Clean up landmark patching NanoCode012 2023-06-11 11:59:03 +09:00
  • 563b6d89e6 Fix undefined LlamaForCausalLM and del try except NanoCode012 2023-06-11 11:58:31 +09:00
  • cd0a6f6027 peft no longer needs device_map Wing Lian 2023-06-10 22:50:09 -04:00
  • 0e664a5ebc Update FAQS.md Akshay Jain 2023-06-10 19:26:12 -07:00
  • dd7d16d2eb Update FAQS.md Akshay Jain 2023-06-10 19:15:50 -07:00
  • e285e24f7f Address PR suggestion NanoCode012 2023-06-11 10:52:12 +09:00
  • 919727b4d7 Refactor landmark attention patch NanoCode012 2023-06-10 08:09:29 +09:00
  • 5ffefee37f Update FAQS.md Akshay Jain 2023-06-10 18:34:54 -07:00
  • d9f713e4e3 Merge pull request #183 from OpenAccess-AI-Collective/inference-from-stdin Wing Lian 2023-06-10 17:06:55 -04:00
  • 958da70376 fix formatting Wing Lian 2023-06-10 15:28:08 -04:00
  • c4e4f8115c pass a prompt in from stdin for inference Wing Lian 2023-06-10 15:07:40 -04:00
  • a808bf913f Fix missing cfg. Angainor Development 2023-06-10 20:28:49 +02:00
  • 01248253a3 Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref Wing Lian 2023-06-10 14:25:51 -04:00
  • 759e8673ce Update scripts/finetune.py Wing Lian 2023-06-10 14:25:21 -04:00
  • 0c6f928601 address PR feedback Wing Lian 2023-06-10 14:21:43 -04:00
  • eea2731a5e add streaming dataset support for pretraining datasets Wing Lian 2023-06-09 20:25:38 -04:00
  • 1db46a9c72 linting fix Wing Lian 2023-06-08 22:05:06 -04:00
  • ab5cd28acf more gpt-neox long ctx fixes Wing Lian 2023-06-01 08:20:08 -04:00
  • 1a82082e91 fix bettertransformers save, force it to skip after saving correctly in callback Wing Lian 2023-06-01 00:33:13 -04:00
  • 1210dc8fd5 more tweaks to do pre-training with bettertransformers Wing Lian 2023-05-31 21:59:15 -04:00
  • 488a67d75a experimental expansion of ctx len Wing Lian 2023-05-31 16:51:19 -04:00
  • 71a43f8479 add validation/warning for bettertransformers and torch version Wing Lian 2023-05-28 08:56:08 -04:00
  • 39619028a3 use pythia-12b, neox-20b is flaky Wing Lian 2023-05-27 19:37:24 -04:00
  • 8792199799 add flash attn context for efficient training and attempt setting model to train mode: Wing Lian 2023-05-27 18:12:12 -04:00
  • 1edc30c786 add support for opimum bettertransformers Wing Lian 2023-05-27 17:57:29 -04:00
  • 14163c15d9 fix for local variable 'LlamaForCausalLM' referenced before assignment Wing Lian 2023-06-10 14:11:13 -04:00
  • 41e4f6ca31 Merge pull request #181 from OpenAccess-AI-Collective/xpos-rope Wing Lian 2023-06-10 14:04:03 -04:00
  • 79e2a6f140 Merge branch 'main' into patch-1 Angainor Development 2023-06-10 19:07:54 +02:00
  • c2508987a6 Remove explicit definition of cfg.inference Angainor Development 2023-06-10 19:06:10 +02:00
  • 215d775147 Merge pull request #180 from Glavin001/feat/stream-inference Wing Lian 2023-06-10 12:04:34 -04:00
  • f36e227eaf formatting for linter Wing Lian 2023-06-10 12:00:52 -04:00
  • 5878bb1f3a add option to readme Wing Lian 2023-06-10 11:57:41 -04:00
  • a03a7d7d8b add support to extend context with xpos rope Wing Lian 2023-06-10 10:29:46 -04:00
  • fec6bcc3e6 Add streaming inference & fix stopping at EOS Glavin Wiechert 2023-06-10 08:14:47 +00:00
  • 931e606459 Merge pull request #179 from OpenAccess-AI-Collective/fix-max_seq_len Wing Lian 2023-06-09 20:52:03 -04:00
  • 7f09106437 fix for max sequence len across different model types Wing Lian 2023-06-09 20:42:33 -04:00
  • 6b50200234 Merge pull request #178 from PocketDocLabs/main NanoCode012 2023-06-10 08:26:48 +09:00
  • 16f9e28048 Update README.md to reflect current gradient checkpointing support PocketDocLabs 2023-06-09 16:10:58 -07:00
  • b9083a7fc1 Merge pull request #176 from NanoCode012/fix/peft-import NanoCode012 2023-06-10 07:56:35 +09:00
  • aefb2fc681 Fix backward compat for peft NanoCode012 2023-06-10 07:46:36 +09:00
  • b5aa8d854c Merge pull request #169 from NanoCode012/feat/landmark NanoCode012 2023-06-10 07:26:06 +09:00
  • 4d6490bce2 Merge pull request #171 from OpenAccess-AI-Collective/NanoCode012-falcon-lora-matrix NanoCode012 2023-06-09 17:58:22 +09:00
  • b242b69e10 Fix falcon support lora NanoCode012 2023-06-09 17:50:16 +09:00
  • 320beb20f4 Merge pull request #170 from OpenAccess-AI-Collective/NanoCode012-lambdalabs-fix NanoCode012 2023-06-09 16:52:27 +09:00
  • bd3b537344 Feed cfg.inference Angainor Development 2023-06-09 08:59:05 +02:00
  • 813cfa4c14 WIP: Rely on cfg.inference Angainor Development 2023-06-09 08:49:32 +02:00
  • 2e13ceff37 Improve lambda labs instruction NanoCode012 2023-06-09 15:03:08 +09:00
  • 2a801b001a Fix grad checkpoint and outputs param NanoCode012 2023-06-09 14:28:44 +09:00
  • e44c9e0b3e Fix patching via import instead of hijacking NanoCode012 2023-06-09 14:27:24 +09:00
  • 55b8542de8 Feat: Add landmark attention NanoCode012 2023-06-09 12:54:08 +09:00
  • febe902517 Merge pull request #168 from bratao/main Wing Lian 2023-06-08 22:05:56 -04:00
  • f4df266842 Disable Wandb Bruno Cabral 2023-06-08 21:02:02 -03:00
  • 281dc3df59 Merge pull request #167 from NanoCode012/fix/redundant-save-eval-steps NanoCode012 2023-06-09 01:39:33 +09:00
  • 2ef4634d45 Refactor out unmodified save_steps and eval_steps NanoCode012 2023-06-09 01:23:13 +09:00
  • 7eae90333e Merge pull request #166 from NanoCode012/fix/seed NanoCode012 2023-06-09 01:15:08 +09:00
  • c8242de725 Merge pull request #132 from utensil/falcon-7b-qlora NanoCode012 2023-06-09 01:14:03 +09:00
  • 2cfe9e9b16 Set to use cfg.seed or 42 for backward compat NanoCode012 2023-06-09 01:02:36 +09:00
  • 79a8f52181 Trim trailing whitespace Utensil 2023-06-08 23:48:57 +08:00
  • afaa0d2c01 Merge pull request #164 from NanoCode012/fix/falcon-fsdp-validate NanoCode012 2023-06-09 00:44:12 +09:00
  • bfd27ba55e Fix failing test NanoCode012 2023-06-09 00:35:03 +09:00
  • babf0fdb71 Validate falcon with fsdp NanoCode012 2023-06-09 00:29:04 +09:00
  • a52f4816b0 Default wandb_project to empty as suggested Utensil 2023-06-08 23:04:19 +08:00
  • 81911d112c Merge pull request #163 from NanoCode012/feat/matmul-tf32 NanoCode012 2023-06-09 00:01:31 +09:00
  • 52765ac588 Set matmul tf32 NanoCode012 2023-06-08 23:41:12 +09:00
  • 73e9ea4069 Merge pull request #143 from NanoCode012/fix/deprecate-prepare-8bit-training NanoCode012 2023-06-08 23:07:53 +09:00
  • f8d379883d Merge pull request #162 from NanoCode012/fix/custom-prompt-readme NanoCode012 2023-06-08 23:05:17 +09:00
  • 04a1b77307 Merge pull request #161 from NanoCode012/fix/peft-setup NanoCode012 2023-06-08 23:01:53 +09:00
  • 2097a09d2d Move custom prompts out of hidden NanoCode012 2023-06-08 22:53:56 +09:00
  • cfff94b123 Add peft install for quickstart NanoCode012 2023-06-08 22:50:20 +09:00
  • 2b222de5b6 Update peft and gptq instruction NanoCode012 2023-06-08 22:48:26 +09:00
  • df9528f865 Fix future deprecate prepare_model_for_int8_training NanoCode012 2023-06-02 12:38:57 +09:00
  • 193c73bce0 Fix training over existing lora Angainor Development 2023-06-08 09:18:58 +02:00
  • 6abfd87d44 Merge pull request #158 from OpenAccess-AI-Collective/prompter-fixes Wing Lian 2023-06-07 11:02:30 -04:00
  • 59bb2197ed fix camel ai, add guanaco/oasst mapping for sharegpt Wing Lian 2023-06-07 09:51:29 -04:00
  • 9a02e7e1ff Merge pull request #155 from OpenAccess-AI-Collective/misc-fixes Wing Lian 2023-06-06 16:52:39 -04:00
  • 5b33e295bd update docs Wing Lian 2023-06-05 22:48:16 -04:00
  • 4ac9e251b7 new prompters, misc fixes for output dir missing using fsdp, and changing max seq len Wing Lian 2023-06-05 22:41:00 -04:00
  • c9c050316f Default micro_batch_size to 1 for a safer start Utensil 2023-06-03 17:26:33 +08:00
  • ca11ae9689 Add comments/alternatives for falcon-qlora configs Utensil 2023-06-03 15:04:02 +08:00