Commit Graph

  • 0e6efaa10c fix: manually set auto-map NanoCode012 2025-02-05 19:35:15 +07:00
  • c4cb622590 fix: remove redundant files NanoCode012 2025-02-05 19:34:06 +07:00
  • 0f82bd2d18 chore: improve instruction and made linearize optional NanoCode012 2025-02-05 19:33:15 +07:00
  • 49746b184f chore: flatten directory structure and register to autoclass to save NanoCode012 2025-02-05 19:17:57 +07:00
  • 9e1c4de13c fix: assign linear head instead of loading state dict NanoCode012 2025-02-05 18:24:31 +07:00
  • 2d5f692fc0 refactor: move to modeling file and remove axolotl imports NanoCode012 2025-02-05 18:16:39 +07:00
  • 2fd5c45c2e chore: refactor register linear llama NanoCode012 2025-02-05 18:03:04 +07:00
  • 8294e6218f fix: freeze base_model and register config into Auto class NanoCode012 2025-02-05 15:59:06 +07:00
  • 253dcdd0cf fix: proprerly return causal model NanoCode012 2025-02-05 15:56:57 +07:00
  • 4cc60df876 fix: config to allow optional input NanoCode012 2025-02-05 15:52:30 +07:00
  • d0e739da24 attempt at getting around bf16 error Sunny Liu 2025-02-04 21:57:21 -05:00
  • 3f6be519d5 stack Sunny Liu 2025-02-04 21:25:13 -05:00
  • adcbc7459b misc Sunny Liu 2025-02-04 21:17:50 -05:00
  • 470ba65c44 make doc mask instead of the whole block mask in collator Sunny Liu 2025-02-04 20:27:39 -05:00
  • 753146b458 max_length moved to reward config Wing Lian 2025-02-04 11:05:00 -05:00
  • d683c50113 fix config cls Wing Lian 2025-02-04 09:08:08 -05:00
  • 234cd8311e fix failure case in prompter loading Wing Lian 2025-02-04 08:43:33 -05:00
  • f9893e3842 fix dpo config and add use_logits_to_keep Wing Lian 2025-02-04 08:39:37 -05:00
  • ac1ebc58a8 add support for num_generations Wing Lian 2025-02-03 22:10:32 -05:00
  • 56f3b9f20f bump pydantic to support vllm Wing Lian 2025-02-03 20:02:18 -05:00
  • 2c1376d8c4 don't shrink embeddings unless told to Wing Lian 2025-02-03 10:10:22 -05:00
  • 3c7517fd55 add support for passing map kwargs to dataset map in rl Wing Lian 2025-02-03 09:04:30 -05:00
  • 1e94d7ef65 more fixes to get grpo working Wing Lian 2025-02-03 08:32:44 -05:00
  • cfc7fe0df2 remove ununsable args kwargs Wing Lian 2025-02-03 00:54:52 -05:00
  • 3c4fe478cf be nice with self.cfg.dataset_processes Wing Lian 2025-02-03 00:50:09 -05:00
  • c810599c66 order matters Wing Lian 2025-02-03 00:47:48 -05:00
  • 300ffc2cb6 make it a dataclass Wing Lian 2025-02-03 00:45:58 -05:00
  • b1c4711145 load the class from strat Wing Lian 2025-02-03 00:31:55 -05:00
  • d155849e2c use correct builder Wing Lian 2025-02-03 00:25:24 -05:00
  • 626db6cb84 collator for grpo and prompt loader Wing Lian 2025-02-03 00:18:17 -05:00
  • 79159b4871 support custom module prompt strategy for rl Wing Lian 2025-02-02 23:47:00 -05:00
  • 704ddd6ff1 honor skip prepare for rl Wing Lian 2025-02-02 23:39:10 -05:00
  • 54b0d3d0e8 passthrough dataset parser for dpo/grpo Wing Lian 2025-02-02 23:36:22 -05:00
  • 59ad21f2de refactor a bit for better grpo support Wing Lian 2025-02-02 23:22:36 -05:00
  • 57264b6491 respect dotenv for cli Wing Lian 2025-02-02 22:03:00 -05:00
  • d495e41ba1 refactor dpo trainer into own module Wing Lian 2025-02-02 18:13:09 -05:00
  • 6067fe6c28 upgrade trl to 0.14.0 Wing Lian 2025-02-02 18:06:07 -05:00
  • c5313123e3 Built site for gh-pages Quarto GHA Workflow Runner 2025-02-04 14:44:36 +00:00
  • a620d481e2 fix: drop long seq even if not sample packing (#2211) NanoCode012 2025-02-04 21:43:35 +07:00
  • ca379405c1 use narrow as a view on the student logits instead of slicing [kd-logits-view] Wing Lian 2025-02-04 09:34:26 -05:00
  • 2bc7833a4e feat: integrate new modelling into cli NanoCode012 2025-02-04 19:46:05 +07:00
  • 1fb8d86396 fix: handle num_items_in_batch NanoCode012 2025-02-04 19:32:20 +07:00
  • adeefc1991 feat: refactor into modeling code NanoCode012 2025-02-04 19:29:42 +07:00
  • fb88269dcb fix: set model_accepts_loss_kwargs=False NanoCode012 2025-02-04 02:01:05 +07:00
  • 433cf4a8c7 fix: compute_loss return sig NanoCode012 2025-02-04 01:53:18 +07:00
  • 0b7b58c8be feat: migrate to transformers 4.48 attention sig NanoCode012 2025-02-04 01:52:35 +07:00
  • 81731adc1d fix: missing input arg NanoCode012 2025-02-04 01:51:33 +07:00
  • a1715aa317 chore: add todo NanoCode012 2025-02-03 22:47:25 +07:00
  • ce0cd470f7 feat: add convert linear attention cli NanoCode012 2025-02-03 22:46:09 +07:00
  • 311d6eb5da feat: add lolcats with fixed typed NanoCode012 2025-02-03 22:38:19 +07:00
  • 8e1adc154d stuff Sunny Liu 2025-02-02 20:36:14 -05:00
  • e5b36900e4 misc Sunny Liu 2025-02-02 20:32:03 -05:00
  • 9f6c89b12b undo my stupidity Sunny Liu 2025-02-02 20:25:53 -05:00
  • b0871c8d3b attempt - mask padding Sunny Liu 2025-02-02 20:18:49 -05:00
  • d3ea379a23 figure out slight diff from flash result bursteratom 2025-02-02 01:45:54 -05:00
  • 0ebab63309 test bursteratom 2025-02-02 01:27:15 -05:00
  • e98581f6f5 BLOCK SIZE bursteratom 2025-02-02 01:22:23 -05:00
  • b832b11c8f stuff bursteratom 2025-02-02 00:51:43 -05:00
  • b692d394b1 more test bursteratom 2025-02-02 00:48:57 -05:00
  • 2319e5276d more test bursteratom 2025-02-02 00:48:15 -05:00
  • 9a43a0925d more test bursteratom 2025-02-02 00:45:30 -05:00
  • 10de67e8ea more test bursteratom 2025-02-02 00:43:41 -05:00
  • fa7355404c test bursteratom 2025-02-02 00:38:35 -05:00
  • 907424a2e8 stuff bursteratom 2025-02-02 00:29:09 -05:00
  • 3f4fd3c1eb remove padding self attention Sunny Liu 2025-02-01 22:47:10 -05:00
  • f0abc453f1 Built site for gh-pages Quarto GHA Workflow Runner 2025-02-02 02:12:15 +00:00
  • 4cc660e3e2 Built site for gh-pages Quarto GHA Workflow Runner 2025-02-02 02:11:40 +00:00
  • 158330ab60 [feature] sweeps (#2171) Wing Lian 2025-02-01 21:11:18 -05:00
  • 80e1468b8d better handling of multipack dataset length (#2296) Wing Lian 2025-02-01 21:10:34 -05:00
  • 48c3c47071 vanills mask Sunny Liu 2025-02-01 14:23:37 -05:00
  • 3ed9c117fb try vanilla mask Sunny Liu 2025-02-01 14:09:13 -05:00
  • dda1a4798a Built site for gh-pages Quarto GHA Workflow Runner 2025-02-01 01:20:18 +00:00
  • 515590d726 Built site for gh-pages Quarto GHA Workflow Runner 2025-02-01 01:19:47 +00:00
  • a20f17689b set MODAL_IMAGE_BUILDER_VERSION=2024.10 to 2024.10 to test latest builder (#2302) Wing Lian 2025-01-31 20:19:20 -05:00
  • 78ce268848 KD Trainer w logprobs (#2303) Wing Lian 2025-01-31 20:18:52 -05:00
  • 5b56cc18d5 remove fastapi and pydantic extras [modal-upgrade-builder] Wing Lian 2025-01-31 11:17:42 -05:00
  • 596211d125 Built site for gh-pages Quarto GHA Workflow Runner 2025-01-31 13:59:17 +00:00
  • d425d5d3c3 fix: add warning for invalid eval_steps or save_steps (#2298) NanoCode012 2025-01-31 20:58:25 +07:00
  • cf17649ef3 Misc fixes 20250130 (#2301) Wing Lian 2025-01-31 08:58:04 -05:00
  • 5c3ac90669 chore: lint Wing Lian 2025-01-30 17:33:42 -05:00
  • 353ba4e80b set MODAL_IMAGE_BUILDER_VERSION=2024.10 to 2024.10 to test latest builder Wing Lian 2025-01-30 16:58:35 -05:00
  • 84960003ed reset llama_patch_multipack.py Sunny Liu 2025-01-30 14:40:18 -05:00
  • 93a268e43d --no-verify Sunny Liu 2025-01-30 14:08:26 -05:00
  • 065f6d477e flex batching WIP Sunny Liu 2025-01-30 14:04:59 -05:00
  • 51a1c6ea81 Built site for gh-pages Quarto GHA Workflow Runner 2025-01-30 17:50:15 +00:00
  • 6f294c3d8d refactor README; hardcode links to quarto docs; add additional quarto doc pages (#2295) Dan Saunders 2025-01-30 12:49:21 -05:00
  • 96ad741cd5 flex batching WIP Sunny Liu 2025-01-30 12:35:25 -05:00
  • 2eafa6ee7e Built site for gh-pages Quarto GHA Workflow Runner 2025-01-30 16:49:42 +00:00
  • 6f713226dd make save_safetensors: true the default (#2292) Wing Lian 2025-01-30 11:48:48 -05:00
  • d1e3ee9471 Built site for gh-pages Quarto GHA Workflow Runner 2025-01-30 16:47:03 +00:00
  • 1063d82b51 match the cuda version for 2.4.1 build w/o tmux (#2299) Wing Lian 2025-01-30 11:46:09 -05:00
  • ac471a697a updating to fused (#2293) salman 2025-01-30 16:45:56 +00:00
  • 8779997ba5 native support for modal cloud from CLI (#2237) Wing Lian 2025-01-30 11:34:02 -05:00
  • f11227a35a various fixes [kd-trainer-zscore] [kd-trainer] Wing Lian 2025-01-30 10:39:18 -05:00
  • c434951dd6 Always re-normalize teacher distribution Wing Lian 2025-01-29 08:36:40 -05:00
  • ba88bc7840 wip flex block mask creation bursteratom 2025-01-29 00:25:25 -05:00
  • 2e5c8430ff Built site for gh-pages Quarto GHA Workflow Runner 2025-01-29 05:11:13 +00:00
  • 268543a3be Ray Train Axolotl Integration (#2251) Eric Tang 2025-01-28 21:10:19 -08:00
  • 056e1b6315 Built site for gh-pages Quarto GHA Workflow Runner 2025-01-29 05:09:27 +00:00
  • 54dd7abfc1 Process reward models (#2241) salman 2025-01-29 05:08:33 +00:00