Commit Graph

  • 2c66483a47 default to dropping last batch in multipack batch sampler Wing Lian 2025-06-05 16:00:24 -07:00
  • 66c6fb56cb progress on telemetry: config load, process, model load, train start / end, error tracking Dan Saunders 2025-02-19 22:05:12 +00:00
  • 90b39ce112 updates Dan Saunders 2025-02-19 13:55:04 +00:00
  • 5afab46cc6 updates Dan Saunders 2025-02-19 13:54:51 +00:00
  • bd152c6115 adding todo Dan Saunders 2025-02-17 22:25:54 +00:00
  • 76336743ff initial telemetry manager impl Dan Saunders 2025-02-17 18:31:42 +00:00
  • 01382b9a79 fix rebase issues Wing Lian 2025-06-05 15:31:28 -07:00
  • cfcd69df0d rename vars for consistency Wing Lian 2025-06-04 11:58:20 -07:00
  • 2302b14a84 fix to remove attention_mask Wing Lian 2025-06-01 08:15:56 -04:00
  • a8e2bddd19 increase hyperparams_count for gradients for added normalize_topk Wing Lian 2025-05-31 08:42:06 -04:00
  • d55a51623f more KD updates Wing Lian 2025-05-31 08:26:29 -04:00
  • 73a84ad0dd post-rebase lint Wing Lian 2025-05-28 10:35:43 -04:00
  • 3cffe881bb accept compressed responses for smaller wire payload Wing Lian 2025-05-28 10:01:25 -04:00
  • e77d62933d Fix decay Wing Lian 2025-05-28 08:19:52 -04:00
  • 3a0faa97ca fix trainer callback base class Wing Lian 2025-05-28 00:46:52 -04:00
  • 20602fd93f chore: lint Wing Lian 2025-05-27 19:26:31 -04:00
  • 770bb0605a support for dynamic plugin training args mixins and symmetric kl Wing Lian 2025-05-27 17:49:33 -04:00
  • 24b96b1c4f temp scale kd loss at end Wing Lian 2025-05-26 23:52:29 -04:00
  • 90c7228ff9 use max not min Wing Lian 2025-05-26 22:34:07 -04:00
  • 9eb53f5c9e fix length of padding Wing Lian 2025-05-26 22:28:26 -04:00
  • 225b420dc5 shift off the first empty token Wing Lian 2025-05-26 22:23:50 -04:00
  • b75db13615 fix check Wing Lian 2025-05-26 21:32:11 -04:00
  • c7b1db329e logsumexp trick: Wing Lian 2025-05-26 21:12:19 -04:00
  • a40e484803 handle when no custom collator is used in plugins Wing Lian 2025-05-25 18:01:46 -04:00
  • 9899c924f9 support sampling params/max new tokens Wing Lian 2025-05-25 17:00:12 -04:00
  • 505009b454 add close to comment block Wing Lian 2025-05-25 16:53:20 -04:00
  • b4e96ef12c online kd wip Wing Lian 2025-05-25 16:52:20 -04:00
  • a8d9fab635 don't need temp arg to distill method Wing Lian 2025-05-23 10:42:25 -04:00
  • 49e2fa825d additional plugin collator kwargs, don't scale up kd loss by t^2 Wing Lian 2025-05-23 08:35:44 -04:00
  • 7263845207 remove debugging Wing Lian 2025-05-22 15:00:42 -04:00
  • 5ccfd225cb collator cls for plugins Wing Lian 2025-05-22 10:13:44 -04:00
  • 28eb8632a1 more fixes and liger-type chunked loss Wing Lian 2025-05-22 07:58:59 -04:00
  • 5cfaac3767 WIP chunked KD loss with autograd wrapper Wing Lian 2025-05-21 12:24:46 -07:00
  • ca70fb7cb0 simplify and remove zscore Wing Lian 2025-05-20 13:00:19 -07:00
  • 22b50d6619 drop top_k before softmax Wing Lian 2025-05-20 12:43:23 -07:00
  • a2248673d8 kd trainer has kd temp as part of the init Wing Lian 2025-05-20 09:07:13 -07:00
  • 0399aefcb3 better handling to drop string fields for kd with raw dataset Wing Lian 2025-05-20 08:49:23 -07:00
  • 83ad248e5b fix input args Wing Lian 2025-05-20 07:34:41 -07:00
  • 6fafe46562 fix collator setup Wing Lian 2025-05-20 07:33:20 -07:00
  • 0e46367e01 kd fixes Wing Lian 2025-05-19 15:25:15 -07:00
  • 7909bfb076 add manual seed for flaky test_geglu_backward test (#2763) [skip ci] Wing Lian 2025-06-05 09:23:17 -07:00
  • afd6ed125d Built site for gh-pages Quarto GHA Workflow Runner 2025-06-05 14:27:31 +00:00
  • cb03c765a1 add uv tooling for e2e gpu tests (#2750) Wing Lian 2025-06-05 07:25:06 -07:00
  • 4440b4a1ce remove unused field for chat_template.default for DPO training (#2755) [skip ci] Timofey Klyubin 2025-06-05 10:22:58 -04:00
  • e8e45b3441 fix: remove hqq (#2759) [skip ci] NanoCode012 2025-06-05 07:22:23 -07:00
  • c67910fa6f bump hf deps (#2735) [skip ci] Wing Lian 2025-06-05 07:20:33 -07:00
  • aeeaf17fbd Built site for gh-pages Quarto GHA Workflow Runner 2025-06-03 21:29:32 +00:00
  • 787880215b fix(deepspeed): deepspeed config not being set for z3 (#2754) NanoCode012 2025-06-03 14:27:09 -07:00
  • 4b1a29c694 feat(modal): update docker tag to use torch2.6 from torch2.5 (#2749) [skip ci] NanoCode012 2025-06-03 14:26:07 -07:00
  • d7fa60662e feat: add chat_template kwargs (#2694) [skip ci] NanoCode012 2025-06-03 14:25:26 -07:00
  • d8ba272b05 Built site for gh-pages Quarto GHA Workflow Runner 2025-06-03 21:06:39 +00:00
  • 1d91d905c9 remove deprecated wandb env var (#2751) Dan Saunders 2025-06-03 16:04:15 -05:00
  • 9fc08a1f2d Built site for gh-pages Quarto GHA Workflow Runner 2025-06-03 19:33:03 +00:00
  • 2bf61d8e25 fix abbreviation spelling error mhenrhcsen 2025-06-01 22:50:17 +02:00
  • 68788e419e feat: add Group Relative Policy Optimization (GPRO) to RLHF documentation mhenrhcsen 2025-06-01 22:42:03 +02:00
  • a219b16c0b Built site for gh-pages Quarto GHA Workflow Runner 2025-06-02 22:56:51 +00:00
  • 94219f6ee8 chore: update pre-commit hooks (#2745) github-actions[bot] 2025-06-02 15:54:29 -07:00
  • bb1c1ba452 Built site for gh-pages Quarto GHA Workflow Runner 2025-06-02 19:51:17 +00:00
  • ecc719f5c7 add support for base image with uv (#2691) Wing Lian 2025-06-02 12:48:55 -07:00
  • eba183b322 Built site for gh-pages Quarto GHA Workflow Runner 2025-05-31 05:15:59 +00:00
  • d5d0dc5938 fix: suppress non-axolotl logs unless it's warning or higher (#2724) NanoCode012 2025-05-31 12:13:43 +07:00
  • 5e86c35322 fix(log): remove duplicate merge_lora param (#2742) [skip ci] NanoCode012 2025-05-31 12:13:31 +07:00
  • 9304e18f4b Built site for gh-pages Quarto GHA Workflow Runner 2025-05-30 04:24:18 +00:00
  • 6778856804 Fix: RL base feature parity (#2133) NanoCode012 2025-05-30 11:21:47 +07:00
  • dd36fe4391 Built site for gh-pages Quarto GHA Workflow Runner 2025-05-28 20:22:40 +00:00
  • ec4ebfd997 Add a few items to faq (#2734) Wing Lian 2025-05-28 16:20:19 -04:00
  • fbfe2820af Built site for gh-pages Quarto GHA Workflow Runner 2025-05-28 19:02:18 +00:00
  • bde8b5b6bd fix dist state init before deepspeed setup (#2737) Dan Saunders 2025-05-28 14:59:57 -04:00
  • 1211da1da2 Built site for gh-pages Quarto GHA Workflow Runner 2025-05-28 14:06:06 +00:00
  • 2962a398b7 Lora kernels fix (#2732) Dan Saunders 2025-05-28 10:03:43 -04:00
  • 841f9f9b6c Built site for gh-pages Quarto GHA Workflow Runner 2025-05-28 13:59:53 +00:00
  • 65c5481120 Rank 0-only logging (#2608) salman 2025-05-28 14:57:30 +01:00
  • 77fe5c386d Built site for gh-pages Quarto GHA Workflow Runner 2025-05-28 11:38:04 +00:00
  • 5fca214108 QAT (#2590) salman 2025-05-28 12:35:47 +01:00
  • 505e4cba9f Built site for gh-pages Quarto GHA Workflow Runner 2025-05-28 08:53:35 +00:00
  • 20fda75917 feat(doc): add google analytics to docs (#2708) NanoCode012 2025-05-28 15:51:21 +07:00
  • 6b6370f4e3 feat(doc): add info on how to use dapo / dr grpo and misc doc fixes (#2673) [skip ci] NanoCode012 2025-05-28 15:51:04 +07:00
  • add2025253 Fix Mistral chat template (mistral_v7_tekken) (#2710) [skip ci] mashdragon 2025-05-28 08:50:47 +00:00
  • a703560a10 add two checks to handle legacy format interleaved multimodal ds (#2721) [skip ci] artem 2025-05-28 01:49:43 -07:00
  • 4a80d309e8 Add chat templates for command-a and aya-23-8B models (#2731) [skip ci] NOHHYEOB, BAE 2025-05-28 17:49:16 +09:00
  • e33f225434 feat(doc): note lora kernel incompat with RLHF (#2706) [skip ci] NanoCode012 2025-05-28 15:48:40 +07:00
  • 3e6948be97 Fix(doc): clarify data loading for local datasets and splitting samples (#2726) [skip ci] NanoCode012 2025-05-28 15:48:22 +07:00
  • e0b7c7802f Built site for gh-pages Quarto GHA Workflow Runner 2025-05-27 15:47:53 +00:00
  • 4a8af60d34 chore: update pre-commit hooks (#2729) github-actions[bot] 2025-05-27 11:45:31 -04:00
  • a0941a9271 no need to generate diff file (#2728) Dan Saunders 2025-05-27 11:44:06 -04:00
  • 618b008e36 Merge branch 'main' into 775-option-to-drop-vs-truncate-on-rows-longer-than-context-length mhenrichsen 2025-05-27 12:31:31 +02:00
  • 2f39b2acca Built site for gh-pages Quarto GHA Workflow Runner 2025-05-24 01:19:13 +00:00
  • 5eb01f3df1 Fix quarto (#2717) Dan Saunders 2025-05-23 21:16:51 -04:00
  • d27c35ac44 Liger GraniteMoE (#2715) xzuyn 2025-05-23 18:40:43 -04:00
  • a535b68043 update quarto for model loading refactor (#2716) Dan Saunders 2025-05-23 16:28:31 -04:00
  • 30981328fc draft config for devstral (branch: devstral-support) Dan Saunders 2025-05-23 20:04:21 +00:00
  • b5f1e53a0f models.py -> loaders/ module refactor (#2680) Dan Saunders 2025-05-23 15:51:11 -04:00
  • 6dfc0290e8 Built site for gh-pages Quarto GHA Workflow Runner 2025-05-23 16:29:51 +00:00
  • 8cde256db2 Remove unused const (#2714) Dan Saunders 2025-05-23 12:27:38 -04:00
  • 1f5c0d3613 fix graph break for compile sac Wing Lian 2025-05-23 11:50:37 -04:00
  • 3ae0f7c08e make sure torch_compile is enabled with SAC Wing Lian 2025-05-23 11:15:44 -04:00
  • 9f1d548534 don't use zero first context for loading datasets (branch: no-zero-ds-train) Wing Lian 2025-05-23 10:38:32 -04:00
  • 5930c91a12 add support for SAC Wing Lian 2025-05-23 10:33:02 -04:00
  • b7e6d945e9 Built site for gh-pages Quarto GHA Workflow Runner 2025-05-22 15:20:58 +00:00
  • 5f8f817200 SP context manager update (#2699) Dan Saunders 2025-05-22 11:18:32 -04:00