Commit Graph

  • 45adf1bfb9 get_logger use_environ fix (#2808) Dan Saunders 2025-06-19 11:16:52 -04:00
  • 0047516e46 Built site for gh-pages Quarto GHA Workflow Runner 2025-06-18 20:04:40 +00:00
  • f3c8a25b30 Merge branch 'main' into codecov-pulls-only codecov-pulls-only Dan Saunders 2025-06-18 16:00:37 -04:00
  • eb3a57eb17 Ignore generation/endgeneration tags when analyzing Jinja chat template (#2787) Carsten Kragelund Jørgensen 2025-06-18 21:59:07 +02:00
  • 37732063ea Built site for gh-pages Quarto GHA Workflow Runner 2025-06-18 19:52:00 +00:00
  • 34da391391 Set dev version (#2807) [skip ci] Wing Lian 2025-06-18 15:49:05 -04:00
  • 88d1832ff5 Built site for gh-pages Quarto GHA Workflow Runner 2025-06-18 19:48:08 +00:00
  • 0bb9077553 Fix: logging on py310 (#2802) NanoCode012 2025-06-19 02:46:27 +07:00
  • a85efffbef bump transformers==4.52.4 (#2800) [skip ci] Wing Lian 2025-06-18 15:46:14 -04:00
  • 06a648263b Config doc autogen: follow-up fix docs build (#2806) Dan Saunders 2025-06-18 15:42:54 -04:00
  • 9d5bfc127e Config doc autogen (#2718) Dan Saunders 2025-06-18 15:36:53 -04:00
  • 076b6e1e24 Revamp README.md with a fresh layout and enhanced content, including a new introduction, improved visual elements, and detailed sections on features, quick start guide, and comprehensive documentation. This update aims to create a more engaging and informative experience for users, highlighting Axolotl's capabilities in LLM fine-tuning. feat/beautiful-readme mhenrhcsen 2025-06-18 14:18:46 +02:00
  • b9db0cad1d Enhance README.md with updated layout and content, including a new introduction, improved visual elements, and detailed sections on latest updates, features, quick start guide, and documentation. This update aims to provide a more engaging and informative experience for users. mhenrhcsen 2025-06-18 14:06:14 +02:00
  • b1b81d070b Built site for gh-pages Quarto GHA Workflow Runner 2025-06-17 22:11:38 +00:00
  • da8f6c32b9 update favicon (#2801) Wing Lian 2025-06-17 18:09:24 -04:00
  • 016eb8055f accidental file Dan Saunders 2025-06-17 13:58:02 -04:00
  • 639ddeff6a return codecov artifact from modal image Dan Saunders 2025-06-17 13:33:02 -04:00
  • 8ad79c2ce4 Built site for gh-pages Quarto GHA Workflow Runner 2025-06-17 16:15:45 +00:00
  • 88c0e8d048 release tag (#2799) v0.10.0 Wing Lian 2025-06-17 12:13:27 -04:00
  • 2fa8566333 Built site for gh-pages Quarto GHA Workflow Runner 2025-06-17 16:11:55 +00:00
  • d8e8cd8558 feat: remove evalfirst callback with built-in trainer arg (#2797) NanoCode012 2025-06-17 09:09:33 -07:00
  • ccc94da8ad KD fix w/ online distillation (#2700) [skip ci] Wing Lian 2025-06-17 12:09:13 -04:00
  • 753e4e3dec updates Dan Saunders 2025-05-14 22:46:59 +00:00
  • 2538c3b761 update to run only if succeeded Dan Saunders 2025-05-13 14:31:16 +00:00
  • aa3639b7ad run codecov action at end of CI; only_pulls: true Dan Saunders 2025-05-12 17:14:42 +00:00
  • cbcc795bb3 commenting out unused sdpa-cp Dan Saunders 2025-06-16 01:53:13 +00:00
  • e34b6f4dfe temp: trying another approach Dan Saunders 2025-06-15 21:32:10 +00:00
  • 75c3f17b3a Built site for gh-pages Quarto GHA Workflow Runner 2025-06-15 20:49:17 +00:00
  • ba62aa65ee fixed the lora_target_modules syntax (#2793) Matt Cummins 2025-06-15 13:47:02 -07:00
  • 5b66b8e86c Built site for gh-pages Quarto GHA Workflow Runner 2025-06-14 18:56:28 +00:00
  • 21388cf615 Fix: lora kernel pre-patch applied despite post-patch not applied (#2772) NanoCode012 2025-06-14 11:54:06 -07:00
  • 80d5b066ec Fix: adding magistral fsdp config, fixing not eval with test_datasets, handle mllama attention (#2789) [skip ci] NanoCode012 2025-06-14 11:53:43 -07:00
  • f8f87321bd progress Dan Saunders 2025-06-14 17:40:21 +00:00
  • a3c82e8cbb fix: grpo doc link (#2788) [skip ci] NanoCode012 2025-06-13 12:03:47 -07:00
  • 84db47f3c0 Built site for gh-pages Quarto GHA Workflow Runner 2025-06-13 14:02:59 +00:00
  • b2274d430b support for QAT w RL (DPO) (#2776) Wing Lian 2025-06-13 10:00:35 -04:00
  • 7a88de4fa8 finish basic impl; change naming from SP -> CP to match torch Dan Saunders 2025-06-13 09:51:06 -04:00
  • 7cd59362e8 Built site for gh-pages Quarto GHA Workflow Runner 2025-06-12 23:20:55 +00:00
  • eac4a61f55 Feat: Add Magistral and mistral-common tokenizer support (#2780) NanoCode012 2025-06-12 16:18:33 -07:00
  • ace9287c96 update loss value for flakey e2e test (#2786) [skip ci] Wing Lian 2025-06-12 18:06:14 -04:00
  • aced809989 progress (messy :O) Dan Saunders 2025-06-12 18:54:41 +00:00
  • eac3a4860e Built site for gh-pages Quarto GHA Workflow Runner 2025-06-12 17:25:50 +00:00
  • f5fbc82f2b Fix logging import in evaluate.py (#2782) (#2783) JZacaroli 2025-06-12 18:23:31 +01:00
  • 706c677cad feat(doc): update readme to include changelog and remove matrix (#2775) [skip ci] NanoCode012 2025-06-12 10:23:18 -07:00
  • 468580d18e limit multipack sampler processes (#2771) [skip ci] Wing Lian 2025-06-12 13:22:58 -04:00
  • 3634d8ff9d QAT docfix (#2778) [skip ci] salman 2025-06-12 10:22:40 -07:00
  • bcc108efc1 build 2.7.1 images too (#2784) [skip ci] Wing Lian 2025-06-12 13:22:20 -04:00
  • f465e840cc Built site for gh-pages Quarto GHA Workflow Runner 2025-06-11 21:13:29 +00:00
  • 581dd324cc build base images for torch 2.7.1 (#2764) Wing Lian 2025-06-11 17:11:06 -04:00
  • 89d7105f8f Built site for gh-pages Quarto GHA Workflow Runner 2025-06-10 23:55:31 +00:00
  • 00cda8cc70 Data loader refactor (#2707) Dan Saunders 2025-06-10 19:53:07 -04:00
  • 15858cd29a Built site for gh-pages Quarto GHA Workflow Runner 2025-06-10 17:06:07 +00:00
  • 52a0452acb magistral small placeholder (#2777) Dan Saunders 2025-06-10 10:03:41 -07:00
  • a6056e35de enable torch compile on the optimizer step optimizer-compile Wing Lian 2025-04-23 16:19:25 -04:00
  • e8e07e15d8 Built site for gh-pages Quarto GHA Workflow Runner 2025-06-10 04:44:24 +00:00
  • 83632f71d8 Feat: add tool calling support via tools column (#2774) NanoCode012 2025-06-09 21:42:05 -07:00
  • 92afa4fa27 Fix the bug of position ids padding (#2739) [skip ci] Qingyang Wu 2025-06-09 21:26:36 -07:00
  • dd660c2ed0 handle when unable to save optimizer state when using ao optimizer with FSDP (#2773) [skip ci] Wing Lian 2025-06-09 21:26:14 -07:00
  • 4f39aeefb9 debug mistral-support Dan Saunders 2025-06-09 20:38:46 +00:00
  • 8f75136ad3 debug Dan Saunders 2025-06-09 20:38:13 +00:00
  • 70e9cb545d update mistral dep version Dan Saunders 2025-06-09 18:01:40 +00:00
  • aa236a4669 use from_hf_hub Dan Saunders 2025-06-09 01:42:48 +00:00
  • 65f8988efd small changes Dan Saunders 2025-06-05 22:36:46 +00:00
  • 13ddb8f172 Simplify mistral tokenizer identification (depends on upstream PR) Dan Saunders 2025-06-05 07:00:50 +00:00
  • b1570ed0fa update Dan Saunders 2025-05-29 20:04:35 +00:00
  • 9581a9efed refactor tokenizer loader + add mistral logic Dan Saunders 2025-05-28 23:21:28 +00:00
  • 7e44445494 add mistral-common dep Dan Saunders 2025-05-27 17:07:22 +00:00
  • 5c8a0d0f82 Built site for gh-pages Quarto GHA Workflow Runner 2025-06-09 06:16:30 +00:00
  • 09c685fd2c fix worker_init_fn signature handling (#2769) Wing Lian 2025-06-08 23:14:10 -07:00
  • ae73123eae progress; move validation to pydantic model config Dan Saunders 2025-06-07 06:58:59 +00:00
  • 2491303c46 improve handling of train len kd-fix-20250519-v2 Wing Lian 2025-06-06 22:07:29 -07:00
  • 345a159796 coderabbit comments telemetry-opt-in Dan Saunders 2025-06-07 04:50:29 +00:00
  • 10d1e44943 SDPA context parallel Dan Saunders 2025-06-06 00:34:12 +00:00
  • 657bffd85f update posthog dep Dan Saunders 2025-06-05 23:46:20 +00:00
  • f0dde8e2d5 lint Dan Saunders 2025-06-05 23:41:46 +00:00
  • 25fa4df70f fix Dan Saunders 2025-03-05 16:41:24 +00:00
  • e735f4270b slight changes Dan Saunders 2025-03-05 16:25:49 +00:00
  • 035e7a2f4c simplifying Dan Saunders 2025-02-28 10:19:40 -05:00
  • 2d36c11264 minor fixes Dan Saunders 2025-02-27 15:37:00 -05:00
  • b8ec5bdccf doc update Dan Saunders 2025-02-26 20:40:12 +00:00
  • 249405b46e docs fix Dan Saunders 2025-02-26 17:56:46 +00:00
  • d3be84fec2 enable / disable logic update Dan Saunders 2025-02-26 17:52:53 +00:00
  • 1c74ab175f opt-in version of telemetry Dan Saunders 2025-02-26 11:13:38 -05:00
  • b2f1fc109a distributed fix Dan Saunders 2025-02-26 02:55:44 +00:00
  • 5a2a80cc48 fix issue with tests in ci Dan Saunders 2025-02-24 21:30:34 +00:00
  • 4033fe74f8 fixes Dan Saunders 2025-02-24 20:30:16 +00:00
  • e9df4444be remove duplicate info Dan Saunders 2025-02-24 20:02:16 +00:00
  • ffd2985750 adding runtime metrics / system info additional accelerator support, etc. Dan Saunders 2025-02-24 19:37:11 +00:00
  • 17310f9acc adding runtime metrics / system info additional accelerator support, etc. Dan Saunders 2025-02-24 19:36:31 +00:00
  • 71ae6f9f87 improved redaction, send system info during model config load telemetry, etc. Dan Saunders 2025-02-24 15:39:02 +00:00
  • 9dd1092f8f doc update Dan Saunders 2025-02-24 01:49:31 +00:00
  • 2c2f2647a9 fix Dan Saunders 2025-02-24 01:31:35 +00:00
  • 98313a6b3f adding back in base_model redaction w/ whitelist Dan Saunders 2025-02-24 01:16:03 +00:00
  • 8b75205d3b sleep on all ranks in distributed setting Dan Saunders 2025-02-24 00:53:58 +00:00
  • ef4990f304 simplifying path redaction Dan Saunders 2025-02-24 00:06:08 +00:00
  • db3297b090 small update / fix Dan Saunders 2025-02-21 20:35:09 +00:00
  • 86ed554bda tests for runtime metrics telemetry and assoc. callback Dan Saunders 2025-02-21 20:31:07 +00:00
  • f254d7d5a2 adding runtime metrics (cpu + gpu memory, steps/s, etc.) Dan Saunders 2025-02-21 19:01:35 +00:00
  • d8b0522ea0 updated sanitization logic, tests Dan Saunders 2025-02-24 20:05:55 +00:00
  • 1edd6b9524 update error file path sanitization function; adding more error tracking Dan Saunders 2025-02-21 13:57:08 +00:00