45adf1bfb9
get_logger use_environ fix (#2808)
Dan Saunders
2025-06-19 11:16:52 -04:00
0047516e46
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-18 20:04:40 +00:00
f3c8a25b30
Merge branch 'main' into codecov-pulls-only
codecov-pulls-only
Dan Saunders
2025-06-18 16:00:37 -04:00
eb3a57eb17
Ignore generation/endgeneration tags when analyzing Jinja chat template (#2787)
Carsten Kragelund Jørgensen
2025-06-18 21:59:07 +02:00
37732063ea
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-18 19:52:00 +00:00
34da391391
Set dev version (#2807) [skip ci]
Wing Lian
2025-06-18 15:49:05 -04:00
88d1832ff5
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-18 19:48:08 +00:00
0bb9077553
Fix: logging on py310 (#2802)
NanoCode012
2025-06-19 02:46:27 +07:00
a85efffbef
bump transformers==4.52.4 (#2800) [skip ci]
Wing Lian
2025-06-18 15:46:14 -04:00
06a648263b
Config doc autogen: follow-up fix docs build (#2806)
Dan Saunders
2025-06-18 15:42:54 -04:00
9d5bfc127e
Config doc autogen (#2718)
Dan Saunders
2025-06-18 15:36:53 -04:00
076b6e1e24
Revamp README.md with a fresh layout and enhanced content, including a new introduction, improved visual elements, and detailed sections on features, quick start guide, and comprehensive documentation. This update aims to create a more engaging and informative experience for users, highlighting Axolotl's capabilities in LLM fine-tuning.
feat/beautiful-readme
mhenrhcsen
2025-06-18 14:18:46 +02:00
b9db0cad1d
Enhance README.md with updated layout and content, including a new introduction, improved visual elements, and detailed sections on latest updates, features, quick start guide, and documentation. This update aims to provide a more engaging and informative experience for users.
mhenrhcsen
2025-06-18 14:06:14 +02:00
b1b81d070b
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-17 22:11:38 +00:00
da8f6c32b9
update favicon (#2801)
Wing Lian
2025-06-17 18:09:24 -04:00
016eb8055f
accidental file
Dan Saunders
2025-06-17 13:58:02 -04:00
639ddeff6a
return codecov artifact from modal image
Dan Saunders
2025-06-17 13:33:02 -04:00
8ad79c2ce4
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-17 16:15:45 +00:00
88c0e8d048
release tag (#2799)
v0.10.0
Wing Lian
2025-06-17 12:13:27 -04:00
2fa8566333
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-17 16:11:55 +00:00
d8e8cd8558
feat: remove evalfirst callback with built-in trainer arg (#2797)
NanoCode012
2025-06-17 09:09:33 -07:00
ccc94da8ad
KD fix w/ online distillation (#2700) [skip ci]
Wing Lian
2025-06-17 12:09:13 -04:00
753e4e3dec
updates
Dan Saunders
2025-05-14 22:46:59 +00:00
2538c3b761
update to run only if succeeded
Dan Saunders
2025-05-13 14:31:16 +00:00
aa3639b7ad
run codecov action at end of CI; only_pulls: true
Dan Saunders
2025-05-12 17:14:42 +00:00
cbcc795bb3
commenting out unused
sdpa-cp
Dan Saunders
2025-06-16 01:53:13 +00:00
e34b6f4dfe
temp: trying another approach
Dan Saunders
2025-06-15 21:32:10 +00:00
75c3f17b3a
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-15 20:49:17 +00:00
ba62aa65ee
fixed the lora_target_modules syntax (#2793)
Matt Cummins
2025-06-15 13:47:02 -07:00
5b66b8e86c
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-14 18:56:28 +00:00
21388cf615
Fix: lora kernel pre-patch applied despite post-patch not applied (#2772)
NanoCode012
2025-06-14 11:54:06 -07:00
80d5b066ec
Fix: adding magistral fsdp config, fixing not eval with test_datasets, handle mllama attention (#2789) [skip ci]
NanoCode012
2025-06-14 11:53:43 -07:00
f8f87321bd
progress
Dan Saunders
2025-06-14 17:40:21 +00:00
a3c82e8cbb
fix: grpo doc link (#2788) [skip ci]
NanoCode012
2025-06-13 12:03:47 -07:00
84db47f3c0
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-13 14:02:59 +00:00
b2274d430b
support for QAT w RL (DPO) (#2776)
Wing Lian
2025-06-13 10:00:35 -04:00
7a88de4fa8
finish basic impl; change naming from SP -> CP to match torch
Dan Saunders
2025-06-13 09:51:06 -04:00
7cd59362e8
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-12 23:20:55 +00:00
eac4a61f55
Feat: Add Magistral and mistral-common tokenizer support (#2780)
NanoCode012
2025-06-12 16:18:33 -07:00
ace9287c96
update loss value for flakey e2e test (#2786) [skip ci]
Wing Lian
2025-06-12 18:06:14 -04:00
aced809989
progress (messy :O)
Dan Saunders
2025-06-12 18:54:41 +00:00
eac3a4860e
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-12 17:25:50 +00:00
f5fbc82f2b
Fix logging import in evaluate.py (#2782) (#2783)
JZacaroli
2025-06-12 18:23:31 +01:00
706c677cad
feat(doc): update readme to include changelog and remove matrix (#2775) [skip ci]
NanoCode012
2025-06-12 10:23:18 -07:00
468580d18e
limit multipack sampler processes (#2771) [skip ci]
Wing Lian
2025-06-12 13:22:58 -04:00
3634d8ff9d
QAT docfix (#2778) [skip ci]
salman
2025-06-12 10:22:40 -07:00
bcc108efc1
build 2.7.1 images too (#2784) [skip ci]
Wing Lian
2025-06-12 13:22:20 -04:00
f465e840cc
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-11 21:13:29 +00:00
581dd324cc
build base images for torch 2.7.1 (#2764)
Wing Lian
2025-06-11 17:11:06 -04:00
89d7105f8f
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-10 23:55:31 +00:00
00cda8cc70
Data loader refactor (#2707)
Dan Saunders
2025-06-10 19:53:07 -04:00
15858cd29a
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-10 17:06:07 +00:00
52a0452acb
magistral small placeholder (#2777)
Dan Saunders
2025-06-10 10:03:41 -07:00
a6056e35de
enable torch compile on the optimizer step
optimizer-compile
Wing Lian
2025-04-23 16:19:25 -04:00
e8e07e15d8
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-10 04:44:24 +00:00
83632f71d8
Feat: add tool calling support via tools column (#2774)
NanoCode012
2025-06-09 21:42:05 -07:00
92afa4fa27
Fix the bug of position ids padding (#2739) [skip ci]
Qingyang Wu
2025-06-09 21:26:36 -07:00
dd660c2ed0
handle when unable to save optimizer state when using ao optimizer with FSDP (#2773) [skip ci]
Wing Lian
2025-06-09 21:26:14 -07:00
4f39aeefb9
debug
mistral-support
Dan Saunders
2025-06-09 20:38:46 +00:00
8f75136ad3
debug
Dan Saunders
2025-06-09 20:38:13 +00:00
70e9cb545d
update mistral dep version
Dan Saunders
2025-06-09 18:01:40 +00:00
aa236a4669
use from_hf_hub
Dan Saunders
2025-06-09 01:42:48 +00:00
65f8988efd
small changes
Dan Saunders
2025-06-05 22:36:46 +00:00
13ddb8f172
Simplify mistral tokenizer identification (depends on upstream PR)
Dan Saunders
2025-06-05 07:00:50 +00:00
b1570ed0fa
update
Dan Saunders
2025-05-29 20:04:35 +00:00
9581a9efed
refactor tokenizer loader + add mistral logic
Dan Saunders
2025-05-28 23:21:28 +00:00
7e44445494
add mistral-common dep
Dan Saunders
2025-05-27 17:07:22 +00:00
5c8a0d0f82
Built site for gh-pages
Quarto GHA Workflow Runner
2025-06-09 06:16:30 +00:00
09c685fd2c
fix worker_init_fn signature handling (#2769)
Wing Lian
2025-06-08 23:14:10 -07:00
ae73123eae
progress; move validation to pydantic model config
Dan Saunders
2025-06-07 06:58:59 +00:00
2491303c46
improve handling of train len
kd-fix-20250519-v2
Wing Lian
2025-06-06 22:07:29 -07:00
345a159796
coderabbit comments
telemetry-opt-in
Dan Saunders
2025-06-07 04:50:29 +00:00
10d1e44943
SDPA context parallel
Dan Saunders
2025-06-06 00:34:12 +00:00
657bffd85f
update posthog dep
Dan Saunders
2025-06-05 23:46:20 +00:00
f0dde8e2d5
lint
Dan Saunders
2025-06-05 23:41:46 +00:00
25fa4df70f
fix
Dan Saunders
2025-03-05 16:41:24 +00:00
e735f4270b
slight changes
Dan Saunders
2025-03-05 16:25:49 +00:00
035e7a2f4c
simplifying
Dan Saunders
2025-02-28 10:19:40 -05:00
2d36c11264
minor fixes
Dan Saunders
2025-02-27 15:37:00 -05:00
b8ec5bdccf
doc update
Dan Saunders
2025-02-26 20:40:12 +00:00
249405b46e
docs fix
Dan Saunders
2025-02-26 17:56:46 +00:00
d3be84fec2
enable / disable logic update
Dan Saunders
2025-02-26 17:52:53 +00:00
1c74ab175f
opt-in version of telemetry
Dan Saunders
2025-02-26 11:13:38 -05:00
b2f1fc109a
distributed fix
Dan Saunders
2025-02-26 02:55:44 +00:00
5a2a80cc48
fix issue with tests in ci
Dan Saunders
2025-02-24 21:30:34 +00:00
4033fe74f8
fixes
Dan Saunders
2025-02-24 20:30:16 +00:00
e9df4444be
remove duplicate info
Dan Saunders
2025-02-24 20:02:16 +00:00
ffd2985750
adding runtime metrics / system info additional accelerator support, etc.
Dan Saunders
2025-02-24 19:37:11 +00:00
17310f9acc
adding runtime metrics / system info additional accelerator support, etc.
Dan Saunders
2025-02-24 19:36:31 +00:00
71ae6f9f87
improved redaction, send system info during model config load telemetry, etc.
Dan Saunders
2025-02-24 15:39:02 +00:00
9dd1092f8f
doc update
Dan Saunders
2025-02-24 01:49:31 +00:00
2c2f2647a9
fix
Dan Saunders
2025-02-24 01:31:35 +00:00
98313a6b3f
adding back in base_model redaction w/ whitelist
Dan Saunders
2025-02-24 01:16:03 +00:00
8b75205d3b
sleep on all ranks in distributed setting
Dan Saunders
2025-02-24 00:53:58 +00:00
ef4990f304
simplifying path redaction
Dan Saunders
2025-02-24 00:06:08 +00:00
db3297b090
small update / fix
Dan Saunders
2025-02-21 20:35:09 +00:00
86ed554bda
tests for runtime metrics telemetry and assoc. callback
Dan Saunders
2025-02-21 20:31:07 +00:00
f254d7d5a2
adding runtime metrics (cpu + gpu memory, steps/s, etc.)
Dan Saunders
2025-02-21 19:01:35 +00:00
d8b0522ea0
updated sanitization logic, tests
Dan Saunders
2025-02-24 20:05:55 +00:00
1edd6b9524
update error file path sanitization function; adding more error tracking
Dan Saunders
2025-02-21 13:57:08 +00:00