Commit Graph

2703 Commits

Author SHA1 Message Date
Wing Lian
e2f69828d2 [fix][fsdp2] clone sharded param so original full size shard can be gc'ed (#3597) [skip ci] 2026-04-11 20:22:35 -04:00
Wing Lian
122b50bad6 pre-cache the eot token ids rather than on each iteration (#3594) [skip ci] 2026-04-11 20:05:21 -04:00
Wing Lian
e77a185e86 upgrade transformers to use v5.5.3 (#3593) 2026-04-10 17:08:14 -04:00
Wing Lian
29fa4dedbb Gemma4 fixes and profiler (#3591) 2026-04-10 16:46:17 -04:00
Wing Lian
315cdeede9 handle trainable/masked spans in content and reasoning content (#3592) 2026-04-10 14:11:10 -04:00
NanoCode012
e7a6a5b529 fix: move warning after we've set any overrides (#3589) [skip ci] 2026-04-10 13:00:47 -04:00
NanoCode012
bfb4da1d25 fix: document jinja2 file path support (#3588) [skip ci] 2026-04-10 13:00:26 -04:00
floaty3
4dfa0a59b2 Add uninstall command to cut_cross_entropy import message (#3583) [skip ci] 2026-04-10 13:00:07 -04:00
Wing Lian
4ef608dda3 fix ddp/fsdp w gemma4 (#3584)
* fix ddp/fsdp w gemma4

* address pr comments

* activation offloading fix and update agent docs for gemma4
2026-04-09 20:02:36 -07:00
NanoCode012
7daf7d96f1 fix: regex for unfrozen language tower (#3586) [skip ci]
* fix: regex for unfrozen language tower

* fix: other leftover regex
2026-04-08 08:18:11 -07:00
Wing Lian
7c56809c7f use vllm 0.19.0 for torch 2.10.0 (#3582) 2026-04-07 08:09:49 -07:00
NanoCode012
149178ddb7 chore: cleanup post release v0.16 (#3577)
* fix: remove unneeded debug log

* fix: cleanup

* feat: add dense gemma config and cleanup

* feat: add cce support

* update notes and set torch compile

* fix patch for new number of return vals

* fixes for gemma4

* fix packing bug

* use updated cce for mm

* fix: pass in kv cache func when avail for transformers 5.5

* feat: update examples with flex variant and readme

* gemma4 lora attention kernels

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Wing Lian <wing@axolotl.ai>
2026-04-06 10:10:52 -07:00
NanoCode012
dc638e723f fix(config): add cce and liger to nemotron-h example (#3573) [skip ci] 2026-04-06 10:10:25 -07:00
Wing Lian
6f15da4cac make it easier for agents to discover docs (#3579) [skip ci]
* make it easier for agents to discover docs

* fixup pr comments
2026-04-06 10:00:55 -07:00
Maxime
900eec7988 Fix DO_NOT_TRACK not being correctly handled (#3580)
* Fix DO_NOT_TRACK not being correctly handled

* add unit tests and lint

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
2026-04-04 05:16:58 -04:00
Wing Lian
08fc7de87e gemma4 support (#3574)
Some checks failed
ci-cd / build-axolotl (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.11, 2.9.0) (push) Has been cancelled
ci-cd / build-axolotl (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl (<nil>, 128, 12.8.1, true, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-uv (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-uv (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-uv (<nil>, 128, 12.8.1, true, linux/amd64,linux/arm64, 3.12, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-uv (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-uv (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
publish pypi / Create Release (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.11, 2.9.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 128, 12.8.1, true, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud-uv (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud-uv (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud-uv (<nil>, 128, 12.8.1, true, linux/amd64,linux/arm64, 3.12, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud-uv (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud-uv (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud-no-tmux (<nil>, 128, 12.8.1, true, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud-no-tmux (<nil>, 130, 13.0.0, <nil>, 3.11, 2.9.1) (push) Has been cancelled
publish pypi / Upload release to PyPI (push) Has been cancelled
* gemma4 support

* fixes

* chore: lint
v0.16.1
2026-04-02 17:46:46 -04:00
Wing Lian
573726c839 upgrade torchao to 0.17.0 (#3569)
Some checks failed
ci-cd / build-axolotl (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.11, 2.9.0) (push) Has been cancelled
ci-cd / build-axolotl (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl (<nil>, 128, 12.8.1, true, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-uv (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-uv (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-uv (<nil>, 128, 12.8.1, true, linux/amd64,linux/arm64, 3.12, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-uv (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-uv (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
publish pypi / Create Release (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.11, 2.9.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 128, 12.8.1, true, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud-uv (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud-uv (<nil>, 128, 12.8.1, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud-uv (<nil>, 128, 12.8.1, true, linux/amd64,linux/arm64, 3.12, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud-uv (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud-uv (<nil>, 130, 13.0.0, linux/amd64,linux/arm64, 3.12, 2.10.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud-no-tmux (<nil>, 128, 12.8.1, true, 3.11, 2.9.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud-no-tmux (<nil>, 130, 13.0.0, <nil>, 3.11, 2.9.1) (push) Has been cancelled
publish pypi / Upload release to PyPI (push) Has been cancelled
* upgrade to torchao 0.17.0

* upgrade mistral-common too

* chore: lint

* patch fix for torchao low bit optimizers

* fix up

* propagate dtype

* fix test for ao change

* address PR comments
v0.16.0
2026-04-02 10:18:00 -04:00
NanoCode012
842fa039dd feat: add sonicmoe fused lora support (#3519)
* feat: add sonicmoe fused lora support

* fix: forgot to add file

* feat: add test

* feat: add lora support for other routes

* fix: add int8 lora support

* fix: add qwen35_moe interleave support

* fix: qwen3_5_moe loss

* chore: lint

* address some pr comments

* fix test imports

* add support matrix for moe kernels [skip ci]

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
2026-04-02 08:53:48 -04:00
NanoCode012
16e32232fb feat(docs): comprehensive improvement (#3564)
* docs: comprehensive documentation improvements for humans and agents

New human docs:
- grpo.qmd: GRPO deep dive (async, rewards, IS correction, scaling)
- ebft.qmd: EBFT guide (structured/strided modes, feature extraction)
- choosing_method.qmd: decision tree for SFT vs LoRA vs DPO vs GRPO
- vllm_serving.qmd: vLLM setup for GRPO (server/colocate, LoRA sync)
- training_stability.qmd: monitoring, NaN debugging, OOM, healthy metrics

New agent docs:
- AGENTS_SFT.md: agent reference for supervised fine-tuning
- AGENTS_DPO.md: agent reference for preference learning (DPO/KTO/ORPO)

Updated existing docs:
- rlhf.qmd: cross-references to new GRPO/EBFT/choosing-method guides
- getting-started.qmd: reorganized Next Steps with links to new guides
- debugging.qmd: link to training stability guide
- _quarto.yml: added new pages to sidebar navigation

Removed:
- bak.agents.md: stale backup that confused agents

* docs: trim duplicated generic config from AGENTS_DPO.md

Remove boilerplate training params (optimizer, gradient_checkpointing,
flash_attention, etc.) from each method template. These are not
preference-learning-specific and are already covered in AGENTS_SFT.md.
Config templates now show only method-specific fields with a reference
to AGENTS_SFT.md for the rest.

* docs: deduplicate across new doc pages

- grpo.qmd: collapse vLLM setup section to brief config + link to
  vllm_serving.qmd; collapse IS correction to essentials + link;
  replace full monitoring tables with summary + link to
  training_stability.qmd
- vllm_serving.qmd: remove duplicated async/IS config reference tables
  (already in grpo.qmd config reference); replace full example config
  with link to grpo.qmd quick start
- ebft.qmd: trim generic training params in quick start config

* fix: train scripts

* feat: split files into cleaner parts

* fix: cleanup pretraining docs

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2026-04-02 08:01:26 -04:00
Andrew Wu
50e9573f24 Update lm-eval for transformers v5 support (#3571) [skip ci] 2026-04-01 23:25:18 -04:00
Edward Zion Saji
55a7950e3d fix: DPO tool role KeyError (#3217), dataset hash output_dir (#3303), config validators (#3538) [skip ci]
* fix: DPO tool role KeyError, dataset hash output_dir, config validators [skip-e2e]

- Add 'tool' to default role_map_inv in dpo/chat_template.py default() and
  argilla_chat() so datasets with tool-call messages no longer raise
  KeyError: 'tool' (closes #3217)

- Fix generate_dataset_hash_from_config to use canonical tokenizer config +
  overrides content instead of tokenizer.name_or_path when added_tokens_overrides
  is set, preventing cache busting when only output_dir changes (closes #3303)

- Add three Pydantic config validators to AxolotlConfigWCapabilities:
  * save_strategy: 'best' requires metric_for_best_model
  * streaming=True is incompatible with val_set_size > 0
  * lora_target_modules list entries must be valid Python regex patterns

- Tests for all three changes

* review: condense comment in shared.py, swap Mistral model for SmolLM2-135M in test_hash

* chore: lint

* move the validators out of the w/ capabilities schema

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
2026-04-01 19:57:07 -04:00
VED
c92b71bd0c MX QAT patch (#3553)
* qat patch

* tests fixes

* fixup per PR code review

* use state dict hooks to handle dequant for saving safetensors from transformers

* use transformers torch ao quantizer hooks to save mx quantized model

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2026-04-01 18:21:02 -04:00
Wing Lian
6c92b5c31c lazy load trainer classes to prevent unnecesary imports (#3568)
* lazy load trainer classes to prevent unnecesary imports

* make the lazy load a common util
2026-04-01 13:29:04 -04:00
Joaquin Hui
1b1fc917bc Add precompute_ref_log_probs to config schema (#3555) [skip ci]
* Add precompute_ref_log_probs to config schema

* chore: add description for config

* Add test for precompute_ref_log_probs and move to training args

* useing precompute logprobs as the default slows down CI as it has to precompute

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
Co-authored-by: Wing Lian <wing@axolotl.ai>
2026-04-01 13:28:40 -04:00
Mario Župan
96ae8bdd1d Add troubleshooting note for GLM4 GGUF MTP mismatch (#3559) [skip ci]
* Add troubleshooting note for GLM4 GGUF MTP mismatch

* Fix JSON syntax for num_nextn_predict_layers example

* fix: concise

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
2026-04-01 10:05:06 -04:00
github-actions[bot]
438ea7b045 chore: update pre-commit hooks (#3567) [skip ci]
Co-authored-by: SalmanMohammadi <25081738+SalmanMohammadi@users.noreply.github.com>
2026-04-01 10:04:21 -04:00
kallewoof
f6c122b76d allow bf16 flag but warn (#3563) [skip ci]
* allow bf16 flag but warn

Reason: when doing e.g. LoRA merges with CUDA_VISIBLE_DEVICES=, this will unnecessarily crash, even though the LoRA merge operation would have finished successfully. This seems to warrant changing it to a warning instead, as the code will most likely crash later if bf16 is unavailable and training begins anyway.

* don't use deprecated LOG.warn

* update tests to reflect validation change
2026-04-01 09:54:01 -04:00
VED
9e64c76326 qwen3.5 configs (#3554) [skip ci]
* qwen3.5  configs

* update shared experts readme
2026-04-01 09:19:31 -04:00
Wing Lian
5e5603c9aa upgrade transformers to 5.4.0 (#3562)
* upgrade transformers to 5.4.0

* allow fail for tests requiring phi3 tokenizer

* ring-flash-attn skips

* skip tests for now
2026-03-31 19:15:59 -04:00
kallewoof
a4c94416eb bug-fix: only apply patches when CUDA is available (#3561)
* bug-fix: only apply patches when CUDA is available

This will otherwise crash when performing operations with CUDA_VISIBLE_DEVICES=, such as LoRA merging on CPU.

This patch only patches the Qwen 3.5 model, since that's the only one I've tested. This patch should most likely check torch.cuda for all other models as well. One limitation here is that I'm assuming the user runs CUDA, but that assumption is not restricted to this patch so it is probably fine.

* include patch_qwen3_next_modeling_packing, patch_qwen3_5_moe_modeling_packing, and patch_qwen3_5_vlm_flash_attention in cuda guard
2026-03-31 19:05:15 -04:00
Andrew Wu
a81feabbd9 DPO transformers v0.29 fixes (#3560) [skip ci]
* Deperecate dpo_norm_loss

* Rename chosen/rejected_input_ids to chosen/rejected_ids to match TRL https://github.com/huggingface/trl/pull/5179

* Remove deprecated rpo_alpha

* Remove dead_code tokenize_row

* Add _tokenize override to prevent double bos token on Llama DPO

* Fix DPO loss type now list not string

* Linting fix

* PR fixes

* update _tokenize override for DPO for multimodal
2026-03-31 19:04:53 -04:00
VED
bb622b83de super nemo support (#3508)
* nemo support

* config

* rename , config

* nemotron packing

* config fix

* read me + configs

* gc compat bug

* config chnages for qwen  and pad token nemo

* patch nemotron_h  weight renaming so it doesn't get reversed to embedding (singular noun) on checkpoint save

* lint

* revert qwen3.5 config changes, not needed in this pr

* lint

* Update examples/nemotron-h/120b-a12b-qlora.yaml

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

* Update examples/nemotron-h/nano-30b-a3b-qlora.yaml

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

* readme + validation

* lazy load comment

* Update examples/nemotron-h/120b-a12b-qlora.yaml

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

* val fix

* add nemo to multi packing

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2026-03-30 18:12:50 -04:00
Wing Lian
00dee05fc6 support flattening/packing for GRPO (#3552)
* support flattening/packing for GRPO

* more flattening

* fix tests

* improve dead vllm handling

* refactor out process handling for vllm serve and move bench flattening tests to gpu tests

* add validation for flattening with liger

* isolate batch flattening test

* flaky test
2026-03-28 13:15:54 -04:00
Wing Lian
99bde0124c deprecate torch 2.8.0 support (#3550)
* deprecate torch 2.8.0 support

* shell lint

* odd naming of manylinux wheels for x86
2026-03-25 18:22:47 -04:00
Wing Lian
5191e4eb53 More minor RL fixes (#3551)
* fix: handle get_open_port import across TRL versions

TRL 0.29+ removed get_open_port from exports; fall back to importing
directly from vllm.utils or vllm.utils.network_utils.

* support DP with vllm and make generation_batch_size confifurable
2026-03-25 18:17:49 -04:00
Wing Lian
74b959e035 dispatch scored rollouts to plugins, extend path for external plugins, better handle errors with vllm /reset_prefix_cache (#3549)
* dispatch scored rollouts to plugins, extend path for external plugins, better handle errors with vllm /reset_prefix_cache

* address PR comments, lint
2026-03-25 11:19:15 -04:00
VED
b55706b9f6 feat:merge-lora iterate through bins without loading (#3095)
* merge_method added

* merge_efficient core implement

* Update src/axolotl/cli/merge_lora.py

Co-authored-by: Wing Lian <wing.lian@gmail.com>

* Update src/axolotl/utils/lora_merge_efficient.py

Co-authored-by: Wing Lian <wing.lian@gmail.com>

* standard to leagcy + rstrip + try/except for do_merge_lora_efficient(cfg=cfg)

* fix: 'dict' object has no attribute 'lora_alpha'

* into -> debug

* lint

* lint2

* moved everythign to cpu + peformance improvments

* lint

* Update src/axolotl/cli/merge_lora.py

Co-authored-by: Dan Saunders <danjsaund@gmail.com>

* Update src/axolotl/cli/merge_lora.py

Co-authored-by: Dan Saunders <danjsaund@gmail.com>

* string handeling +  try except remove

* merge_method -> merge_lora_methods

* remove duplicate cal + safetensor + move to lora_merge.py

* lint

* handle quant-dequant, handle experts

* fix parameter merging and prefer peft's native merge logic per module

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Dan Saunders <danjsaund@gmail.com>
2026-03-25 08:41:32 -04:00
Avaya Aggarwal
ff0f67c730 feat: add custom routing support for ernie4_5_moe, and hunyuan_v1_moe (#3526)
* feat: add Ernie 4.5 and subsequently custom routing support

* Update routing.py

* chore: lint

* fix minor nits

* removed deepseek v2

* remove unneeded change

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
2026-03-25 08:40:31 -04:00
Matthew Hambrecht
678ebb1bb2 Fix Ray train crashing after succeeding (#3542) [skip ci] 2026-03-25 07:38:28 -04:00
Wing Lian
c2bd75aff6 Nemo gym integration (#3516) [skip ci]
* nemo gym integration with grpo wip

* mostly working

* cleanup

* simplify

* update docs

* nemo gym support wip

* cleanup

* chore: lint

* address PR review and add more tests

* chore: lint

* post merge lora fixes for CI (#3536) [skip ci]

* post merge lora fixes for CI

* handle lora kernel auto-enable for moe without grouped_mm

* prefer not to import torch in schema validation

* address pr comments, add timeout, add tests

* roundup_power2_divisions not needed with newer pytorch versions (#3540)

* roundup_power2_divisions not needed with newer pytorch versions

* remove typo

* update qwen3.5 moe 35b-a3b yaml for 5090

* more bug fixes

* fix tests to match updated trainer

* don't use fa2 for hooks test

* reset plugins on the instance

* retry download

* fix references to renamed axolotl_cfg property on trainer

* Fix ref to trainer cfg

* fix: robust handling of race condition on patching check (#3543) [skip ci]

* EBFT: Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models (#3527) [skip ci]

* EBFT wip

* fixes

* more fixeS

* add missing strided module

* ebft fixes for multi-turn

* make ebft work with async

* add example for ebft w qwen3.5

* fix for split thinking and update yaml for lora over linear attention only

* enforce_eager for vllm arg in schema

* fix sync weights

* fix multi-gpu

* handle updated sig for mm

* ddp fixes

* improve multi-gpu handling, don't calculate logits, adaptive completion length

* chore: lint

* chore: lint

* support completion_mean

* Address corereview feedback

* clamp min IS ratio

* Address PR code review

* more fixes identified

* address code review

* Fix property from rebase conflict

* fix for ebft sync and update docs

* make trainer loss patch check a solo test

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 07:38:06 -04:00
NanoCode012
2fb72798e0 Revert "feat: move to uv first" (#3544)
This reverts commit 1f1ebb8237.
2026-03-25 16:12:36 +07:00
NanoCode012
1f1ebb8237 feat: move to uv first 2026-03-25 16:06:37 +07:00
Wing Lian
c50c4acbf4 EBFT: Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models (#3527) [skip ci]
* EBFT wip

* fixes

* more fixeS

* add missing strided module

* ebft fixes for multi-turn

* make ebft work with async

* add example for ebft w qwen3.5

* fix for split thinking and update yaml for lora over linear attention only

* enforce_eager for vllm arg in schema

* fix sync weights

* fix multi-gpu

* handle updated sig for mm

* ddp fixes

* improve multi-gpu handling, don't calculate logits, adaptive completion length

* chore: lint

* chore: lint

* support completion_mean

* Address corereview feedback

* clamp min IS ratio

* Address PR code review

* more fixes identified

* address code review

* Fix property from rebase conflict
2026-03-24 18:43:46 -04:00
Wing Lian
e9883c91d4 fix: robust handling of race condition on patching check (#3543) [skip ci] 2026-03-24 16:43:43 -04:00
Wing Lian
e412370877 roundup_power2_divisions not needed with newer pytorch versions (#3540)
* roundup_power2_divisions not needed with newer pytorch versions

* remove typo

* update qwen3.5 moe 35b-a3b yaml for 5090

* more bug fixes

* fix tests to match updated trainer

* don't use fa2 for hooks test

* reset plugins on the instance

* retry download

* fix references to renamed axolotl_cfg property on trainer

* Fix ref to trainer cfg
2026-03-24 15:40:05 -04:00
Wing Lian
86be9f329e post merge lora fixes for CI (#3536) [skip ci]
* post merge lora fixes for CI

* handle lora kernel auto-enable for moe without grouped_mm

* prefer not to import torch in schema validation
2026-03-23 02:26:10 -04:00
Wing Lian
0e583efeaa increase rtol, codecov informational only, don't silently fail errors w curl (#3534) [skip ci] 2026-03-22 13:54:03 -04:00
Wing Lian
b3289fd190 feat: LoRA kernel support for bias, dropout, dora, embeddings (#3528) [skip ci]
* feat: LoRA kernel support for bias, dropout, dora, embeddings

* chore: lint

* chore: lint

* address PR feedback, add regression tests, add fsdp2 tests for lora kernels

* update tests for new sigs

* update tests now that bias and dropout are supported
2026-03-22 13:53:19 -04:00
Wing Lian
a67392c427 liger support for qwen 3.5 and fused rmsnorm+gated (#3531) [skip ci]
* liger support for qwen 3.5 and fused rmsnorm+gated

* support for qwen 3.5 moe

* fix version ref

* fixups for PR code review
2026-03-22 13:19:21 -04:00
Wing Lian
5b2e3f00ce fix: handle connection errors when checking user whoami (#3529) 2026-03-22 09:11:17 -04:00