axolotl

Author	SHA1	Message	Date
Wing Lian	0102ca5943	fix cfg merge	2025-07-06 19:11:46 -04:00
Wing Lian	97e8c01a70	tweak losses	2025-07-06 18:55:16 -04:00
Wing Lian	5c4705b185	unset fa	2025-07-06 13:27:55 -04:00
Wing Lian	47a88da330	set mbsz and revert non-packed test	2025-07-06 12:27:25 -04:00
Wing Lian	07ab737a55	set tokenizer_config in fixture	2025-07-06 12:24:21 -04:00
Wing Lian	c40da3b5eb	use shared fixture for preprocessed alpaca dataset	2025-07-06 11:44:31 -04:00
Dan Saunders	35fdbce102	Ensure device mesh patching is applied (#2842 ) * move patches; make patch stronger * fix broken tests * guard sequence_parallel_degree comparison against none --------- Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-06-29 22:16:32 -04:00
Wing Lian	81893c775c	Accelerate 1.8.1 and BNB 0.46.0 update (#2815 ) * update accelerate to v1.8.0 * update bnb also * fix multigpu ci timeout * fix test set size * use latest accelerate 1.8.1 * disable default dtype	2025-06-28 15:29:19 -04:00
Wing Lian	a85efffbef	bump transformers==4.52.4 (#2800 ) [skip ci] * bump transformers==4.52.4 * don't use hf offline for qwen tokenizer * increase timeout * don't use methodtype * increase timeout * better assertion logging * upgrade deepspeed version too	2025-06-18 15:46:14 -04:00
Wing Lian	ccc94da8ad	KD fix w/ online distillation (#2700 ) [skip ci] * kd fixes * fix collator setup * fix input args * better handling to drop string fields for kd with raw dataset * kd trainer has kd temp as part of the init * drop top_k before softmax * simplfy and remove zscore * WIP chunked KD loss with autograd wrapper * more fixes and liger-type chunked loss * collator cls for plugins * remove debugging * additional plugin collator kwargs, don't scale up kd loss by t^2 * don't need temp arg to distill method * online kd wip * add close to comment block * suport sampling params/max new tokens * handle when no custom collator is used in plugins * logsumexp trick: * fix check * shift off the first empty token * fix length of padding * use max not min * temp scale kd loss at end * support for dynamic plugin training args mixins and symmetric kl * chore: lint * fix trainer callback base class * Fix decay * accept compressed responses for smaller wire payload * post-rebase lint * more KD updates * increase hyperparams_count for gradients for added normalize_topk * fix to remove attention_mask * rename vars for consistency * fix rebase issues * default to dropping last batch in multipack batch sampler * improve handling of train len * init collator_cls_and_kwargs * explicit drop_last=False when checking for multipack completeness * use separate v2 loader for kd * fix kd tests to use subprocess so it picks up kd training args * default value for kd_beta arg * use updated dataset for ci * longer timeout for e2e	2025-06-17 12:09:13 -04:00
NanoCode012	21388cf615	Fix: lora kernel pre-patch applied despite post-patch not applied (#2772 ) * fix: do not pre-patch self attention if lora dropout non-zero * fix: add test to check patch not applied * fix: test * fix: test config check * fix where we check so that tests don't break * fix: test --------- Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-06-14 11:54:06 -07:00
Wing Lian	b2274d430b	support for QAT w RL (DPO) (#2776 )	2025-06-13 10:00:35 -04:00
Wing Lian	ace9287c96	update loss value for flakey e2e test (#2786 ) [skip ci] * update loss value for flakey e2e test * use pytest skip * parametrize combinations	2025-06-12 18:06:14 -04:00
Dan Saunders	00cda8cc70	Data loader refactor (#2707 ) * data loading refactor (wip) * updates * progress * pytest * pytest fix * lint * zero_first -> filelock, more simplifications * small simplification * import change * nit * lint * simplify dedup * couldnt resist * review comments WIP * continued wip * minor changes * fix; remove contrived test * further refactor * set default seed in pydantic config * lint * continued simplication * lint * renaming and nits * filelock tests * fix * fix * lint * remove nullable arg * remove unnecessary code * moving dataset save fn to shared module * remove debug print * matching var naming * fn name change * coderabbit comments * naming nit * fix test	2025-06-10 19:53:07 -04:00
Wing Lian	7909bfb076	add manual seed for flaky test_geglu_backward test (#2763 ) [skip ci]	2025-06-05 09:23:17 -07:00
Wing Lian	c67910fa6f	bump hf deps (#2735 ) [skip ci] * bump hf deps * upgrade liger-kernel too * install cce from fork for transformers fix * fix reference to vocab size in gemma3 patch * use padding_idx instead of pad_token_id * remove fixed gemma3 patch * use updated cce fork * fix local mllama cce patches w docstring * add test for multipack with trainer setup and fix trainer for trainer refactor upstream * bump modal version * guard for iterable datasetS * mllama model arch layout changed in latest transformers * fix batch sampler with drop_last * fix: address upstream vlm changes for lora * fix: update references to old lora target path * fix: remove mllama fa2 patch due to upstream fix * fix: lora kernel patch path for multimodal models * fix: removed mllama from quarto * run test for came optim on 2.6.0+ * fix fsdp2 patch and remove deprecated patch * make sure to set sequence_parallel_degree for grpo * Add SP test for GRPO * add sp to grpo config for trainer * use reward_funcs as kwarg to grpo trainer * fix the comprehension for reward funcs * reward funcs already passed in as args * init sp_group right before training * fix check for adding models to SP context * make sure to pass args to super * upgrade deepspeed * use updated trl and add reasoning flags for vllm * patch the worker --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2025-06-05 07:20:33 -07:00
Dan Saunders	1d91d905c9	remove deprecated wandb env var (#2751 ) * remove deprecated wandb env var * remove os.environ wandb setting; unused loggers * remove os.environ wandb setting; unused loggers	2025-06-03 14:04:15 -07:00
NanoCode012	6778856804	Fix: RL base feature parity (#2133 ) * feat: add num_proc and load from cache for rl mapping * fix: refactor sft and rl trainer to set same base args * feat: add report_to to set run name * fix: consolidate handling of fp16, bf16, tf32 kwarg * chore: consolidate eval_strat, loraplus, lr sched, max_length * fix: deprecate old types * fix: adding missing Any * fix: max_steps incorrectly set * fix: remove unnecessary datacollator kwarg insert and pop * fix: update default max_steps * fix: add missing weight_decay handling * fix: ignore max_length for grpo * feat: update CI on trainer_builder * fix: comments * improve handling of warmup/logging steps * use transformers default for logging steps, not None * fix: remove redundant override * fix: lint * feat: allow custom optim for rl methods * fix: duplicate optim setting * fix(test): set sequence_parallel_degree default in base cfg * feat: add handling for seed and SP/ring-attn config * chore: add back return typing from rebase * fix(test): use RLType directly to skip needing to validate * feat: split training builder into sub modules * fix: remove deprecated clause * chore: add missing config to doc * fix: update quarto autodoc * fix: import path for trainer builder and submodules * fix: remove redundant configs from rebase mistake * chore: simplify dynamo check * fix: optimizer_cls_and_kwargs to be passed into trainer_kwargs * fix: add missing rex from rebase * fix: move pop optimizer_cls_and_kwargs * fix: pop optimizer cls in rl too * fix: leftover bug from rebase * fix: update handling of trainer_cls in RL * fix: address pr feedback * feat: call hook_pre_create_trainer for rl * chore: lint * fix: return notimplemented for ppo * feat: moved torch compile to base and refactor collator setting * chore: remove unused importlib.util import * fix: optimizer cls not being popped * feat: move epoch setting to base * fix: catch unhandled custom optimizer * fix: remove duplicate lora plus setting * chore: refactor if condition * chore: refactor set_base_training_args into smaller modules * fix: address TrainerBuilderBase class variables to instance var * fix: add handling for beta3 and episilon2 * fix: change to pass dict via arg instead of updating dict * chore: simplify if condition * fix: force access to lr & weight decay in case not provided to early error * fix: remove log sweep * chore: refactor if condition * fix: address renamed cfg * fix: improve handling of cosine hyp * fix: remove unused params * chore: refactor * chore: clarify doc safetensors * fix: update import path to be unified following comments * fix: duplicate kwargs passed * feat: return separate trainer_kwargs * chore: refactor * chore: refactor based on comments * chore: refactor based on comments * fix: move gpustats callback to base * chore: create trainer_cls_args first based on comments * fix: ipo label smoothing passed incorrectly * feat: add optimizer parity for RL methods with test * feat: add parity for optimizer in RM/PRM and add test * fix: remove redundant function override for orpo/cpo batch metrics * fix: improve handling of dpo_label_smoothing and merge issue * fix: test fixture returning wrong field * fix: address avoid direct modify fixture * chore: minor refactor * Revert "chore: refactor" This reverts commit `99c8859eb0`. * feat: rename trainer_builder to builders --------- Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-05-30 11:21:47 +07:00
Dan Saunders	2962a398b7	Lora kernels fix (#2732 ) * fix lora kernel patching and improve test * simplification	2025-05-28 10:03:43 -04:00
salman	65c5481120	Rank 0-only logging (#2608 ) Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-05-28 14:57:30 +01:00
salman	5fca214108	QAT (#2590 ) QAT and quantization w/torchao	2025-05-28 12:35:47 +01:00
Dan Saunders	b5f1e53a0f	models.py -> loaders/ module refactor (#2680 ) * models.py -> loaders/ module refactor * refactor ModelLoader class * plugin manager changes * circular import fix * pytest * pytest * minor improvements * fix * minor changes * fix test * remove dead code * coderabbit comments * lint * fix * coderabbit suggestion I liked * more coderabbit * review comments, yak shaving * lint * updating in light of SP ctx manager changes * review comment * review comment 2	2025-05-23 15:51:11 -04:00
Dan Saunders	5f8f817200	SP context manager update (#2699 ) * utilize accelerate prepare_data_loader with patching * lint * cleanup, fix * update to support DPO quirk * coderabbit commits, cleanup, remove dead code * fix * move ring attn patching to sp ctx manager * lint * lint * test fix * test fix	2025-05-22 11:18:32 -04:00
Dan Saunders	6aa41740df	SP dataloader patching + removing custom sampler / dataloader logic (#2686 ) * utilize accelerate prepare_data_loader with patching * lint * cleanup, fix * update to support DPO quirk * small change * coderabbit commits, cleanup, remove dead code * quarto fix * patch fix * review comments * moving monkeypatch up one level * fix	2025-05-21 11:20:20 -04:00
Wing Lian	a27b909c5c	GRPO fixes (peft) (#2676 ) * don't set peft_config on grpo to prevent double peft wrap * remove overrides needed to support bug * fix grpo tests * require more CPU for multigpu to help with torch compile for vllm	2025-05-16 15:47:03 -04:00
Wing Lian	c0a0c7534c	Activation checkpointing with offloading to disk with prefetch (#2663 ) * offload activations to disk instead of CPU RAM * add prefetch * Disco :dance: * include offload_disk in e2e test for AC * document and make sure to cleanup * fix annotation to match docs * fix docs build * address PR feedback	2025-05-13 16:39:39 -04:00
Dan Saunders	80304c26a7	SP GRPO support + batch SP fixes (#2643 ) * ctx manager for SP * updates * update * further simplifying * simplifying * simplifying * reorg * batch api HF adapter for ring-flash-attn; cleanup and improvements * update * adding all batch ring-flash-attn methods via single adapter * fix * fixes for batch API funcs, simplify * fix * grpo sp support * progress * stronger subclassing of TRL GRPO trainer; custom distributed sampler * subclassing constructor * progress * finalizing SP + GRPO trainer * minimize diffs to GRPO trainer * remove (most of) the custom GRPO trainer logic * debug * debug * update * update * update * progress * cleanup * cleanup * minor changes * update * update * update * small changes * updates * cleanup; torch.compile ring_flash_attn functions to prevent numerical instability; lint * spacing * cleanup; log in pydantic model config only on main process * remove comment * fix sp sampler, update to latest upstream code, doc * add docs * update quartodoc autodoc contents * fix, simplifications * fixes + simplifications * review comments * lint * removing main process only logs in favor of #2608 * fixes, additional smoke test * updatse * more tests * update * fix grad accum bug (sort of) * lint, tests * todo	2025-05-12 17:52:40 -04:00
Wing Lian	f34eef546a	update doc and use P2P=LOC for brittle grpo test (#2649 ) * update doc and skip brittle grpo test * fix the path to run the multigpu tests * increase timeout, use LOC instead of NVL * typo * use hf cache from s3 backed cloudfront * mark grpo as flaky test dues to vllm start	2025-05-12 14:17:25 -04:00
Wing Lian	c7b6790614	Various fixes for CI, save_only_model for RL, prevent packing multiprocessing deadlocks (#2661 ) * lean mistral ft tests, remove e2e torch 2.4.1 test * make sure to pass save_only_model for RL * more tests to make ci leaner, add cleanup to modal ci * fix module for import in e2e tests * use mp spawn to prevent deadlocks with packing * make sure cleanup shell script is executable when cloned out	2025-05-12 10:51:18 -04:00
Dan Saunders	47e0e71bc8	don't sort multipack sampler (#2657 ) * don't sort multipack sampler * increased packing efficiency increases loss --------- Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-05-09 20:28:58 -04:00
Wing Lian	0f3587174d	swap tinymodels that have safetensors for some ci tests (#2641 )	2025-05-07 15:06:07 -04:00
xzuyn	25e6c5f9bd	Add CAME Optimizer (#2385 )	2025-05-07 10:31:46 -04:00
Wing Lian	0d71b0aa5f	Configurable embeddings upcast (#2621 ) * fsdp embeddings should be float32 per comment * patch peft to not upcast everything * add tabs back to code check * fix import * add configurable option and fix check * add check for dtypes * move embeddings test to patch dir * fix test * fix comment and logic	2025-05-06 23:40:44 -04:00
Wing Lian	cd84325253	allow plugins to return their own dataset (#2617 ) [skip ci] * allow plugins to return their own dataset * add post_trainer_create and wire up * add hook check * address PR feedback: * remove annotation causing circular import	2025-05-06 20:05:51 -04:00
Rahul Tuli	996fc124e5	Add: Sparse Finetuning Integration with llmcompressor (#2479 ) * Add: SFTPlugin with llmcompressor * Update: review comments! * Add:llmcompressor instalable * pre commit hooks * Use: warning over warn * Revert: TODO's * Update llmcompressor version to latest * Apply suggestions from @markurtz Co-authored-by: Mark Kurtz <mark.j.kurtz@gmail.com> * Address review comments from @markurtz * Add: llcompressor installable * Rename: sft.yaml to sparse-finetuning.yaml * Use: absolute import * Update model config * Move: LLMCompressorPlugin into it's own submodule * Add: `llm_compressor` integration documentation * Rebase and updates! * Tests, Style, Updates * Add: .qmd file * Address Review Comments: * deleted redundant docs/llm_compressor.qmd * incorporated feedback in integration README.md * added llmcompressor integration to docs/custom_integrations.qmd Signed-off-by: Rahul Tuli <rtuli@redhat.com> * Add: line about further optimizations using llmcompressor Signed-off-by: Rahul Tuli <rtuli@redhat.com> * Apply patch from @winglian Signed-off-by: Rahul Tuli <rtuli@redhat.com> * Fix: Test Signed-off-by: Rahul Tuli <rtuli@redhat.com> * additional fixes for docker and saving compressed * split llmcompressor from vllm checks * Reset session between tests Signed-off-by: Rahul Tuli <rtuli@redhat.com> * move decorator to test method instead of class * make sure to reset the session after each test * move import of llmcompressor to reset session inside test --------- Signed-off-by: Rahul Tuli <rtuli@redhat.com> Co-authored-by: Mark Kurtz <mark.j.kurtz@gmail.com> Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-05-01 12:25:16 -04:00
Aleksandr Dremov	41a1ec0c95	Plugins create_lr_scheduler support (#2584 ) * lr_scheduler support * fix * Update scheduler.py * Update scheduler.py * cfg handling * black * remove debug * remove adding the axolotl cfg to the scheduler mixin --------- Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-04-29 17:08:30 -04:00
Dan Saunders	ecac731922	auto-enable lora kernels where possible (#2589 ) * auto-enable lora kernels where possible * test * revert change to example yaml * naming * remove print * slight logic change	2025-04-29 16:18:49 -04:00
Wing Lian	07e4f2e25b	support for qwen3 with lora kernels (#2588 ) * support for qwen3 with lora kernels * fix patch * typo	2025-04-29 15:02:49 -04:00
Dan Saunders	c7d07de6b4	Fix eval + add smoke test (#2586 ) * fix evaluate CLI * add smoke test * fix naming * lint	2025-04-29 12:58:54 -04:00
divyanshuaggarwal	170cdb5be9	Add Post_model_load, post_lora_load, post_train, post_train_unload function calls (#2539 ) * Update train.py add post_model_load and post_lora_load model calss. * Update train.py add post_train and post_train_unload function calls * Update train.py * Update base.py * Update train.py * chore: lint * clarify plugin hooks * Update src/axolotl/integrations/base.py Co-authored-by: Dan Saunders <danjsaund@gmail.com> * Update src/axolotl/utils/models.py Co-authored-by: Dan Saunders <danjsaund@gmail.com> * Update src/axolotl/utils/models.py Co-authored-by: Dan Saunders <danjsaund@gmail.com> * Update src/axolotl/integrations/base.py Co-authored-by: Dan Saunders <danjsaund@gmail.com> * Update models.py * Update models.py * remove extra call to post_model_load * chore: lint * add test for hooks and gc trainer * disable duplicated code check for test * fix the path and add better handling --------- Co-authored-by: Wing Lian <wing@axolotl.ai> Co-authored-by: Dan Saunders <danjsaund@gmail.com>	2025-04-28 10:10:28 -04:00
Wing Lian	40f4ea23ab	replace references to random 68m model w 135m smollm2 (#2570 ) [skip ci] * replace references to random 68m model w 135m smollm2 * use AutoTokenizer for smollm2	2025-04-28 10:08:07 -04:00
Wing Lian	dc4da4a7e2	update trl to 0.17.0 (#2560 ) * update trl to 0.17.0 * grpo + vllm no longer supported with 2.5.1 due to vllm constraints * disable VLLM_USE_V1 for ci * imporve handle killing off of multiprocessing vllm service * debug why this doesn't run in CI * increase vllm wait time * increase timeout to 5min * upgrade to vllm 0.8.4 * dump out the vllm log for debugging * use debug logging * increase vllm start timeout * use NVL instead * disable torch compile cache * revert some commented checks now that grpo tests are fixed * increase vllm timeoout back to 5min	2025-04-27 19:19:53 -04:00
Wing Lian	f9c7c3bb72	don't use is_main_process during config validation (#2569 )	2025-04-26 14:14:52 -04:00
Wing Lian	caf5cb63ea	add e2e smoke test for using activation/gradient checkpointing with offload (#2565 ) * add e2e smoke test for using activation/gradient checkpointing with offload * disable duplicate code check for the test * fix relative import * seq len too small to test this dataset with packing * Fix checkpoint ptaching for tests	2025-04-25 21:11:17 -04:00
Dan Saunders	ae1c7ace63	Sequence parallel training context manager (#2553 ) * ctx manager for SP * updates * update * further simplifying * accommodate both training context managers * simplifying * simplifying * nit * reorg * tweak codecov yaml * add gather post hook, simplify, fixes * pytest * pytest fix	2025-04-25 10:33:54 -04:00
Wing Lian	1447beb132	make sure to validate the config before normalizing so defaults get set (#2554 ) * make sure to validate the config before normalizing so defaults get set * validation not needed for particular test * remove duplicate validations * set qlora correctly	2025-04-24 13:01:43 -04:00
Wing Lian	7651550850	make sure to download fixtures for kd test (#2541 ) * make sure to download fixtures for kd test * use same alpaca dataset	2025-04-21 10:31:50 -04:00
Wing Lian	b640db1dbc	don't run multigpu tests twice, run SP in separate test (#2542 ) * don't run multigpu tests twice, run SP in separate test * fix multiline	2025-04-21 10:24:13 -04:00
Wing Lian	69eda209a6	re-enable DS zero3 ci with updated transformers (#2533 )	2025-04-16 14:48:40 -07:00
Dan Saunders	b8c633aa97	batch api HF adapter for ring-flash-attn; cleanup and improvements (#2520 ) * batch api HF adapter for ring-flash-attn; cleanup and improvements * update * adding all batch ring-flash-attn methods via single adapter * removing pad_to_sequence_len=False for now * fix * updating docs to include batch SP * review comments * fixes for batch API funcs, simplify * fixes * fix * updates * add batch_zigzag smoke test	2025-04-16 13:50:48 -04:00

1 2 3 4

180 Commits