axolotl

Author	SHA1	Message	Date
Wing Lian	fc4e37920b	transformers v5 upgrade (#3272) * Prepare for transformers v5 upgrade * fix hf cli * update for hf hub changes * fix tokenizer apply_chat_template args * remap include_tokens_per_second * fix tps * handle migration for warmup * use latest hf hub * Fix scan -> ls * fix import * fix for renaming of mistral common tokenizer -> backend * update for fixed tokenziation for llama * Skip phi35 tests for now * remove mistral patch fixed upstream in huggingface/transformers#41439 * use namespacing for patch * don't rely on sdist for e2e tests for now * run modal ci without waiting too * Fix dep for ci * fix imports * Fix fp8 check * fsdp2 fixes * fix version handling * update fsdp version tests for new v5 behavior * Fail multigpu tests after 3 failures * skip known v5 broken tests for now and cleanup * bump deps * unmark skipped test * re-enable test_fsdp_qlora_prequant_packed test * increase multigpu ci timeout * skip broken gemma3 test * reduce timout back to original 120min now that the hanging test is skipped * fix for un-necessary collator for pretraining with bsz=1 * fix: safe_serialization deprecated in transformers v5 rc01 (#3318) * torch_dtype deprecated * load model in float32 for consistency with tests * revert some test fixtures back * use hf cache ls instead of scan * don't strip fsdp_version more fdsp_Version fixes for v5 fix version in fsdp_config fix aliasing fix fsdp_version check check fsdp_version is 2 in both places * Transformers v5 rc2 (#3347) * bump dep * use latest fbgemm, grab model config as part of fixture, un-skip test * import AutoConfig * don't need more problematic autoconfig when specifying config.json manually * add fixtures for argilla ultrafeedback datasets * download phi4-reasoning * fix arg * update tests for phi fast tokenizer changes * use explicit model types for gemma3 --------- Co-authored-by: Wing Lian <wing@axolotl.ai> * fix: AutoModelForVision2Seq -> AutoModelForImageTextToText * chore: remove duplicate * fix: attempt fix gemma3 text mode * chore: lint * ga release of v5 * need property setter for name_or_path for mistral tokenizer * vllm not compatible with transformers v5 * setter for chat_template w mistral too --------- Co-authored-by: NanoCode012 <nano@axolotl.ai> Co-authored-by: salman <salman.mohammadi@outlook.com>	2026-01-27 17:08:24 -05:00
salman	143dea4753	`FSDPConfig` (#3170)	2025-10-10 14:44:25 +01:00
Wing Lian	af92151a7b	FSDP2 fix validation and add tests (#2910) * fix validation and add tests * remove debugging and add more tests * remove migrate_fsdp	2025-07-14 09:25:44 -04:00
salman	d6e4a611e5	FSDP1 -> FSDP2 (#2760 ) * FSDP2 args migration implementation This commit implements the migration to FSDP2 arguments including: - FSDP2 support with LoRA training - DPO integration with FSDP2 - Model loading fixes and refactoring - CPU offloading and PEFT handling - Test updates and CI improvements - Bug fixes for dtype errors and various edge cases	2025-07-12 15:18:01 +01:00
Wing Lian	40f4ea23ab	replace references to random 68m model w 135m smollm2 (#2570 ) [skip ci] * replace references to random 68m model w 135m smollm2 * use AutoTokenizer for smollm2	2025-04-28 10:08:07 -04:00
Dan Saunders	c907ac173e	adding pre-commit auto-update GH action and bumping plugin versions (#2428 ) * adding pre-commit auto-update GH action and bumping plugin versions * running updated pre-commit plugins * sorry to revert, but pylint complained * Update .pre-commit-config.yaml Co-authored-by: Wing Lian <wing.lian@gmail.com> --------- Co-authored-by: Dan Saunders <dan@axolotl.ai> Co-authored-by: Wing Lian <wing.lian@gmail.com>	2025-03-21 11:02:43 -04:00
Wing Lian	fd3b80716a	remove fastchat and sharegpt (#2021 ) * remove fastchat and sharegpt * remove imports * remove more fastchat imports * chore: remove unused functions * feat: remove sharegpt and deprecate from docs * chore: remove unused sharegpt checks * fix: remove sharegpt type from tests * feat: add sharegpt deprecation error * feat: update readme --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2024-11-08 13:45:49 -05:00
Wing Lian	0f985e12fe	more fixes 20240228 (#1342 ) [skip ci] * add missing evals_per_epoch setting * more pydantic fixes * more fixes * move test from normalization to validation * increase eval size for sample packing tests	2024-02-28 12:57:45 -05:00
Wing Lian	782b6a4216	set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122 ) [skip ci] * set fp16 to false if bf16, update bf16: auto in example YAMLs * unset fp16 so that it fallsback properly if bf16 isn't available * Update README.md [skip-ci] Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * test that bf16 disables fp16 --------- Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>	2024-01-22 18:44:01 -05:00
Simon Hällqvist	086561326f	Enable or disable bf16 support based on availability (#1116)	2024-01-14 12:06:56 -05:00
Wing Lian	0ce1a6594e	update sharegpt conversations when chatml chat template is set (#1075 ) [skip ci] * update sharegpt conversations when chatml chat template is set * add info log when updating sharegpt/chatml conversation	2024-01-10 00:49:07 -05:00
Wing Lian	2d8def68dc	simplify by removing duplicate base_model_config (#772)	2023-10-23 01:42:38 -04:00
Wing Lian	ca84cca2c0	convert exponential notation lr to floats (#771)	2023-10-22 15:37:03 -04:00

13 Commits