axolotl/tests at 7fd3d8abc436843d0a22cb5025fdfa791ff8eaa0

tocmo0nlord/axolotl

Wing Lian ecbe8b2b61 [GPT-OSS] improve FSDP shard merging and documentation for GPT-OSS (#3073)

* improve fsdp shard merging

* improve logging

* update information on merging and inferencing GPT-OSS

* cleanup readme

* automate cleanup of FSDP prefix

* import GRPO only if necessary

* only modify config.json on rank0

* merge final checkpoint at end of training

* prevent circular import

* Fix saving for sharded state dict

* devx, move merged to output dir

* move import back to top

* Fix stuck merge

* fix conditionals from pr feedback and add test

2025-08-15 21:25:01 -04:00

use exec instead of subprocess to make ctrl+c nicer for cli (#3044)

2025-08-10 20:22:20 -04:00

update training args check for new defaults (#3051) [skip ci]

2025-08-10 11:26:22 -04:00

Various fixes for VLMs (#3063)

2025-08-15 10:52:57 -04:00

Respect sequence_len in config for type: llama2_chat (#926)

2023-12-12 09:39:22 -08:00

remove deprecated wandb env var (#2751)

2025-06-03 14:04:15 -07:00

use nanmean for loss aggregation (CP fix) (#3033)

2025-08-08 08:15:17 -04:00

release v0.11.0 (#2875)

2025-07-09 09:22:35 -04:00

prompt_strategies

Feat: Add voxtral, magistral small 1.1, and misc gemma3n fixes (#2979)

2025-07-30 15:57:05 +07:00

[GPT-OSS] improve FSDP shard merging and documentation for GPT-OSS (#3073)

2025-08-15 21:25:01 -04:00

__init__.py

fix: minor patches for multimodal (#2441)

2025-03-31 13:40:12 +07:00

conftest.py

refactor dupes from merge/rebase (#2919) [skip ci]

2025-07-14 10:05:26 -04:00

constants.py

Add Exact Deduplication Feature to Preprocessing Pipeline (#2072)

2024-12-02 08:47:10 -05:00

hf_offline_utils.py

fix tokenizer overrides w gemma3 (#2488)

2025-04-05 01:25:44 -04:00

test_chunked_xentropy.py

chunked cross entropy loss (#2625)

2025-06-23 23:08:46 -04:00

test_data.py

fix: minor patches for multimodal (#2441)

2025-03-31 13:40:12 +07:00

test_datasets.py

limit num_proc when saving datasets to disk (#2948) [skip ci]

2025-07-21 11:39:38 -04:00

test_dict.py

adding pre-commit auto-update GH action and bumping plugin versions (#2428)

2025-03-21 11:02:43 -04:00

test_exact_deduplication.py

limit num_proc when saving datasets to disk (#2948) [skip ci]

2025-07-21 11:39:38 -04:00

test_expand_mask.py

adding pre-commit auto-update GH action and bumping plugin versions (#2428)

2025-03-21 11:02:43 -04:00

test_freeze.py

Train parameters exclusively in specific ranges (#1390)

2024-03-14 11:05:42 -04:00

test_loaders.py

Add support for Accelerate CP, ND examples, and fix for parallel config w fsdp (#3019)

2025-08-07 21:22:15 -04:00

test_lora.py

models.py -> loaders/ module refactor (#2680)

2025-05-23 15:51:11 -04:00

test_normalize_config.py

FSDP2 fix validation and add tests (#2910)

2025-07-14 09:25:44 -04:00

test_packed_batch_sampler.py

allow for different sequence_len for evaluations (#2836) [skip ci]

2025-06-27 11:02:51 -04:00

test_packed_dataset.py

limit num_proc when saving datasets to disk (#2948) [skip ci]

2025-07-21 11:39:38 -04:00

test_packed_pretraining.py

fix streaming packing test (#2454)

2025-03-29 08:30:06 -04:00

test_perplexity.py

adding pre-commit auto-update GH action and bumping plugin versions (#2428)

2025-03-21 11:02:43 -04:00

test_prompt_tokenizers.py

remove deprecated wandb env var (#2751)

2025-06-03 14:04:15 -07:00

test_prompters.py

fix: prompt phi (#1845) [skip ci]

2024-08-22 11:46:57 -04:00

test_schedulers.py

adding pre-commit auto-update GH action and bumping plugin versions (#2428)

2025-03-21 11:02:43 -04:00

test_tokenizers.py

models.py -> loaders/ module refactor (#2680)

2025-05-23 15:51:11 -04:00

test_train.py

refactor dupes from merge/rebase (#2919) [skip ci]

2025-07-14 10:05:26 -04:00

test_validation_dataset.py

release v0.11.0 (#2875)

2025-07-09 09:22:35 -04:00