axolotl/tests at a6080df73c8ab991ec9ee2e000a2df1be14bb493 - axolotl - Gitea

tocmo0nlord/axolotl

VED a6080df73c compute loss only if training and update token metric naming (#3293) [skip ci]

* compute loss only if training

* save total_tokens for checkpoint

* check if string

* refactor total_tokens/ num_tokens

* refactor 2

* replace trainable_step/trian_per_sec_per_gpu

* lint + log trainable/tokens

* consolidate it in the callback.

* test for total_tokens after resume

* check if tokenstate exists after ckpt

---------

Co-authored-by: Ved <ved.work2024@gmail.com>

2025-12-25 18:38:17 +07:00

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

Distributed Muon Optimizer (#3264)

2025-12-19 10:43:47 -05:00

compute loss only if training and update token metric naming (#3293) [skip ci]

2025-12-25 18:38:17 +07:00

Respect sequence_len in config for type: llama2_chat (#926)

2023-12-12 09:39:22 -08:00

fix: Fix evaluation loss in KD trainer (#3271)

2025-12-17 13:40:36 -05:00

Cp fix (#3182)

2025-09-25 12:03:50 -04:00

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

prompt_strategies

_get_tools in ChatTemplateStrategy: function "parameters" can be dict or string (#3238)

2025-11-11 09:04:28 +07:00

feat: Add opt-out Telemetry (#3237)

2025-11-18 11:35:25 +07:00

Distributed Muon Optimizer (#3264)

2025-12-19 10:43:47 -05:00

__init__.py

fix: minor patches for multimodal (#2441)

2025-03-31 13:40:12 +07:00

conftest.py

feat: Add opt-out Telemetry (#3237)

2025-11-18 11:35:25 +07:00

constants.py

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

hf_offline_utils.py

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

test_chunked_xentropy.py

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

test_data.py

feature: raise on long sequence drop (#3321)

2025-12-22 13:59:49 -05:00

test_datasets.py

feature: raise on long sequence drop (#3321)

2025-12-22 13:59:49 -05:00

test_dict.py

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

test_exact_deduplication.py

feat: add support dataset_num_processes (#3129) [skip ci]

2025-10-13 17:18:12 +07:00

test_expand_mask.py

adding pre-commit auto-update GH action and bumping plugin versions (#2428)

2025-03-21 11:02:43 -04:00

test_freeze.py

Train parameters exclusively in specific ranges (#1390)

2024-03-14 11:05:42 -04:00

test_loaders.py

fix: transformers deprecate load_in_Xbit in model_kwargs (#3205)

2025-10-16 16:07:27 +07:00

test_logging_config_file_capture.py

Debug log, logging improvements (#3159)

2025-09-17 13:27:03 -04:00

test_lora.py

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

test_normalize_config.py

FSDPConfig (#3170)

2025-10-10 14:44:25 +01:00

test_opentelemetry_callback.py

Feat/opentelemetry (#3215)

2025-10-22 19:16:55 -07:00

test_packed_batch_sampler.py

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

test_packed_dataset.py

feat: add support dataset_num_processes (#3129) [skip ci]

2025-10-13 17:18:12 +07:00

test_packed_pretraining.py

Streaming SFT support (#3101)

2025-09-02 12:08:44 -04:00

test_perplexity.py

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

test_prompt_tokenizers.py

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

test_prompters.py

fix: prompt phi (#1845) [skip ci]

2024-08-22 11:46:57 -04:00

test_schedulers.py

Add ruff, remove black, isort, flake8, pylint (#3092)

2025-08-23 23:37:33 -04:00

test_streaming.py

text diffusion training plugin (#3067)

2025-09-10 20:27:00 -04:00

test_tokenizers.py

models.py -> loaders/ module refactor (#2680)

2025-05-23 15:51:11 -04:00

test_train.py

refactor dupes from merge/rebase (#2919) [skip ci]

2025-07-14 10:05:26 -04:00

test_utils_tee.py

Debug log, logging improvements (#3159)

2025-09-17 13:27:03 -04:00

test_validation_dataset.py

Distributed Muon Optimizer (#3264)

2025-12-19 10:43:47 -05:00