Built site for gh-pages
docs/cli.html
@@ -944,13 +944,15 @@ the CLI commands, their usage, and common examples.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb14"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Basic evaluation</span></span>
<span id="cb14-2"><a href="#cb14-2" aria-hidden="true" tabindex="-1"></a><span class="ex">axolotl</span> lm-eval config.yml</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Configuration options:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb15"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="co"># List of tasks to evaluate</span></span>
<span id="cb15-2"><a href="#cb15-2" aria-hidden="true" tabindex="-1"></a><span class="fu">lm_eval_tasks</span><span class="kw">:</span></span>
<span id="cb15-3"><a href="#cb15-3" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> arc_challenge</span></span>
<span id="cb15-4"><a href="#cb15-4" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> hellaswag</span></span>
<span id="cb15-5"><a href="#cb15-5" aria-hidden="true" tabindex="-1"></a><span class="fu">lm_eval_batch_size</span><span class="kw">:</span><span class="co"> # Batch size for evaluation</span></span>
<span id="cb15-6"><a href="#cb15-6" aria-hidden="true" tabindex="-1"></a><span class="fu">output_dir</span><span class="kw">:</span><span class="co"> # Directory to save evaluation results</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>See <a href="https://github.com/EleutherAI/lm-evaluation-harness">LM Eval Harness</a> for more details.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb15"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="fu">lm_eval_model</span><span class="kw">:</span><span class="co"> # model to evaluate (local or hf path)</span></span>
<span id="cb15-2"><a href="#cb15-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb15-3"><a href="#cb15-3" aria-hidden="true" tabindex="-1"></a><span class="co"># List of tasks to evaluate</span></span>
<span id="cb15-4"><a href="#cb15-4" aria-hidden="true" tabindex="-1"></a><span class="fu">lm_eval_tasks</span><span class="kw">:</span></span>
<span id="cb15-5"><a href="#cb15-5" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> arc_challenge</span></span>
<span id="cb15-6"><a href="#cb15-6" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> hellaswag</span></span>
<span id="cb15-7"><a href="#cb15-7" aria-hidden="true" tabindex="-1"></a><span class="fu">lm_eval_batch_size</span><span class="kw">:</span><span class="co"> # Batch size for evaluation</span></span>
<span id="cb15-8"><a href="#cb15-8" aria-hidden="true" tabindex="-1"></a><span class="fu">output_dir</span><span class="kw">:</span><span class="co"> # Directory to save evaluation results</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>See <a href="https://docs.axolotl.ai/docs/custom_integrations.html#language-model-evaluation-harness-lm-eval">LM Eval Harness integration docs</a> for full configuration details.</p>
</section>
<section id="delinearize-llama4" class="level3">
<h3 class="anchored" data-anchor-id="delinearize-llama4">delinearize-llama4</h3>
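Stripped of the syntax-highlighting markup, the YAML block added to docs/cli.html above reads as plain text (content verbatim from the added lines):

    lm_eval_model:  # model to evaluate (local or hf path)

    # List of tasks to evaluate
    lm_eval_tasks:
      - arc_challenge
      - hellaswag
    lm_eval_batch_size:  # Batch size for evaluation
    output_dir:  # Directory to save evaluation results

Per the bash block above, this config is consumed by "axolotl lm-eval config.yml".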
15 search.json
@@ -44,7 +44,7 @@
"href": "docs/cli.html#command-reference",
"title": "Command Line Interface (CLI)",
"section": "Command Reference",
"text": "Command Reference\n\nfetch\nDownloads example configurations and deepspeed configs to your local machine.\n# Get example YAML files\naxolotl fetch examples\n\n# Get deepspeed config files\naxolotl fetch deepspeed_configs\n\n# Specify custom destination\naxolotl fetch examples --dest path/to/folder\n\n\npreprocess\nPreprocesses and tokenizes your dataset before training. This is recommended for large datasets.\n# Basic preprocessing\naxolotl preprocess config.yml\n\n# Preprocessing with one GPU\nCUDA_VISIBLE_DEVICES=\"0\" axolotl preprocess config.yml\n\n# Debug mode to see processed examples\naxolotl preprocess config.yml --debug\n\n# Debug with limited examples\naxolotl preprocess config.yml --debug --debug-num-examples 5\nConfiguration options:\ndataset_prepared_path: Local folder for saving preprocessed data\npush_dataset_to_hub: HuggingFace repo to push preprocessed data (optional)\n\n\ntrain\nTrains or fine-tunes a model using the configuration specified in your YAML file.\n# Basic training\naxolotl train config.yml\n\n# Train and set/override specific options\naxolotl train config.yml \\\n --learning-rate 1e-4 \\\n --micro-batch-size 2 \\\n --num-epochs 3\n\n# Training without accelerate\naxolotl train config.yml --launcher python\n\n# Pass launcher-specific arguments using -- separator\naxolotl train config.yml --launcher torchrun -- --nproc_per_node=2 --nnodes=1\naxolotl train config.yml --launcher accelerate -- --config_file=accelerate_config.yml\n\n# Resume training from checkpoint\naxolotl train config.yml --resume-from-checkpoint path/to/checkpoint\nIt is possible to run sweeps over multiple hyperparameters by passing in a sweeps config.\n# Basic training with sweeps\naxolotl train config.yml --sweep path/to/sweep.yaml\nExample sweep config:\n_:\n # This section is for dependent variables we need to fix\n - load_in_8bit: false\n load_in_4bit: false\n adapter: lora\n - load_in_8bit: true\n load_in_4bit: false\n adapter: lora\n\n# These are independent variables\nlearning_rate: [0.0003, 0.0006]\nlora_r:\n - 16\n - 32\nlora_alpha:\n - 16\n - 32\n - 64\n\n\ninference\nRuns inference using your trained model in either CLI or Gradio interface mode.\n# CLI inference with LoRA\naxolotl inference config.yml --lora-model-dir=\"./outputs/lora-out\"\n\n# CLI inference with full model\naxolotl inference config.yml --base-model=\"./completed-model\"\n\n# Gradio web interface\naxolotl inference config.yml --gradio \\\n --lora-model-dir=\"./outputs/lora-out\"\n\n# Inference with input from file\ncat prompt.txt | axolotl inference config.yml \\\n --base-model=\"./completed-model\"\n\n\nmerge-lora\nMerges trained LoRA adapters into the base model.\n# Basic merge\naxolotl merge-lora config.yml\n\n# Specify LoRA directory (usually used with checkpoints)\naxolotl merge-lora config.yml --lora-model-dir=\"./lora-output/checkpoint-100\"\n\n# Merge using CPU (if out of GPU memory)\nCUDA_VISIBLE_DEVICES=\"\" axolotl merge-lora config.yml\nConfiguration options:\ngpu_memory_limit: Limit GPU memory usage\nlora_on_cpu: Load LoRA weights on CPU\n\n\nmerge-sharded-fsdp-weights\nMerges sharded FSDP model checkpoints into a single combined checkpoint.\n# Basic merge\naxolotl merge-sharded-fsdp-weights config.yml\n\n\nevaluate\nEvaluates a model’s performance (loss etc) on the train and eval datasets.\n# Basic evaluation\naxolotl evaluate config.yml\n\n# Evaluation with launcher arguments\naxolotl evaluate config.yml --launcher torchrun -- --nproc_per_node=2\n\n\nlm-eval\nRuns LM Evaluation Harness on 
your model.\n# Basic evaluation\naxolotl lm-eval config.yml\nConfiguration options:\n# List of tasks to evaluate\nlm_eval_tasks:\n - arc_challenge\n - hellaswag\nlm_eval_batch_size: # Batch size for evaluation\noutput_dir: # Directory to save evaluation results\nSee LM Eval Harness for more details.\n\n\ndelinearize-llama4\nDelinearizes a Llama 4 linearized model into a regular HuggingFace Llama 4 model. This only works with the non-quantized linearized model.\naxolotl delinearize-llama4 --model path/to/model_dir --output path/to/output_dir\nThis would be necessary to use with other frameworks. If you have an adapter, merge it with the non-quantized linearized model before delinearizing.\n\n\nquantize\nQuantizes a model using the quantization configuration specified in your YAML file.\naxolotl quantize config.yml\nSee Quantization for more details.",
"text": "Command Reference\n\nfetch\nDownloads example configurations and deepspeed configs to your local machine.\n# Get example YAML files\naxolotl fetch examples\n\n# Get deepspeed config files\naxolotl fetch deepspeed_configs\n\n# Specify custom destination\naxolotl fetch examples --dest path/to/folder\n\n\npreprocess\nPreprocesses and tokenizes your dataset before training. This is recommended for large datasets.\n# Basic preprocessing\naxolotl preprocess config.yml\n\n# Preprocessing with one GPU\nCUDA_VISIBLE_DEVICES=\"0\" axolotl preprocess config.yml\n\n# Debug mode to see processed examples\naxolotl preprocess config.yml --debug\n\n# Debug with limited examples\naxolotl preprocess config.yml --debug --debug-num-examples 5\nConfiguration options:\ndataset_prepared_path: Local folder for saving preprocessed data\npush_dataset_to_hub: HuggingFace repo to push preprocessed data (optional)\n\n\ntrain\nTrains or fine-tunes a model using the configuration specified in your YAML file.\n# Basic training\naxolotl train config.yml\n\n# Train and set/override specific options\naxolotl train config.yml \\\n --learning-rate 1e-4 \\\n --micro-batch-size 2 \\\n --num-epochs 3\n\n# Training without accelerate\naxolotl train config.yml --launcher python\n\n# Pass launcher-specific arguments using -- separator\naxolotl train config.yml --launcher torchrun -- --nproc_per_node=2 --nnodes=1\naxolotl train config.yml --launcher accelerate -- --config_file=accelerate_config.yml\n\n# Resume training from checkpoint\naxolotl train config.yml --resume-from-checkpoint path/to/checkpoint\nIt is possible to run sweeps over multiple hyperparameters by passing in a sweeps config.\n# Basic training with sweeps\naxolotl train config.yml --sweep path/to/sweep.yaml\nExample sweep config:\n_:\n # This section is for dependent variables we need to fix\n - load_in_8bit: false\n load_in_4bit: false\n adapter: lora\n - load_in_8bit: true\n load_in_4bit: false\n adapter: lora\n\n# These are independent variables\nlearning_rate: [0.0003, 0.0006]\nlora_r:\n - 16\n - 32\nlora_alpha:\n - 16\n - 32\n - 64\n\n\ninference\nRuns inference using your trained model in either CLI or Gradio interface mode.\n# CLI inference with LoRA\naxolotl inference config.yml --lora-model-dir=\"./outputs/lora-out\"\n\n# CLI inference with full model\naxolotl inference config.yml --base-model=\"./completed-model\"\n\n# Gradio web interface\naxolotl inference config.yml --gradio \\\n --lora-model-dir=\"./outputs/lora-out\"\n\n# Inference with input from file\ncat prompt.txt | axolotl inference config.yml \\\n --base-model=\"./completed-model\"\n\n\nmerge-lora\nMerges trained LoRA adapters into the base model.\n# Basic merge\naxolotl merge-lora config.yml\n\n# Specify LoRA directory (usually used with checkpoints)\naxolotl merge-lora config.yml --lora-model-dir=\"./lora-output/checkpoint-100\"\n\n# Merge using CPU (if out of GPU memory)\nCUDA_VISIBLE_DEVICES=\"\" axolotl merge-lora config.yml\nConfiguration options:\ngpu_memory_limit: Limit GPU memory usage\nlora_on_cpu: Load LoRA weights on CPU\n\n\nmerge-sharded-fsdp-weights\nMerges sharded FSDP model checkpoints into a single combined checkpoint.\n# Basic merge\naxolotl merge-sharded-fsdp-weights config.yml\n\n\nevaluate\nEvaluates a model’s performance (loss etc) on the train and eval datasets.\n# Basic evaluation\naxolotl evaluate config.yml\n\n# Evaluation with launcher arguments\naxolotl evaluate config.yml --launcher torchrun -- --nproc_per_node=2\n\n\nlm-eval\nRuns LM Evaluation Harness on 
your model.\n# Basic evaluation\naxolotl lm-eval config.yml\nConfiguration options:\nlm_eval_model: # model to evaluate (local or hf path)\n\n# List of tasks to evaluate\nlm_eval_tasks:\n - arc_challenge\n - hellaswag\nlm_eval_batch_size: # Batch size for evaluation\noutput_dir: # Directory to save evaluation results\nSee LM Eval Harness integration docs for full configuration details.\n\n\ndelinearize-llama4\nDelinearizes a Llama 4 linearized model into a regular HuggingFace Llama 4 model. This only works with the non-quantized linearized model.\naxolotl delinearize-llama4 --model path/to/model_dir --output path/to/output_dir\nThis would be necessary to use with other frameworks. If you have an adapter, merge it with the non-quantized linearized model before delinearizing.\n\n\nquantize\nQuantizes a model using the quantization configuration specified in your YAML file.\naxolotl quantize config.yml\nSee Quantization for more details.",
"crumbs": [
"Getting Started",
"Command Line Interface (CLI)"
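For readability, the example sweep config embedded in the JSON string above, reflowed as YAML. Keys and values are verbatim from the hunk; the nesting is an assumption, since the flattened JSON string does not preserve indentation:

    _:
      # This section is for dependent variables we need to fix
      # (nesting under each list item is assumed; the flattened string loses it)
      - load_in_8bit: false
        load_in_4bit: false
        adapter: lora
      - load_in_8bit: true
        load_in_4bit: false
        adapter: lora

    # These are independent variables
    learning_rate: [0.0003, 0.0006]
    lora_r:
      - 16
      - 32
    lora_alpha:
      - 16
      - 32
      - 64

As the same entry shows, this file is passed to training via "axolotl train config.yml --sweep path/to/sweep.yaml".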
@@ -3231,6 +3231,17 @@
"Custom Integrations"
]
},
{
"objectID": "docs/custom_integrations.html#kernels-integration",
"href": "docs/custom_integrations.html#kernels-integration",
"title": "Custom Integrations",
"section": "Kernels Integration",
"text": "Kernels Integration\nMoE (Mixture of Experts) kernels speed up training for MoE layers and reduce VRAM costs. In transformers v5, batched_mm and grouped_mm were integrated as built-in options via the experts_implementation config kwarg:\nclass ExpertsInterface(GeneralInterface):\n _global_mapping = {\n \"batched_mm\": batched_mm_experts_forward,\n \"grouped_mm\": grouped_mm_experts_forward,\n }\nIn our custom integration, we add support for ScatterMoE, which is even more efficient and faster than grouped_mm.\n\nUsage\nAdd the following to your axolotl YAML config:\nplugins:\n - axolotl.integrations.kernels.KernelsPlugin\n\nuse_kernels: true\nuse_scattermoe: true\nImportant: Setting experts_implementation is incompatible with use_scattermoe.\n\n\nHow It Works\nThe KernelsPlugin runs before model loading and:\n\nRegisters the ScatterMoE kernel from the axolotl-ai-co/scattermoe Hub repo.\nPatches the model’s SparseMoeBlock forward method with the optimized ScatterMoE implementation.\n\nThis works for any MoE model in transformers that uses a SparseMoeBlock class (Mixtral, Qwen2-MoE, OLMoE, etc.).\n\n\nLimitations\nScatterMoE uses a softmax -> topk routing, so results may be different for some model arch as baseline (GPT-OSS, GLM_MOE_DSA).\n\n\nNote on MegaBlocks\nWe tested MegaBlocks but were unable to ensure numerical accuracy, so we did not integrate it. It was also incompatible with many newer model architectures in transformers.\nPlease see reference here",
"crumbs": [
"Advanced Features",
"Custom Integrations"
]
},
{
"objectID": "docs/custom_integrations.html#knowledge-distillation-kd",
"href": "docs/custom_integrations.html#knowledge-distillation-kd",
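The usage snippet from the new kernels-integration entry, pulled out of the JSON string above for readability (keys and values verbatim from the hunk):

    # Enable the ScatterMoE kernels integration (from the entry's Usage section)
    plugins:
      - axolotl.integrations.kernels.KernelsPlugin

    use_kernels: true
    use_scattermoe: true

As the entry notes, setting experts_implementation is incompatible with use_scattermoe.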
@@ -3258,7 +3269,7 @@
"href": "docs/custom_integrations.html#language-model-evaluation-harness-lm-eval",
"title": "Custom Integrations",
"section": "Language Model Evaluation Harness (LM Eval)",
"text": "Language Model Evaluation Harness (LM Eval)\nRun evaluation on model using the popular lm-evaluation-harness library.\nSee https://github.com/EleutherAI/lm-evaluation-harness\n\nUsage\nplugins:\n - axolotl.integrations.lm_eval.LMEvalPlugin\n\nlm_eval_tasks:\n - gsm8k\n - hellaswag\n - arc_easy\n\nlm_eval_batch_size: # Batch size for evaluation\noutput_dir: # Directory to save evaluation results\n\n\nCitation\n@misc{eval-harness,\n author = {Gao, Leo and Tow, Jonathan and Abbasi, Baber and Biderman, Stella and Black, Sid and DiPofi, Anthony and Foster, Charles and Golding, Laurence and Hsu, Jeffrey and Le Noac'h, Alain and Li, Haonan and McDonell, Kyle and Muennighoff, Niklas and Ociepa, Chris and Phang, Jason and Reynolds, Laria and Schoelkopf, Hailey and Skowron, Aviya and Sutawika, Lintang and Tang, Eric and Thite, Anish and Wang, Ben and Wang, Kevin and Zou, Andy},\n title = {A framework for few-shot language model evaluation},\n month = 07,\n year = 2024,\n publisher = {Zenodo},\n version = {v0.4.3},\n doi = {10.5281/zenodo.12608602},\n url = {https://zenodo.org/records/12608602}\n}\nPlease see reference here",
"text": "Language Model Evaluation Harness (LM Eval)\nRun evaluation on model using the popular lm-evaluation-harness library.\nSee https://github.com/EleutherAI/lm-evaluation-harness\n\nUsage\nThere are two ways to use the LM Eval integration:\n\n\n1. Post-Training Evaluation\nWhen training with the plugin enabled, evaluation runs automatically after training completes:\nplugins:\n - axolotl.integrations.lm_eval.LMEvalPlugin\n\nlm_eval_tasks:\n - gsm8k\n - hellaswag\n - arc_easy\n\nlm_eval_batch_size: # Batch size for evaluation\n\noutput_dir:\nRun training as usual:\naxolotl train config.yml\n\n\n2. Standalone CLI Evaluation\nEvaluate any model directly without training:\nlm_eval_model: meta-llama/Llama-2-7b-hf\n\nplugins:\n - axolotl.integrations.lm_eval.LMEvalPlugin\n\nlm_eval_tasks:\n - gsm8k\n - hellaswag\n - arc_easy\n\nlm_eval_batch_size: 8\noutput_dir: ./outputs\nRun evaluation:\naxolotl lm-eval config.yml\n\n\nModel Selection Priority\nThe model to evaluate is selected in the following priority order:\n\nlm_eval_model - Explicit model path or HuggingFace repo (highest priority)\nhub_model_id - Trained model pushed to HuggingFace Hub\noutput_dir - Local checkpoint directory containing trained model weights\n\n\n\nCitation\n@misc{eval-harness,\n author = {Gao, Leo and Tow, Jonathan and Abbasi, Baber and Biderman, Stella and Black, Sid and DiPofi, Anthony and Foster, Charles and Golding, Laurence and Hsu, Jeffrey and Le Noac'h, Alain and Li, Haonan and McDonell, Kyle and Muennighoff, Niklas and Ociepa, Chris and Phang, Jason and Reynolds, Laria and Schoelkopf, Hailey and Skowron, Aviya and Sutawika, Lintang and Tang, Eric and Thite, Anish and Wang, Ben and Wang, Kevin and Zou, Andy},\n title = {A framework for few-shot language model evaluation},\n month = 07,\n year = 2024,\n publisher = {Zenodo},\n version = {v0.4.3},\n doi = {10.5281/zenodo.12608602},\n url = {https://zenodo.org/records/12608602}\n}\nPlease see reference here",
"crumbs": [
"Advanced Features",
"Custom Integrations"
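The standalone CLI evaluation config added in this hunk, reflowed from the JSON string (values verbatim from the commit, including the meta-llama/Llama-2-7b-hf example model):

    # Evaluate any model directly without training
    lm_eval_model: meta-llama/Llama-2-7b-hf

    plugins:
      - axolotl.integrations.lm_eval.LMEvalPlugin

    lm_eval_tasks:
      - gsm8k
      - hellaswag
      - arc_easy

    lm_eval_batch_size: 8
    output_dir: ./outputs

Run evaluation with: axolotl lm-eval config.yml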
472 sitemap.xml
File diff suppressed because it is too large