From 5724ca4e579f25806a675a7909edd952cf2e1530 Mon Sep 17 00:00:00 2001 From: Quarto GHA Workflow Runner Date: Thu, 2 Apr 2026 12:08:47 +0000 Subject: [PATCH] Built site for gh-pages --- .nojekyll | 2 +- FAQS.html | 30 + docs/agents/grpo.html | 1322 ++++++++++ docs/agents/preference_tuning.html | 1449 +++++++++++ docs/agents/pretraining.html | 1319 ++++++++++ docs/agents/reward_modelling.html | 1256 +++++++++ docs/agents/sft.html | 1443 +++++++++++ docs/amd_hpc.html | 30 + docs/api/cli.args.html | 30 + docs/api/cli.art.html | 30 + docs/api/cli.checks.html | 30 + docs/api/cli.cloud.base.html | 30 + docs/api/cli.cloud.modal_.html | 30 + docs/api/cli.config.html | 30 + docs/api/cli.delinearize_llama4.html | 30 + docs/api/cli.evaluate.html | 30 + docs/api/cli.inference.html | 30 + docs/api/cli.main.html | 30 + docs/api/cli.merge_lora.html | 30 + docs/api/cli.merge_sharded_fsdp_weights.html | 30 + docs/api/cli.preprocess.html | 30 + docs/api/cli.quantize.html | 30 + docs/api/cli.train.html | 30 + docs/api/cli.utils.args.html | 30 + docs/api/cli.utils.fetch.html | 30 + docs/api/cli.utils.html | 30 + docs/api/cli.utils.load.html | 30 + docs/api/cli.utils.sweeps.html | 30 + docs/api/cli.utils.train.html | 30 + docs/api/cli.vllm_serve.html | 30 + docs/api/common.architectures.html | 30 + docs/api/common.const.html | 30 + docs/api/common.datasets.html | 30 + docs/api/convert.html | 30 + docs/api/core.builders.base.html | 30 + docs/api/core.builders.causal.html | 30 + docs/api/core.builders.rl.html | 30 + docs/api/core.chat.format.chatml.html | 30 + docs/api/core.chat.format.llama3x.html | 30 + docs/api/core.chat.format.shared.html | 30 + docs/api/core.chat.messages.html | 30 + docs/api/core.datasets.chat.html | 30 + ...core.datasets.transforms.chat_builder.html | 30 + docs/api/core.trainers.base.html | 30 + docs/api/core.trainers.dpo.trainer.html | 30 + docs/api/core.trainers.grpo.sampler.html | 30 + docs/api/core.trainers.grpo.trainer.html | 30 + 
docs/api/core.trainers.mamba.html | 30 + docs/api/core.trainers.mixins.optimizer.html | 30 + ...core.trainers.mixins.rng_state_loader.html | 30 + docs/api/core.trainers.mixins.scheduler.html | 30 + docs/api/core.trainers.trl.html | 30 + docs/api/core.trainers.utils.html | 30 + docs/api/core.training_args.html | 30 + docs/api/datasets.html | 30 + docs/api/evaluate.html | 30 + docs/api/index.html | 30 + docs/api/integrations.base.html | 30 + .../integrations.cut_cross_entropy.args.html | 30 + docs/api/integrations.grokfast.optimizer.html | 30 + docs/api/integrations.kd.trainer.html | 30 + docs/api/integrations.liger.args.html | 30 + docs/api/integrations.lm_eval.args.html | 30 + docs/api/integrations.spectrum.args.html | 30 + docs/api/kernels.geglu.html | 30 + docs/api/kernels.lora.html | 30 + docs/api/kernels.quantize.html | 30 + docs/api/kernels.swiglu.html | 30 + docs/api/kernels.utils.html | 30 + docs/api/loaders.adapter.html | 30 + docs/api/loaders.constants.html | 30 + docs/api/loaders.model.html | 30 + docs/api/loaders.patch_manager.html | 30 + docs/api/loaders.processor.html | 30 + docs/api/loaders.tokenizer.html | 30 + docs/api/logging_config.html | 30 + docs/api/models.mamba.modeling_mamba.html | 30 + .../monkeypatch.btlm_attn_hijack_flash.html | 30 + ...onkeypatch.data.batch_dataset_fetcher.html | 30 + ...ch.gradient_checkpointing.offload_cpu.html | 30 + ...h.gradient_checkpointing.offload_disk.html | 30 + .../monkeypatch.llama_attn_hijack_flash.html | 30 + ...onkeypatch.llama_attn_hijack_xformers.html | 30 + docs/api/monkeypatch.lora_kernels.html | 30 + ...monkeypatch.mistral_attn_hijack_flash.html | 30 + docs/api/monkeypatch.mixtral.html | 30 + docs/api/monkeypatch.multipack.html | 30 + docs/api/monkeypatch.relora.html | 30 + ...onkeypatch.stablelm_attn_hijack_flash.html | 30 + docs/api/monkeypatch.trainer_fsdp_optim.html | 30 + .../monkeypatch.transformers_fa_utils.html | 30 + docs/api/monkeypatch.unsloth_.html | 30 + docs/api/monkeypatch.utils.html | 
30 + docs/api/prompt_strategies.alpaca_chat.html | 30 + .../prompt_strategies.alpaca_instruct.html | 30 + .../prompt_strategies.alpaca_w_system.html | 30 + docs/api/prompt_strategies.base.html | 30 + ...rompt_strategies.bradley_terry.llama3.html | 30 + docs/api/prompt_strategies.chat_template.html | 30 + docs/api/prompt_strategies.completion.html | 30 + .../prompt_strategies.dpo.chat_template.html | 30 + docs/api/prompt_strategies.dpo.chatml.html | 30 + docs/api/prompt_strategies.dpo.llama3.html | 30 + .../prompt_strategies.dpo.passthrough.html | 30 + .../prompt_strategies.dpo.user_defined.html | 30 + docs/api/prompt_strategies.dpo.zephyr.html | 30 + docs/api/prompt_strategies.input_output.html | 30 + docs/api/prompt_strategies.kto.chatml.html | 30 + docs/api/prompt_strategies.kto.llama3.html | 30 + .../prompt_strategies.kto.user_defined.html | 30 + docs/api/prompt_strategies.llama2_chat.html | 30 + docs/api/prompt_strategies.messages.chat.html | 30 + docs/api/prompt_strategies.metharme.html | 30 + docs/api/prompt_strategies.orcamini.html | 30 + .../prompt_strategies.orpo.chat_template.html | 30 + docs/api/prompt_strategies.pygmalion.html | 30 + ...prompt_strategies.stepwise_supervised.html | 30 + docs/api/prompt_strategies.user_defined.html | 30 + docs/api/prompt_tokenizers.html | 30 + docs/api/train.html | 30 + docs/api/utils.bench.html | 30 + docs/api/utils.callbacks.comet_.html | 30 + docs/api/utils.callbacks.lisa.html | 30 + docs/api/utils.callbacks.mlflow_.html | 30 + docs/api/utils.callbacks.perplexity.html | 30 + docs/api/utils.callbacks.profiler.html | 30 + docs/api/utils.callbacks.qat.html | 30 + docs/api/utils.chat_templates.html | 30 + docs/api/utils.collators.batching.html | 30 + docs/api/utils.collators.core.html | 30 + docs/api/utils.collators.mamba.html | 30 + docs/api/utils.collators.mm_chat.html | 30 + .../utils.ctx_managers.sequence_parallel.html | 30 + docs/api/utils.data.sft.html | 30 + docs/api/utils.data.streaming.html | 30 + 
docs/api/utils.dict.html | 30 + docs/api/utils.distributed.html | 30 + docs/api/utils.freeze.html | 30 + docs/api/utils.lora.html | 30 + docs/api/utils.model_shard_quant.html | 30 + docs/api/utils.optimizers.adopt.html | 30 + docs/api/utils.quantization.html | 30 + docs/api/utils.samplers.multipack.html | 30 + docs/api/utils.schedulers.html | 30 + docs/api/utils.schemas.config.html | 30 + docs/api/utils.schemas.datasets.html | 30 + docs/api/utils.schemas.enums.html | 30 + docs/api/utils.schemas.integrations.html | 30 + docs/api/utils.schemas.model.html | 30 + docs/api/utils.schemas.multimodal.html | 30 + docs/api/utils.schemas.peft.html | 30 + docs/api/utils.schemas.training.html | 30 + docs/api/utils.schemas.trl.html | 30 + docs/api/utils.schemas.utils.html | 30 + docs/api/utils.tokenization.html | 30 + docs/api/utils.trainer.html | 30 + docs/attention.html | 30 + docs/batch_vs_grad.html | 30 + docs/checkpoint_saving.html | 30 + docs/choosing_method.html | 1725 +++++++++++++ docs/cli.html | 30 + docs/config-reference.html | 30 + docs/custom_integrations.html | 30 + docs/dataset-formats/conversation.html | 30 + docs/dataset-formats/index.html | 389 +-- docs/dataset-formats/inst_tune.html | 30 + docs/dataset-formats/pretraining.html | 95 +- docs/dataset-formats/stepwise_supervised.html | 30 + docs/dataset-formats/template_free.html | 30 + docs/dataset-formats/tokenized.html | 30 + docs/dataset_loading.html | 30 + docs/dataset_preprocessing.html | 30 + docs/debugging.html | 47 +- docs/docker.html | 30 + docs/ebft.html | 2269 ++++++++++++++++ docs/expert_quantization.html | 30 + docs/faq.html | 30 + docs/fsdp_qlora.html | 30 + docs/getting-started.html | 60 +- docs/gradient_checkpointing.html | 30 + docs/grpo.html | 2296 +++++++++++++++++ docs/inference.html | 30 + docs/input_output.html | 30 + docs/installation.html | 30 + docs/lora_optims.html | 30 + docs/lr_groups.html | 30 + docs/mac.html | 30 + docs/mixed_precision.html | 30 + docs/models/LiquidAI.html | 30 + 
docs/models/apertus.html | 30 + docs/models/arcee.html | 30 + docs/models/devstral.html | 30 + docs/models/gemma3n.html | 30 + docs/models/gpt-oss.html | 30 + docs/models/granite4.html | 30 + docs/models/hunyuan.html | 30 + docs/models/index.html | 30 + docs/models/internvl3_5.html | 30 + docs/models/jamba.html | 30 + docs/models/kimi-linear.html | 30 + docs/models/llama-2.html | 30 + docs/models/llama-4.html | 30 + docs/models/magistral.html | 30 + docs/models/magistral/think.html | 30 + docs/models/magistral/vision.html | 30 + docs/models/mimo.html | 30 + docs/models/ministral.html | 30 + docs/models/ministral3.html | 30 + docs/models/ministral3/think.html | 30 + docs/models/ministral3/vision.html | 30 + docs/models/mistral-small.html | 30 + docs/models/mistral.html | 30 + docs/models/olmo3.html | 30 + docs/models/orpheus.html | 30 + docs/models/phi.html | 30 + docs/models/plano.html | 30 + docs/models/qwen3-next.html | 30 + docs/models/qwen3.html | 30 + docs/models/seed-oss.html | 30 + docs/models/smolvlm2.html | 30 + docs/models/trinity.html | 30 + docs/models/voxtral.html | 30 + docs/multi-gpu.html | 30 + docs/multi-node.html | 30 + docs/multimodal.html | 30 + docs/multipack.html | 30 + docs/nccl.html | 30 + docs/nd_parallelism.html | 30 + docs/optimizations.html | 30 + docs/optimizers.html | 30 + docs/qat.html | 30 + docs/quantize.html | 30 + docs/ray-integration.html | 30 + docs/reward_modelling.html | 30 + docs/rlhf.html | 52 +- docs/sequence_parallelism.html | 30 + docs/streaming.html | 30 + docs/telemetry.html | 30 + docs/torchao.html | 30 + docs/training_stability.html | 1838 +++++++++++++ docs/unsloth.html | 30 + docs/vllm_serving.html | 1825 +++++++++++++ .../colab-axolotl-example.html | 30 + index.html | 30 + search.json | 1301 ++++++++-- sitemap.xml | 948 +++---- src/axolotl/integrations/LICENSE.html | 30 + .../cut_cross_entropy/ACKNOWLEDGEMENTS.html | 30 + 248 files changed, 25536 insertions(+), 1000 deletions(-) create mode 100644 
docs/agents/grpo.html
 create mode 100644 docs/agents/preference_tuning.html
 create mode 100644 docs/agents/pretraining.html
 create mode 100644 docs/agents/reward_modelling.html
 create mode 100644 docs/agents/sft.html
 create mode 100644 docs/choosing_method.html
 create mode 100644 docs/ebft.html
 create mode 100644 docs/grpo.html
 create mode 100644 docs/training_stability.html
 create mode 100644 docs/vllm_serving.html

diff --git a/.nojekyll b/.nojekyll
index 04adbec1a..c2a64ff12 100644
--- a/.nojekyll
+++ b/.nojekyll
@@ -1 +1 @@
-f05ef313
\ No newline at end of file
+17703de0
\ No newline at end of file
diff --git a/FAQS.html b/FAQS.html
index d81059f09..82918ae5b 100644
--- a/FAQS.html
+++ b/FAQS.html
@@ -141,6 +141,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
 Quickstart
  • Instruction Dataset +

    For help choosing between these methods, see Choosing a Fine-Tuning Method.

    RLHF using Axolotl

    @@ -1310,7 +1341,7 @@ Tip
    -Check out our GRPO cookbook.
    +Check out our GRPO cookbook. For a comprehensive guide covering async training, custom rewards, importance sampling, and scaling, see the GRPO deep dive.

    In the latest GRPO implementation, vLLM is used to significantly speed up trajectory generation during training. In this example, we’re using 4 GPUs: 2 for training and 2 for vLLM:

    @@ -1683,7 +1714,7 @@ Note
     CUDA_VISIBLE_DEVICES=0 axolotl vllm-serve config.yaml

     # Terminal 2: Train on GPUs 0,1
    -CUDA_VISIBLE_DEVICES=0,1 accelerate launch --num_processes 2 -m axolotl.cli.train config.yaml
    +CUDA_VISIBLE_DEVICES=0,1 axolotl train config.yaml
    @@ -1823,6 +1854,19 @@ Tip

    EBFT

    +Tip
    +For a detailed guide on EBFT modes, feature extraction, and configuration, see the EBFT guide.

    EBFT (Energy-Based Fine-Tuning) fine-tunes language models by optimizing a feature-matching loss rather than relying on external reward functions. A frozen copy of the model extracts embeddings from both generated and ground-truth completions, and the generator is updated via REINFORCE to match the ground-truth feature moments.

    Paper: “Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models” (Jelassi et al., 2026)
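The feature-matching idea described above can be illustrated with a toy sketch. This is not Axolotl's implementation or the paper's exact objective: it assumes only the first feature moment (the mean) is matched, uses a mean-reward baseline, and all names (`mean_vector`, `feature_matching_rewards`, `reinforce_weights`) and the embedding values are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Element-wise mean of equally sized vectors (the first feature moment)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def feature_matching_rewards(
    generated: List[Vector], ground_truth: List[Vector]
) -> List[float]:
    """Reward each generated sample by the negative squared distance between
    its features and the mean ground-truth feature (assumed objective)."""
    target = mean_vector(ground_truth)
    return [
        -sum((g - t) ** 2 for g, t in zip(features, target))
        for features in generated
    ]


def reinforce_weights(rewards: List[float]) -> List[float]:
    """Subtract the mean reward as a baseline. In REINFORCE, each weight would
    scale the gradient of that sample's log-probability."""
    baseline = sum(rewards) / len(rewards)
    return [reward - baseline for reward in rewards]


# Hypothetical embeddings, standing in for the frozen feature extractor's output.
ground_truth = [[1.0, 0.0], [1.0, 2.0]]
generated = [[1.0, 1.0], [3.0, 1.0]]

rewards = feature_matching_rewards(generated, ground_truth)
weights = reinforce_weights(rewards)
```

Samples whose features sit closer to the ground-truth mean receive a positive weight and the others a negative one, so no external reward function is involved in this sketch.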

    Key advantages:

    diff --git a/docs/sequence_parallelism.html b/docs/sequence_parallelism.html
    index e998eea73..2a68acebc 100644
    --- a/docs/sequence_parallelism.html
    +++ b/docs/sequence_parallelism.html
    @@ -177,6 +177,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
     Quickstart