diff --git a/.nojekyll b/.nojekyll
index 04adbec1a..c2a64ff12 100644
--- a/.nojekyll
+++ b/.nojekyll
@@ -1 +1 @@
-f05ef313
\ No newline at end of file
+17703de0
\ No newline at end of file
diff --git a/FAQS.html b/FAQS.html
index d81059f09..82918ae5b 100644
--- a/FAQS.html
+++ b/FAQS.html
@@ -141,6 +141,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
Quickstart
  • Instruction Dataset

    For help choosing between these methods, see Choosing a Fine-Tuning Method.

    RLHF using Axolotl

    @@ -1310,7 +1341,7 @@ Tip
    -Check out our GRPO cookbook.
    +Check out our GRPO cookbook. For a comprehensive guide covering async training, custom rewards, importance sampling, and scaling, see the GRPO deep dive.

    In the latest GRPO implementation, vLLM is used to significantly speed up trajectory generation during training. This example uses 4 GPUs, 2 for training and 2 for vLLM:

    @@ -1683,7 +1714,7 @@ Note
    CUDA_VISIBLE_DEVICES=0 axolotl vllm-serve config.yaml
    # Terminal 2: Train on GPUs 0,1
    -CUDA_VISIBLE_DEVICES=0,1 accelerate launch --num_processes 2 -m axolotl.cli.train config.yaml
    +CUDA_VISIBLE_DEVICES=0,1 axolotl train config.yaml
    @@ -1823,6 +1854,19 @@ Tip

    EBFT

    +Tip
    +
    +For a detailed guide on EBFT modes, feature extraction, and configuration, see the EBFT guide.
    +

    EBFT (Energy-Based Fine-Tuning) fine-tunes language models by optimizing a feature-matching loss rather than relying on external reward functions. A frozen copy of the model extracts embeddings from both generated and ground-truth completions, and the generator is updated via REINFORCE to match the ground-truth feature moments.

    Paper: “Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models” (Jelassi et al., 2026)

    Key advantages:

    diff --git a/docs/sequence_parallelism.html b/docs/sequence_parallelism.html
    index e998eea73..2a68acebc 100644
    --- a/docs/sequence_parallelism.html
    +++ b/docs/sequence_parallelism.html
    @@ -177,6 +177,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
    Quickstart