Run a callable ‘fn’ on all ranks and gather the results on the specified rank.
Args:
- fn (callable): A function that computes the value. This should not have any side effects.
@@ -544,7 +555,7 @@ The value is then broadcasted to all other ranks.
Run a callable ‘fn’ on all ranks and gather the results on the specified rank.
Args:
- fn (callable): A function that computes the value. This should not have any side effects.
@@ -555,18 +566,18 @@ The value is then broadcasted to all other ranks.
is_distributed
-utils.distributed.is_distributed()
+utils.distributed.is_distributed()
Check if distributed training is initialized.
is_main_process
-utils.distributed.is_main_process()
-Check if the current process is the main process.
-If not in distributed mode, always return True.
+utils.distributed.is_main_process()
+Check if the current process is the main process. If not in distributed mode,
+always return True.
reduce_and_broadcast
-utils.distributed.reduce_and_broadcast(fn1, fn2)
+utils.distributed.reduce_and_broadcast(fn1, fn2)
Run a callable ‘fn1’ on all ranks, gather the results, reduce them using ‘fn2’,
and then broadcast the reduced result to all ranks.
Args:
@@ -578,12 +589,12 @@ and then broadcast the reduced result to all ranks.
zero_first
-utils.distributed.zero_first(is_main)
+utils.distributed.zero_first(is_main)
runs the wrapped context so that rank 0 runs first before other ranks
zero_only
-utils.distributed.zero_only()
+utils.distributed.zero_only()
Context manager that only runs the enclosed block on the main rank.
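The helpers touched above are documented only by their one-line summaries. As a rough illustration of the documented contracts, here is a single-process sketch in plain Python. Everything below is an assumption for illustration, not the real implementation: the actual Axolotl helpers delegate to torch.distributed, and `WORLD_SIZE`/`RANK` are stand-ins for the process-group state.

```python
from contextlib import contextmanager

# Stand-ins for torch.distributed process-group state (assumptions only;
# the real helpers query torch.distributed directly).
WORLD_SIZE = 1
RANK = 0

def is_distributed() -> bool:
    # Real helper checks torch.distributed.is_available() and is_initialized().
    return WORLD_SIZE > 1

def is_main_process() -> bool:
    # Documented contract: if not in distributed mode, always return True.
    if not is_distributed():
        return True
    return RANK == 0

def gather_from_all_ranks(fn, world_size=1):
    # Single-process sketch: "gathering" is just calling fn once per rank.
    return [fn() for _ in range(world_size)]

def reduce_and_broadcast(fn1, fn2):
    # Run fn1 on all ranks, gather the results, reduce with fn2;
    # every rank then sees the reduced value.
    values = gather_from_all_ranks(fn1, world_size=WORLD_SIZE)
    return fn2(values)

@contextmanager
def zero_only():
    # Yields True on the main rank so callers can guard the enclosed block.
    yield is_main_process()
```

For example, `reduce_and_broadcast(lambda: loss, lambda xs: sum(xs) / len(xs))` would average a per-rank scalar; with `WORLD_SIZE = 1` it simply returns the local value.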
diff --git a/search.json b/search.json
index af982f7a4..54db2e868 100644
--- a/search.json
+++ b/search.json
@@ -922,14 +922,14 @@
"href": "docs/api/utils.distributed.html",
"title": "utils.distributed",
"section": "",
- "text": "utils.distributed\nutility helpers for distributed checks\n\n\n\n\n\nName\nDescription\n\n\n\n\nbarrier\nActs as a barrier to wait for all processes. This ensures that all processes\n\n\ncompute_and_broadcast\nCompute a value using the function ‘fn’ only on the specified rank (default is 0).\n\n\ngather_from_all_ranks\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\n\n\ngather_scalar_from_all_ranks\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\n\n\nis_distributed\nCheck if distributed training is initialized.\n\n\nis_main_process\nCheck if the current process is the main process.\n\n\nreduce_and_broadcast\nRun a callable ‘fn1’ on all ranks, gather the results, reduce them using ‘fn2’,\n\n\nzero_first\nruns the wrapped context so that rank 0 runs first before other ranks\n\n\nzero_only\nContext manager that only runs the enclosed block on the main rank.\n\n\n\n\n\nutils.distributed.barrier()\nActs as a barrier to wait for all processes. This ensures that all processes\nreach the barrier before proceeding further.\n\n\n\nutils.distributed.compute_and_broadcast(fn)\nCompute a value using the function ‘fn’ only on the specified rank (default is 0).\nThe value is then broadcasted to all other ranks.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that computes the value. Default is 0.\nReturns:\n- The computed value (int or float).\n\n\n\nutils.distributed.gather_from_all_ranks(fn, world_size=1)\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that gathers the values. 
Default is 0.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- A list of computed values from all ranks if on the gathering rank, otherwise None.\n\n\n\nutils.distributed.gather_scalar_from_all_ranks(fn, world_size=1)\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that gathers the values. Default is 0.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- A list of computed values from all ranks if on the gathering rank, otherwise None.\n\n\n\nutils.distributed.is_distributed()\nCheck if distributed training is initialized.\n\n\n\nutils.distributed.is_main_process()\nCheck if the current process is the main process.\nIf not in distributed mode, always return True.\n\n\n\nutils.distributed.reduce_and_broadcast(fn1, fn2)\nRun a callable ‘fn1’ on all ranks, gather the results, reduce them using ‘fn2’,\nand then broadcast the reduced result to all ranks.\nArgs:\n- fn1 (callable): A function that computes the value on each rank.\n- fn2 (callable): A reduction function that takes a list of values and returns a single value.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- The reduced and broadcasted value.\n\n\n\nutils.distributed.zero_first(is_main)\nruns the wrapped context so that rank 0 runs first before other ranks\n\n\n\nutils.distributed.zero_only()\nContext manager that only runs the enclosed block on the main rank."
+ "text": "utils.distributed\nutility helpers for distributed checks\n\n\n\n\n\nName\nDescription\n\n\n\n\nbarrier\nActs as a barrier to wait for all processes. This ensures that all processes\n\n\ncleanup_distributed\nDestroy process group if torch distributed is initialized. Called in training early\n\n\ncompute_and_broadcast\nCompute a value using the function ‘fn’ only on the specified rank (default is 0).\n\n\ngather_from_all_ranks\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\n\n\ngather_scalar_from_all_ranks\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\n\n\nis_distributed\nCheck if distributed training is initialized.\n\n\nis_main_process\nCheck if the current process is the main process. If not in distributed mode,\n\n\nreduce_and_broadcast\nRun a callable ‘fn1’ on all ranks, gather the results, reduce them using ‘fn2’,\n\n\nzero_first\nruns the wrapped context so that rank 0 runs first before other ranks\n\n\nzero_only\nContext manager that only runs the enclosed block on the main rank.\n\n\n\n\n\nutils.distributed.barrier()\nActs as a barrier to wait for all processes. This ensures that all processes\nreach the barrier before proceeding further.\n\n\n\nutils.distributed.cleanup_distributed()\nDestroy process group if torch distributed is initialized. Called in training early\ntermination or when training successfully completes.\n\n\n\nutils.distributed.compute_and_broadcast(fn)\nCompute a value using the function ‘fn’ only on the specified rank (default is 0).\nThe value is then broadcasted to all other ranks.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that computes the value. 
Default is 0.\nReturns:\n- The computed value (int or float).\n\n\n\nutils.distributed.gather_from_all_ranks(fn, world_size=1)\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that gathers the values. Default is 0.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- A list of computed values from all ranks if on the gathering rank, otherwise None.\n\n\n\nutils.distributed.gather_scalar_from_all_ranks(fn, world_size=1)\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that gathers the values. Default is 0.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- A list of computed values from all ranks if on the gathering rank, otherwise None.\n\n\n\nutils.distributed.is_distributed()\nCheck if distributed training is initialized.\n\n\n\nutils.distributed.is_main_process()\nCheck if the current process is the main process. 
If not in distributed mode,\nalways return True.\n\n\n\nutils.distributed.reduce_and_broadcast(fn1, fn2)\nRun a callable ‘fn1’ on all ranks, gather the results, reduce them using ‘fn2’,\nand then broadcast the reduced result to all ranks.\nArgs:\n- fn1 (callable): A function that computes the value on each rank.\n- fn2 (callable): A reduction function that takes a list of values and returns a single value.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- The reduced and broadcasted value.\n\n\n\nutils.distributed.zero_first(is_main)\nruns the wrapped context so that rank 0 runs first before other ranks\n\n\n\nutils.distributed.zero_only()\nContext manager that only runs the enclosed block on the main rank."
},
{
"objectID": "docs/api/utils.distributed.html#functions",
"href": "docs/api/utils.distributed.html#functions",
"title": "utils.distributed",
"section": "",
- "text": "Name\nDescription\n\n\n\n\nbarrier\nActs as a barrier to wait for all processes. This ensures that all processes\n\n\ncompute_and_broadcast\nCompute a value using the function ‘fn’ only on the specified rank (default is 0).\n\n\ngather_from_all_ranks\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\n\n\ngather_scalar_from_all_ranks\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\n\n\nis_distributed\nCheck if distributed training is initialized.\n\n\nis_main_process\nCheck if the current process is the main process.\n\n\nreduce_and_broadcast\nRun a callable ‘fn1’ on all ranks, gather the results, reduce them using ‘fn2’,\n\n\nzero_first\nruns the wrapped context so that rank 0 runs first before other ranks\n\n\nzero_only\nContext manager that only runs the enclosed block on the main rank.\n\n\n\n\n\nutils.distributed.barrier()\nActs as a barrier to wait for all processes. This ensures that all processes\nreach the barrier before proceeding further.\n\n\n\nutils.distributed.compute_and_broadcast(fn)\nCompute a value using the function ‘fn’ only on the specified rank (default is 0).\nThe value is then broadcasted to all other ranks.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that computes the value. Default is 0.\nReturns:\n- The computed value (int or float).\n\n\n\nutils.distributed.gather_from_all_ranks(fn, world_size=1)\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that gathers the values. 
Default is 0.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- A list of computed values from all ranks if on the gathering rank, otherwise None.\n\n\n\nutils.distributed.gather_scalar_from_all_ranks(fn, world_size=1)\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that gathers the values. Default is 0.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- A list of computed values from all ranks if on the gathering rank, otherwise None.\n\n\n\nutils.distributed.is_distributed()\nCheck if distributed training is initialized.\n\n\n\nutils.distributed.is_main_process()\nCheck if the current process is the main process.\nIf not in distributed mode, always return True.\n\n\n\nutils.distributed.reduce_and_broadcast(fn1, fn2)\nRun a callable ‘fn1’ on all ranks, gather the results, reduce them using ‘fn2’,\nand then broadcast the reduced result to all ranks.\nArgs:\n- fn1 (callable): A function that computes the value on each rank.\n- fn2 (callable): A reduction function that takes a list of values and returns a single value.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- The reduced and broadcasted value.\n\n\n\nutils.distributed.zero_first(is_main)\nruns the wrapped context so that rank 0 runs first before other ranks\n\n\n\nutils.distributed.zero_only()\nContext manager that only runs the enclosed block on the main rank."
+ "text": "Name\nDescription\n\n\n\n\nbarrier\nActs as a barrier to wait for all processes. This ensures that all processes\n\n\ncleanup_distributed\nDestroy process group if torch distributed is initialized. Called in training early\n\n\ncompute_and_broadcast\nCompute a value using the function ‘fn’ only on the specified rank (default is 0).\n\n\ngather_from_all_ranks\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\n\n\ngather_scalar_from_all_ranks\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\n\n\nis_distributed\nCheck if distributed training is initialized.\n\n\nis_main_process\nCheck if the current process is the main process. If not in distributed mode,\n\n\nreduce_and_broadcast\nRun a callable ‘fn1’ on all ranks, gather the results, reduce them using ‘fn2’,\n\n\nzero_first\nruns the wrapped context so that rank 0 runs first before other ranks\n\n\nzero_only\nContext manager that only runs the enclosed block on the main rank.\n\n\n\n\n\nutils.distributed.barrier()\nActs as a barrier to wait for all processes. This ensures that all processes\nreach the barrier before proceeding further.\n\n\n\nutils.distributed.cleanup_distributed()\nDestroy process group if torch distributed is initialized. Called in training early\ntermination or when training successfully completes.\n\n\n\nutils.distributed.compute_and_broadcast(fn)\nCompute a value using the function ‘fn’ only on the specified rank (default is 0).\nThe value is then broadcasted to all other ranks.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that computes the value. Default is 0.\nReturns:\n- The computed value (int or float).\n\n\n\nutils.distributed.gather_from_all_ranks(fn, world_size=1)\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\nArgs:\n- fn (callable): A function that computes the value. 
This should not have any side effects.\n- rank (int, optional): The rank that gathers the values. Default is 0.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- A list of computed values from all ranks if on the gathering rank, otherwise None.\n\n\n\nutils.distributed.gather_scalar_from_all_ranks(fn, world_size=1)\nRun a callable ‘fn’ on all ranks and gather the results on the specified rank.\nArgs:\n- fn (callable): A function that computes the value. This should not have any side effects.\n- rank (int, optional): The rank that gathers the values. Default is 0.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- A list of computed values from all ranks if on the gathering rank, otherwise None.\n\n\n\nutils.distributed.is_distributed()\nCheck if distributed training is initialized.\n\n\n\nutils.distributed.is_main_process()\nCheck if the current process is the main process. If not in distributed mode,\nalways return True.\n\n\n\nutils.distributed.reduce_and_broadcast(fn1, fn2)\nRun a callable ‘fn1’ on all ranks, gather the results, reduce them using ‘fn2’,\nand then broadcast the reduced result to all ranks.\nArgs:\n- fn1 (callable): A function that computes the value on each rank.\n- fn2 (callable): A reduction function that takes a list of values and returns a single value.\n- world_size (int, optional): Total number of processes in the current distributed setup.\nReturns:\n- The reduced and broadcasted value.\n\n\n\nutils.distributed.zero_first(is_main)\nruns the wrapped context so that rank 0 runs first before other ranks\n\n\n\nutils.distributed.zero_only()\nContext manager that only runs the enclosed block on the main rank."
},
{
"objectID": "docs/api/monkeypatch.utils.html",
diff --git a/sitemap.xml b/sitemap.xml
index f34779672..0b7b85ad2 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,674 +2,674 @@
https://axolotl-ai-cloud.github.io/axolotl/examples/colab-notebooks/colab-axolotl-example.html
- 2025-03-31T13:13:55.601Z
+ 2025-03-31T16:37:01.754Z
https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/stepwise_supervised.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Z
https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/template_free.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Z
https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/tokenized.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Z
https://axolotl-ai-cloud.github.io/axolotl/docs/nccl.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/amd_hpc.html
- 2025-03-31T13:13:55.596Z
+ 2025-03-31T16:37:01.749Z
https://axolotl-ai-cloud.github.io/axolotl/docs/config.html
- 2025-03-31T13:13:55.596Z
+ 2025-03-31T16:37:01.749Z
https://axolotl-ai-cloud.github.io/axolotl/docs/multi-gpu.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/installation.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/torchao.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/reward_modelling.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/input_output.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/multimodal.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.callbacks.mlflow_.html
- 2025-03-31T13:14:44.106Z
+ 2025-03-31T16:37:32.399Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.trainer_fsdp_optim.html
- 2025-03-31T13:14:43.710Z
+ 2025-03-31T16:37:31.978Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.data.batch_dataset_fetcher.html
- 2025-03-31T13:14:43.726Z
+ 2025-03-31T16:37:31.994Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.stepwise_supervised.html
- 2025-03-31T13:14:43.422Z
+ 2025-03-31T16:37:31.684Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.mistral_attn_hijack_flash.html
- 2025-03-31T13:14:43.660Z
+ 2025-03-31T16:37:31.926Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.dpo.user_defined.html
- 2025-03-31T13:14:43.468Z
+ 2025-03-31T16:37:31.731Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/integrations.liger.args.html
- 2025-03-31T13:14:44.025Z
+ 2025-03-31T16:37:32.317Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schemas.training.html
- 2025-03-31T13:14:43.891Z
+ 2025-03-31T16:37:32.180Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/datasets.html
- 2025-03-31T13:14:42.935Z
+ 2025-03-31T16:37:31.185Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/kernels.geglu.html
- 2025-03-31T13:14:43.601Z
+ 2025-03-31T16:37:31.866Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.llama_attn_hijack_flash.html
- 2025-03-31T13:14:43.644Z
+ 2025-03-31T16:37:31.910Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.sweeps.html
- 2025-03-31T13:14:43.259Z
+ 2025-03-31T16:37:31.518Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.freeze.html
- 2025-03-31T13:14:43.796Z
+ 2025-03-31T16:37:32.066Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.multipack.html
- 2025-03-31T13:14:43.661Z
+ 2025-03-31T16:37:31.928Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.main.html
- 2025-03-31T13:14:43.160Z
+ 2025-03-31T16:37:31.416Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/core.trainers.trl.html
- 2025-03-31T13:14:43.333Z
+ 2025-03-31T16:37:31.593Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.dpo.passthrough.html
- 2025-03-31T13:14:43.470Z
+ 2025-03-31T16:37:31.732Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/core.chat.format.llama3x.html
- 2025-03-31T13:14:43.116Z
+ 2025-03-31T16:37:31.371Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/core.datasets.transforms.chat_builder.html
- 2025-03-31T13:14:43.130Z
+ 2025-03-31T16:37:31.385Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.kto.user_defined.html
- 2025-03-31T13:14:43.487Z
+ 2025-03-31T16:37:31.749Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.collators.mamba.html
- 2025-03-31T13:14:44.081Z
+ 2025-03-31T16:37:32.374Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/integrations.base.html
- 2025-03-31T13:14:44.010Z
+ 2025-03-31T16:37:32.302Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.bench.html
- 2025-03-31T13:14:43.788Z
+ 2025-03-31T16:37:32.058Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/kernels.swiglu.html
- 2025-03-31T13:14:43.611Z
+ 2025-03-31T16:37:31.876Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/core.chat.format.shared.html
- 2025-03-31T13:14:43.118Z
+ 2025-03-31T16:37:31.372Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/integrations.cut_cross_entropy.args.html
- 2025-03-31T13:14:44.013Z
+ 2025-03-31T16:37:32.305Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/core.datasets.chat.html
- 2025-03-31T13:14:43.123Z
+ 2025-03-31T16:37:31.377Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.callbacks.lisa.html
- 2025-03-31T13:14:44.102Z
+ 2025-03-31T16:37:32.395Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/integrations.grokfast.optimizer.html
- 2025-03-31T13:14:44.014Z
+ 2025-03-31T16:37:32.306Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.alpaca_chat.html
- 2025-03-31T13:14:43.372Z
+ 2025-03-31T16:37:31.634Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.alpaca_instruct.html
- 2025-03-31T13:14:43.374Z
+ 2025-03-31T16:37:31.635Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.kto.chatml.html
- 2025-03-31T13:14:43.486Z
+ 2025-03-31T16:37:31.748Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schemas.integrations.html
- 2025-03-31T13:14:43.936Z
+ 2025-03-31T16:37:32.227Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schemas.trl.html
- 2025-03-31T13:14:43.919Z
+ 2025-03-31T16:37:32.210Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_tokenizers.html
- 2025-03-31T13:14:42.989Z
+ 2025-03-31T16:37:31.240Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.data.sft.html
- 2025-03-31T13:14:43.868Z
+ 2025-03-31T16:37:32.150Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schedulers.html
- 2025-03-31T13:14:43.836Z
+ 2025-03-31T16:37:32.108Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.chat_templates.html
- 2025-03-31T13:14:43.772Z
+ 2025-03-31T16:37:32.041Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.models.html
- 2025-03-31T13:14:43.756Z
+ 2025-03-31T16:37:32.025Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.dpo.chatml.html
- 2025-03-31T13:14:43.465Z
+ 2025-03-31T16:37:31.728Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.distributed.html
- 2025-03-31T13:14:43.855Z
+ 2025-03-31T16:37:32.129Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.utils.html
- 2025-03-31T13:14:43.699Z
+ 2025-03-31T16:37:31.966Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schemas.utils.html
- 2025-03-31T13:14:43.948Z
+ 2025-03-31T16:37:32.239Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.llama_expand_mask.html
- 2025-03-31T13:14:43.669Z
+ 2025-03-31T16:37:31.936Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/common.datasets.html
- 2025-03-31T13:14:44.050Z
+ 2025-03-31T16:37:32.342Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/logging_config.html
- 2025-03-31T13:14:42.994Z
+ 2025-03-31T16:37:31.245Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/kernels.quantize.html
- 2025-03-31T13:14:43.618Z
+ 2025-03-31T16:37:31.883Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.llama_patch_multipack.html
- 2025-03-31T13:14:43.702Z
+ 2025-03-31T16:37:31.969Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.callbacks.comet_.html
- 2025-03-31T13:14:44.109Z
+ 2025-03-31T16:37:32.403Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.trainer.html
- 2025-03-31T13:14:43.813Z
+ 2025-03-31T16:37:32.083Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/common.architectures.html
- 2025-03-31T13:14:44.033Z
+ 2025-03-31T16:37:32.325Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/models.mamba.modeling_mamba.html
- 2025-03-31T13:14:44.051Z
+ 2025-03-31T16:37:32.343Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/integrations.spectrum.args.html
- 2025-03-31T13:14:44.031Z
+ 2025-03-31T16:37:32.323Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.merge_sharded_fsdp_weights.html
- 2025-03-31T13:14:43.245Z
+ 2025-03-31T16:37:31.504Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.bradley_terry.llama3.html
- 2025-03-31T13:14:43.511Z
+ 2025-03-31T16:37:31.773Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.merge_lora.html
- 2025-03-31T13:14:43.234Z
+ 2025-03-31T16:37:31.492Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.lora.html
- 2025-03-31T13:14:43.777Z
+ 2025-03-31T16:37:32.046Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.relora.html
- 2025-03-31T13:14:43.668Z
+ 2025-03-31T16:37:31.935Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.cloud.base.html
- 2025-03-31T13:14:43.293Z
+ 2025-03-31T16:37:31.553Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/common.const.html
- 2025-03-31T13:14:44.034Z
+ 2025-03-31T16:37:32.326Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/convert.html
- 2025-03-31T13:14:42.948Z
+ 2025-03-31T16:37:31.198Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.chat_template.html
- 2025-03-31T13:14:43.359Z
+ 2025-03-31T16:37:31.620Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/kernels.utils.html
- 2025-03-31T13:14:43.619Z
+ 2025-03-31T16:37:31.885Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.lora_embeddings.html
- 2025-03-31T13:14:43.780Z
+ 2025-03-31T16:37:32.049Z
https://axolotl-ai-cloud.github.io/axolotl/docs/lora_optims.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/batch_vs_grad.html
- 2025-03-31T13:13:55.596Z
+ 2025-03-31T16:37:01.749Z
https://axolotl-ai-cloud.github.io/axolotl/docs/faq.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Z
https://axolotl-ai-cloud.github.io/axolotl/docs/debugging.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Z
https://axolotl-ai-cloud.github.io/axolotl/docs/lr_groups.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/TODO.html
- 2025-03-31T13:13:55.595Z
+ 2025-03-31T16:37:01.748Z
https://axolotl-ai-cloud.github.io/axolotl/src/axolotl/integrations/LICENSE.html
- 2025-03-31T13:13:55.616Z
+ 2025-03-31T16:37:01.769Z
https://axolotl-ai-cloud.github.io/axolotl/index.html
- 2025-03-31T13:13:55.612Z
+ 2025-03-31T16:37:01.765Z
https://axolotl-ai-cloud.github.io/axolotl/src/axolotl/integrations/cut_cross_entropy/ACKNOWLEDGEMENTS.html
- 2025-03-31T13:13:55.616Z
+ 2025-03-31T16:37:01.769Z
https://axolotl-ai-cloud.github.io/axolotl/FAQS.html
- 2025-03-31T13:13:55.595Z
+ 2025-03-31T16:37:01.748Z
https://axolotl-ai-cloud.github.io/axolotl/docs/multi-node.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/sequence_parallelism.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/multipack.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/inference.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Z
https://axolotl-ai-cloud.github.io/axolotl/docs/getting-started.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.callbacks.perplexity.html
- 2025-03-31T13:14:44.097Z
+ 2025-03-31T16:37:32.390Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/core.trainer_builder.html
- 2025-03-31T13:14:43.009Z
+ 2025-03-31T16:37:31.260Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.train.html
- 2025-03-31T13:14:43.168Z
+ 2025-03-31T16:37:31.425Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.dpo.llama3.html
- 2025-03-31T13:14:43.455Z
+ 2025-03-31T16:37:31.718Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.cloud.modal_.html
- 2025-03-31T13:14:43.299Z
+ 2025-03-31T16:37:31.559Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/index.html
- 2025-03-31T13:14:42.857Z
+ 2025-03-31T16:37:31.106Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.input_output.html
- 2025-03-31T13:14:43.418Z
+ 2025-03-31T16:37:31.680Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.optimizers.adopt.html
- 2025-03-31T13:14:43.865Z
+ 2025-03-31T16:37:32.145Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.btlm_attn_hijack_flash.html
- 2025-03-31T13:14:43.700Z
+ 2025-03-31T16:37:31.968Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.collators.core.html
- 2025-03-31T13:14:44.052Z
+ 2025-03-31T16:37:32.345Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schemas.datasets.html
- 2025-03-31T13:14:43.908Z
+ 2025-03-31T16:37:32.198Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/integrations.kd.trainer.html
- 2025-03-31T13:14:44.022Z
+ 2025-03-31T16:37:32.313Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.tokenization.html
- 2025-03-31T13:14:43.762Z
+ 2025-03-31T16:37:32.031Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.mixtral.html
- 2025-03-31T13:14:43.727Z
+ 2025-03-31T16:37:31.996Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.stablelm_attn_hijack_flash.html
- 2025-03-31T13:14:43.707Z
+ 2025-03-31T16:37:31.975Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schemas.model.html
- 2025-03-31T13:14:43.886Z
+ 2025-03-31T16:37:32.175Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schemas.multimodal.html
- 2025-03-31T13:14:43.924Z
+ 2025-03-31T16:37:32.215Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.gradient_checkpointing.unsloth.html
- 2025-03-31T13:14:43.871Z
+ 2025-03-31T16:37:32.155Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/core.trainers.base.html
- 2025-03-31T13:14:43.316Z
+ 2025-03-31T16:37:31.576Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.unsloth_.html
- 2025-03-31T13:14:43.718Z
+ 2025-03-31T16:37:31.986Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.samplers.multipack.html
- 2025-03-31T13:14:44.091Z
+ 2025-03-31T16:37:32.384Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.callbacks.profiler.html
- 2025-03-31T13:14:44.101Z
+ 2025-03-31T16:37:32.394Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/integrations.lm_eval.args.html
- 2025-03-31T13:14:44.028Z
+ 2025-03-31T16:37:32.320Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.data.pretraining.html
- 2025-03-31T13:14:43.867Z
+ 2025-03-31T16:37:32.147Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/evaluate.html
- 2025-03-31T13:14:42.927Z
+ 2025-03-31T16:37:31.177Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.dict.html
- 2025-03-31T13:14:43.858Z
+ 2025-03-31T16:37:32.132Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.utils.html
- 2025-03-31T13:14:43.290Z
+ 2025-03-31T16:37:31.550Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.pygmalion.html
- 2025-03-31T13:14:43.440Z
+ 2025-03-31T16:37:31.702Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/core.training_args.html
- 2025-03-31T13:14:43.091Z
+ 2025-03-31T16:37:31.345Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.inference.html
- 2025-03-31T13:14:43.226Z
+ 2025-03-31T16:37:31.484Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/kernels.lora.html
- 2025-03-31T13:14:43.590Z
+ 2025-03-31T16:37:31.855Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.evaluate.html
- 2025-03-31T13:14:43.176Z
+ 2025-03-31T16:37:31.433Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.collators.batching.html
- 2025-03-31T13:14:44.078Z
+ 2025-03-31T16:37:32.371Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.completion.html
- 2025-03-31T13:14:43.412Z
+ 2025-03-31T16:37:31.674Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.dpo.zephyr.html
- 2025-03-31T13:14:43.467Z
+ 2025-03-31T16:37:31.729Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.metharme.html
- 2025-03-31T13:14:43.429Z
+ 2025-03-31T16:37:31.691Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.orpo.chat_template.html
- 2025-03-31T13:14:43.507Z
+ 2025-03-31T16:37:31.770Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.alpaca_w_system.html
- 2025-03-31T13:14:43.385Z
+ 2025-03-31T16:37:31.647Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.model_shard_quant.html
- 2025-03-31T13:14:43.785Z
+ 2025-03-31T16:37:32.055Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.config.html
- 2025-03-31T13:14:43.212Z
+ 2025-03-31T16:37:31.470Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schemas.enums.html
- 2025-03-31T13:14:43.943Z
+ 2025-03-31T16:37:32.233Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.preprocess.html
- 2025-03-31T13:14:43.253Z
+ 2025-03-31T16:37:31.512Z
https://axolotl-ai-cloud.github.io/axolotl/docs/api/core.chat.messages.html
- 2025-03-31T13:14:43.113Z
+ 2025-03-31T16:37:31.368Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.dpo.chat_template.html
- 2025-03-31T13:14:43.445Z
+ 2025-03-31T16:37:31.707Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schemas.peft.html
- 2025-03-31T13:14:43.916Z
+ 2025-03-31T16:37:32.206Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/train.html
- 2025-03-31T13:14:42.917Z
+ 2025-03-31T16:37:31.167Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.messages.chat.html
- 2025-03-31T13:14:43.444Z
+ 2025-03-31T16:37:31.706Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.orcamini.html
- 2025-03-31T13:14:43.433Z
+ 2025-03-31T16:37:31.695Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.collators.mm_chat.html
- 2025-03-31T13:14:44.086Z
+ 2025-03-31T16:37:32.379Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.kto.llama3.html
- 2025-03-31T13:14:43.478Z
+ 2025-03-31T16:37:31.740Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.attention.mllama.html
- 2025-03-31T13:14:43.724Z
+ 2025-03-31T16:37:31.993Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.checks.html
- 2025-03-31T13:14:43.195Z
+ 2025-03-31T16:37:31.452Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.transformers_fa_utils.html
- 2025-03-31T13:14:43.716Z
+ 2025-03-31T16:37:31.984Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.llama_attn_hijack_xformers.html
- 2025-03-31T13:14:43.646Z
+ 2025-03-31T16:37:31.912Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/core.trainers.dpo.trainer.html
- 2025-03-31T13:14:43.339Z
+ 2025-03-31T16:37:31.600Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.user_defined.html
- 2025-03-31T13:14:43.393Z
+ 2025-03-31T16:37:31.655Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/cli.args.html
- 2025-03-31T13:14:43.189Z
+ 2025-03-31T16:37:31.446Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.llama2_chat.html
- 2025-03-31T13:14:43.406Z
+ 2025-03-31T16:37:31.668Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/utils.schemas.config.html
- 2025-03-31T13:14:43.879Z
+ 2025-03-31T16:37:32.168Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/core.trainers.grpo.trainer.html
- 2025-03-31T13:14:43.343Z
+ 2025-03-31T16:37:31.604Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/core.chat.format.chatml.html
- 2025-03-31T13:14:43.115Z
+ 2025-03-31T16:37:31.369Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.lora_kernels.html
- 2025-03-31T13:14:43.691Z
+ 2025-03-31T16:37:31.958Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.base.html
- 2025-03-31T13:14:43.344Z
+ 2025-03-31T16:37:31.605Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/rlhf.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/cli.html
- 2025-03-31T13:13:55.596Z
+ 2025-03-31T16:37:01.749Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/unsloth.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/fsdp_qlora.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/dataset_preprocessing.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/custom_integrations.html
- 2025-03-31T13:13:55.596Z
+ 2025-03-31T16:37:01.749Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/mac.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/docker.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/ray-integration.html
- 2025-03-31T13:13:55.600Z
+ 2025-03-31T16:37:01.753Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/index.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/conversation.html
- 2025-03-31T13:13:55.596Z
+ 2025-03-31T16:37:01.750Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/pretraining.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/inst_tune.html
- 2025-03-31T13:13:55.597Z
+ 2025-03-31T16:37:01.750Z