axolotl/docs at fix/granite-speech - axolotl - Gitea

tocmo0nlord/axolotl

Wing Lian 99187cd208 Activation Offloading w CUDA Streams (#2900) [skip ci]

* use cuda streams for activation offloading

* use torch native ops

* update cfg schema for streams

* fix literal constructor for set

* use context for training step so it doesn't affect evals

* disable streams

* auto gc on eval steps

* use activation_offloading config arg

* add docs for gradient checkpointing

* handle validation for gc/ao

* use cuda streams for act offloading

* add more validation for AC w/o GC

* fix docs

* move activation_offloading lower in definition so it doesn't break args/kwargs

* fix kd due to import order

2025-07-14 20:10:20 -04:00

dataset-formats

feat: add devstral small 2507 (#2896)

2025-07-11 09:34:19 +07:00

Ray Train Axolotl Integration (#2251)

2025-01-29 00:10:19 -05:00

Config doc autogen (#2718)

2025-06-18 15:36:53 -04:00

.gitignore

Config doc autogen (#2718)

2025-06-18 15:36:53 -04:00

amd_hpc.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

batch_vs_grad.qmd

Feat: update doc (#1475) [skip ci]

2024-04-04 13:43:40 +09:00

cli.qmd

QAT (#2590)

2025-05-28 12:35:47 +01:00

custom_integrations.qmd

densemixer plugin integration (#2868)

2025-07-07 17:05:19 -04:00

dataset_loading.qmd

Config doc autogen (#2718)

2025-06-18 15:36:53 -04:00

dataset_preprocessing.qmd

Autodoc generation with quartodoc (#2419)

2025-03-21 12:26:47 -04:00

debugging.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

docker.qmd

feat(doc): re-add docker 2.7.0 tag back (#2902) [skip ci]

2025-07-12 11:40:01 -04:00

faq.qmd

feat(doc): add vllm and fa2 incompat error to faq (#2877)

2025-07-07 14:13:37 -04:00

fsdp_qlora.qmd

Fix link in FSDP + QLoRA docs. (#2879) [skip ci]

2025-07-08 09:19:09 -04:00

getting-started.qmd

Config doc autogen (#2718)

2025-06-18 15:36:53 -04:00

gradient_checkpointing.qmd

Activation Offloading w CUDA Streams (#2900) [skip ci]

2025-07-14 20:10:20 -04:00

inference.qmd

Feat: minor docs improvements for RLHF and faq on embeddings (#2401) [skip ci]

2025-03-17 08:39:04 -04:00

input_output.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

installation.qmd

release v0.11.0 (#2875)

2025-07-09 09:22:35 -04:00

lora_optims.qmd

feat(doc): note lora kernel incompat with RLHF (#2706) [skip ci]

2025-05-28 15:48:40 +07:00

lr_groups.qmd

support for custom lr groups for non-embedding modules (#2213)

2025-01-24 12:56:28 -05:00

mac.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

multi-gpu.qmd

FSDP1 -> FSDP2 (#2760)

2025-07-12 15:18:01 +01:00

multi-node.qmd

FSDP1 -> FSDP2 (#2760)

2025-07-12 15:18:01 +01:00

multimodal.qmd

bump hf deps (#2735) [skip ci]

2025-06-05 07:20:33 -07:00

multipack.qmd

Bootstrap Hosted Axolotl Docs w/Quarto (#1429)

2024-03-21 22:28:36 -07:00

nccl.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

qat.qmd

QAT docfix (#2778) [skip ci]

2025-06-12 13:22:40 -04:00

quantize.qmd

Config doc autogen (#2718)

2025-06-18 15:36:53 -04:00

ray-integration.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

reward_modelling.qmd

chore(docs): add cookbook/blog link to docs (#2410) [skip ci]

2025-03-17 08:38:19 -04:00

rlhf.qmd

fix: customized dataset with simpo (#2894) [skip ci]

2025-07-12 11:40:30 -04:00

sequence_parallelism.qmd

SP dataloader patching + removing custom sampler / dataloader logic (#2686)

2025-05-21 11:20:20 -04:00

torchao.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00

unsloth.qmd

Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)

2025-02-25 16:09:37 +07:00