Bootstrap Hosted Axolotl Docs w/Quarto (#1429)

* precommit

* mv styles.css

* fix links
Hamel Husain
2024-03-21 22:28:36 -07:00
committed by GitHub
parent 2a1589f6f6
commit 629450cecd
20 changed files with 187 additions and 34 deletions

28
.github/workflows/docs.yml vendored Normal file

@@ -0,0 +1,28 @@
name: Publish Docs
on:
  push:
    branches:
      - main
permissions:
  contents: write
  pages: write
jobs:
  build-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4
      - name: Set up Quarto
        uses: quarto-dev/quarto-actions/setup@v2
      - name: Setup Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.10'
      - name: Publish to GitHub Pages (and render)
        uses: quarto-dev/quarto-actions/publish@v2
        with:
          target: gh-pages
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

3
.gitignore vendored

@@ -2,6 +2,7 @@
 configs
 last_run_prepared/
 .vscode
+_site/
 # Byte-compiled / optimized / DLL files
 __pycache__/
@@ -172,3 +173,5 @@ wandb
 lora-out/*
 qlora-out/*
 mlruns/*
+/.quarto/


@@ -149,7 +149,7 @@ accelerate launch -m axolotl.cli.train https://raw.githubusercontent.com/OpenAcc
 ```
 >[!Tip]
-> If you want to debug axolotl or prefer to use Docker as your development environment, see the [debugging guide's section on Docker](docs/debugging.md#debugging-with-docker).
+> If you want to debug axolotl or prefer to use Docker as your development environment, see the [debugging guide's section on Docker](docs/debugging.qmd#debugging-with-docker).
 <details>
@@ -267,7 +267,7 @@ Use the below instead of the install method in QuickStart.
 ```
 pip3 install -e '.'
 ```
-More info: [mac.md](/docs/mac.md)
+More info: [mac.md](/docs/mac.qmd)
 #### Launching on public clouds via SkyPilot
 To launch on GPU instances (both on-demand and spot instances) on 7+ clouds (GCP, AWS, Azure, OCI, and more), you can use [SkyPilot](https://skypilot.readthedocs.io/en/latest/index.html):
@@ -409,7 +409,7 @@ pretraining_dataset: # hf path only
 {"segments": [{"label": true|false, "text": "..."}]}
 ```
-This is a special format that allows you to construct prompts without using templates. This is for advanced users who want more freedom with prompt construction. See [these docs](docs/input_output.md) for more details.
+This is a special format that allows you to construct prompts without using templates. This is for advanced users who want more freedom with prompt construction. See [these docs](docs/input_output.qmd) for more details.
 ##### Conversation
@@ -1125,7 +1125,7 @@ fsdp_config:
 ##### FSDP + QLoRA
-Axolotl supports training with FSDP and QLoRA, see [these docs](docs/fsdp_qlora.md) for more information.
+Axolotl supports training with FSDP and QLoRA, see [these docs](docs/fsdp_qlora.qmd) for more information.
 ##### Weights & Biases Logging
@@ -1204,7 +1204,7 @@ although this will be very slow, and using the config options above are recommen
 ## Common Errors 🧰
-See also the [FAQ's](./docs/faq.md) and [debugging guide](docs/debugging.md).
+See also the [FAQ's](./docs/faq.qmd) and [debugging guide](docs/debugging.qmd).
 > If you encounter a 'Cuda out of memory' error, it means your GPU ran out of memory during the training process. Here's how to resolve it:
@@ -1238,7 +1238,7 @@ It's safe to ignore it.
 > NCCL Timeouts during training
-See the [NCCL](docs/nccl.md) guide.
+See the [NCCL](docs/nccl.qmd) guide.
 ### Tokenization Mismatch b/w Inference & Training
@@ -1256,7 +1256,7 @@ Having misalignment between your prompts during training and inference can cause
 ## Debugging Axolotl
-See [this debugging guide](docs/debugging.md) for tips on debugging Axolotl, along with an example configuration for debugging with VSCode.
+See [this debugging guide](docs/debugging.qmd) for tips on debugging Axolotl, along with an example configuration for debugging with VSCode.
 ## Need help? 🙋

51
_quarto.yml Normal file

@@ -0,0 +1,51 @@
project:
  type: website
website:
  title: "Axolotl"
  description: "Fine-tuning"
  favicon: favicon.jpg
  navbar:
    title: Axolotl
    background: dark
    pinned: false
    collapse: false
    tools:
      - icon: twitter
        href: https://twitter.com/axolotl_ai
      - icon: github
        href: https://github.com/OpenAccess-AI-Collective/axolotl/
      - icon: discord
        href: https://discord.gg/7m9sfhzaf3
  sidebar:
    pinned: true
    collapse-level: 2
    style: docked
    contents:
      - text: Home
        href: index.qmd
      - section: "How-To Guides"
        contents:
          # TODO Edit folder structure after we have more docs.
          - docs/debugging.qmd
          - docs/multipack.qmd
          - docs/fsdp_qlora.qmd
          - docs/input_output.qmd
          - docs/rlhf.qmd
          - docs/nccl.qmd
          - docs/mac.qmd
          - docs/multi-node.qmd
      - section: "Reference"
        contents:
          - docs/config.qmd
          - docs/faq.qmd
format:
  html:
    theme: materia
    css: styles.css
    toc: true


@@ -1 +1 @@
-This directory contains example config files that might be useful for debugging. Please see [docs/debugging.md](../docs/debugging.md) for more information.
+This directory contains example config files that might be useful for debugging. Please see [docs/debugging.qmd](../docs/debugging.qmd) for more information.

2
docs/.gitignore vendored Normal file

@@ -0,0 +1,2 @@
/.quarto/
_site/

17
docs/config.qmd Normal file

@@ -0,0 +1,17 @@
---
title: Config options
description: A complete list of all configuration options.
---
```{python}
#|echo: false
#|output: asis
import re
# Regex pattern to match the YAML block including its code fence
pattern = r'<details[^>]*id="all-yaml-options"[^>]*>.*?<summary>All yaml options.*?```yaml(.*?)```.*?</details>'
with open('../README.md', 'r') as f:
    doc = f.read()
match = re.search(pattern, doc, re.DOTALL)
print("```yaml", match.group(1).strip(), "```", sep="\n")
```
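
Note on the cell above: it pulls the YAML reference out of the README with a regular expression and calls `match.group(1)` directly, so a README without the expected block would fail with an `AttributeError`. A minimal sketch of the same extraction on a made-up README fragment follows. The helper name `extract_yaml_options`, the `FENCE` constant, and the sample text are assumptions for illustration and are not part of this commit.

```python
import re

# Three backticks, spelled indirectly so this example nests inside a fence.
FENCE = "`" * 3

# Same pattern as the config.qmd cell: capture the body of the yaml fence
# inside the <details id="all-yaml-options"> element.
PATTERN = (
    r'<details[^>]*id="all-yaml-options"[^>]*>.*?<summary>All yaml options.*?'
    + FENCE + r"yaml(.*?)" + FENCE + r".*?</details>"
)


def extract_yaml_options(readme_text: str) -> str:
    """Return the YAML options block without its code fence (hypothetical helper)."""
    match = re.search(PATTERN, readme_text, re.DOTALL)
    if match is None:
        # Fail with a readable message instead of an AttributeError on None.
        raise ValueError("no 'all-yaml-options' details block found")
    return match.group(1).strip()


# Made-up README fragment, for illustration only.
SAMPLE = "\n".join([
    '<details id="all-yaml-options">',
    "<summary>All yaml options (click to expand)</summary>",
    "",
    FENCE + "yaml",
    "base_model: ./some/model",
    "load_in_8bit: true",
    FENCE,
    "",
    "</details>",
])

options = extract_yaml_options(SAMPLE)
print(options)
```

The explicit `None` check is a design choice in this sketch only; the committed cell has no such guard.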


@@ -1,4 +1,8 @@
-# Debugging Axolotl
+---
+title: Debugging
+description: How to debug Axolotl
+---
 This document provides some tips and tricks for debugging Axolotl. It also provides an example configuration for debugging with VSCode. A good debugging setup is essential to understanding how Axolotl code works behind the scenes.


@@ -1,18 +0,0 @@
# Axolotl FAQ's
> The trainer stopped and hasn't progressed in several minutes.
Usually an issue with the GPU's communicating with each other. See the [NCCL doc](../docs/nccl.md)
> Exitcode -9
This usually happens when you run out of system RAM.
> Exitcode -7 while using deepspeed
Try upgrading deepspeed w: `pip install -U deepspeed`
> AttributeError: 'DummyOptim' object has no attribute 'step'
You may be using deepspeed with single gpu. Please don't set `deepspeed:` in yaml or cli.

21
docs/faq.qmd Normal file

@@ -0,0 +1,21 @@
---
title: FAQ
description: Frequently asked questions
---
**Q: The trainer stopped and hasn't progressed in several minutes.**
> A: Usually an issue with the GPUs communicating with each other. See the [NCCL doc](nccl.qmd)
**Q: Exitcode -9**
> A: This usually happens when you run out of system RAM.
**Q: Exitcode -7 while using deepspeed**
> A: Try upgrading deepspeed w: `pip install -U deepspeed`
**Q: AttributeError: 'DummyOptim' object has no attribute 'step'**
> A: You may be using deepspeed with single gpu. Please don't set `deepspeed:` in yaml or cli.


@@ -1,4 +1,10 @@
-# FDSP + QLoRA
+---
+title: FSDP + QLoRA
+description: Use FSDP with QLoRA to fine-tune large LLMs on consumer GPUs.
+format:
+  html:
+    toc: true
+---
 ## Background


@@ -1,4 +1,7 @@
-# Template-free prompt construction with the `input_output` format
+---
+title: Template-free prompt construction
+description: "Template-free prompt construction with the `input_output` format"
+---
 <!-- TOC -->


@@ -1,8 +1,12 @@
-# Mac M series support
+---
+title: Mac M-series
+description: Mac M-series support
+---
 Currently Axolotl on Mac is partially usable, many of the dependencies of Axolotl including Pytorch do not support MPS or have incomplete support.
 Current support:
 - [x] Support for all models
 - [x] Full training of models
 - [x] LoRA training


@@ -1,4 +1,7 @@
-# Multi Node
+---
+title: Multi Node
+description: How to use Axolotl on multiple machines
+---
 You will need to create a configuration for accelerate, either by using `accelerate config` and follow the instructions or you can use one of the preset below:


@@ -1,4 +1,7 @@
-# Multipack (Sample Packing)
+---
+title: Multipack (Sample Packing)
+description: Multipack is a technique to pack multiple sequences into a single batch to increase training throughput.
+---
 ## Visualization of Multipack with Flash Attention


@@ -1,4 +1,7 @@
-# NCCL
+---
+title: NCCL
+description: Troubleshooting NCCL issues
+---
 NVIDIA NCCL is a library to facilitate and optimize multi-GPU communication operations, such as broadcast, all-gather, reduce, all-reduce, etc. Broadly, NCCL configuration is highly environment-specific and is configured via several [environment variables](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html). A common NCCL-related problem occurs when a long-running operation times out causing the training process to abort:


@@ -1,4 +1,7 @@
-# RLHF (Beta)
+---
+title: "RLHF (Beta)"
+description: "Reinforcement Learning from Human Feedback is a method whereby a language model is optimized from data using human feedback."
+---
 ### Overview

BIN
favicon.jpg Normal file

Binary file not shown. Size: 4.5 KiB

19
index.qmd Normal file

@@ -0,0 +1,19 @@
```{python}
#|output: asis
#|echo: false
# This cell steals the README as the home page for now, but excludes the table of contents (quarto adds its own)
import re
pattern = re.compile(
    r"<table>\s*<tr>\s*<td>\s*## Table of Contents.*?</td>\s*</tr>\s*</table>",
    re.DOTALL | re.IGNORECASE
)
with open('README.md', 'r') as f:
    txt = f.read()
cleaned = pattern.sub("", txt)
print(cleaned)
```
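
Note on the cell above: it reuses the README as the home page and removes the hand-written table of contents, since Quarto generates its own. A small sketch of that substitution on a made-up README fragment follows. The function name `strip_readme_toc` and the sample text are assumptions for illustration and are not part of this commit.

```python
import re

# Same pattern as the index.qmd cell: a <table> whose single cell holds
# the "## Table of Contents" heading and its list.
TOC_PATTERN = re.compile(
    r"<table>\s*<tr>\s*<td>\s*## Table of Contents.*?</td>\s*</tr>\s*</table>",
    re.DOTALL | re.IGNORECASE,
)


def strip_readme_toc(text: str) -> str:
    """Remove the README's table-of-contents table, if present (hypothetical helper)."""
    return TOC_PATTERN.sub("", text)


# Made-up README fragment, for illustration only.
SAMPLE = "\n".join([
    "# Axolotl",
    "<table>",
    "<tr>",
    "<td>",
    "",
    "## Table of Contents",
    "- [Introduction](#introduction)",
    "</td>",
    "</tr>",
    "</table>",
    "",
    "## Introduction",
])

cleaned = strip_readme_toc(SAMPLE)
print(cleaned)
```

Text without a matching table passes through unchanged, so the cell would keep rendering even if the README's table of contents were removed later.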

1
styles.css Normal file

@@ -0,0 +1 @@
/* css styles */