Add Debugging Guide (#1089)

* add debug guide
* add background
* add .gitignore
* Update devtools/dev_sharegpt.yml
  Co-authored-by: Wing Lian <wing.lian@gmail.com>
* Update docs/debugging.md
  Co-authored-by: Wing Lian <wing.lian@gmail.com>
* simplify example axolotl config
* add additional comments
* add video and TOC
* try jsonc for better md rendering
* style video thumbnail better
* fix footnote

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-01-10 20:49:24 -08:00
parent 78c5b1979e
commit 7512c3ad20
8 changed files with 285 additions and 1 deletions
--- a/README.md
+++ b/README.md
@@ -39,6 +39,7 @@ Features:
  - [Special Tokens](#special-tokens)
 - [Common Errors](#common-errors-)
  - [Tokenization Mismatch b/w Training & Inference](#tokenization-mismatch-bw-inference--training)
+- [Debugging Axolotl](#debugging-axolotl)
 - [Need Help?](#need-help-)
 - [Badge](#badge-)
 - [Community Showcase](#community-showcase)
@@ -1066,7 +1067,7 @@ although this will be very slow, and using the config options above are recommen

 ## Common Errors 🧰

-See also the [FAQ's](./docs/faq.md).
+See also the [FAQ's](./docs/faq.md) and [debugging guide](docs/debugging.md).

 > If you encounter a 'Cuda out of memory' error, it means your GPU ran out of memory during the training process. Here's how to resolve it:

@@ -1116,6 +1117,10 @@ If you decode a prompt constructed by axolotl, you might see spaces between toke

 Having misalignment between your prompts during training and inference can cause models to perform very poorly, so it is worth checking this.  See [this blog post](https://hamel.dev/notes/llm/05_tokenizer_gotchas.html) for a concrete example.

+## Debugging Axolotl
+
+See [this debugging guide](docs/debugging.md) for tips on debugging Axolotl, along with an example configuration for debugging with VSCode.
+
 ## Need help? 🙋‍♂️

 Join our [Discord server](https://discord.gg/HhrNrHJPRb) where we can help you