* feat(doc): organize docs, add to menu bar, fix broken formatting * feat: add link to custom integrations * feat: update readme for integrations to include citations and repo link * chore: update lm_eval info * chore: use fullname * Update docs/cli.qmd per suggestion Co-authored-by: Dan Saunders <danjsaund@gmail.com> * feat: add sweep doc * feat: add kd doc * fix: remove toc * fix: update deprecation * feat: add more info about chat_template issues * fix: heading level * fix: shell->bash code block * fix: ray link * fix(doc): heading level, header links, formatting * feat: add grpo docs * feat: add style changes * fix: wrong cli arg for lm-eval * fix: remove old run method * feat: load custom integration doc dynamically * fix: remove old cli way * fix: toc * fix: minor formatting --------- Co-authored-by: Dan Saunders <danjsaund@gmail.com>
49 lines
2.0 KiB
Plaintext
49 lines
2.0 KiB
Plaintext
---
|
|
title: FAQ
|
|
description: Frequently asked questions
|
|
---
|
|
|
|
### General
|
|
|
|
**Q: The trainer stopped and hasn't progressed in several minutes.**
|
|
|
|
> A: Usually an issue with the GPUs communicating with each other. See the [NCCL doc](nccl.qmd)
|
|
|
|
**Q: Exitcode -9**
|
|
|
|
> A: This usually happens when you run out of system RAM.
|
|
|
|
**Q: Exitcode -7 while using deepspeed**
|
|
|
|
> A: Try upgrading deepspeed w: `pip install -U deepspeed`
|
|
|
|
**Q: AttributeError: 'DummyOptim' object has no attribute 'step'**
|
|
|
|
> A: You may be using deepspeed with single gpu. Please don't set `deepspeed:` in yaml or cli.
|
|
|
|
**Q: The codes is stuck on saving preprocessed datasets.**
|
|
|
|
> A: This is usually an issue with the GPU. This can be resolved through setting the os environment variable `CUDA_VISIBLE_DEVICES=0`. If you are on runpod, this is usually a pod issue. Starting a new pod should take care of it.
|
|
|
|
### Chat templates
|
|
|
|
**Q: `jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'content' / 'role' / ____`**
|
|
|
|
> A: This means that the property mapping for the stated attribute does not exist when building `chat_template` prompt. For example, if `no attribute 'content'`, please check you have added the correct mapping for `content` under `message_property_mappings`.
|
|
|
|
**Q: `Empty template generated for turn ___`**
|
|
|
|
> A: The `content` is empty for that turn.
|
|
|
|
**Q: `Could not find content start/end boundary for turn __`**
|
|
|
|
> A: The specific turn's start/end could not be detected. Please ensure you have set the `eos_token` following your `chat_template`. Otherwise, this could be a `chat_template` which doesn't use proper boundaries for each turn (like system). On the rare occurrence, make sure your content is not `[[dummy_message]]`. Please let us know about this.
|
|
|
|
**Q: `Content end boundary is before start boundary for turn ___`**
|
|
|
|
> A: This is an edge case which should not occur. Please create an Issue if this happens.
|
|
|
|
**Q: `Content end boundary is the same as start boundary for turn ___. This is likely an empty turn.`**
|
|
|
|
> A: This is likely an empty turn.
|