Files
axolotl/docs/faq.md
2023-10-29 23:06:06 +09:00

497 B

Axolotl FAQ's

The trainer stopped and hasn't progressed in several minutes.

Usually an issue with the GPU's communicating with each other. See the NCCL doc

Exitcode -9

This usually happens when you run out of system RAM.

Exitcode -7 while using deepspeed

Try upgrading deepspeed w: pip install -U deepspeed

AttributeError: 'DummyOptim' object has no attribute 'step'

You may be using deepspeed with single gpu. Please don't set deepspeed: in yaml or cli.