add dstack section (#1612) [skip ci]
* add dstack section
* chore: lint

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
README.md (37 lines added)
@@ -34,6 +34,7 @@ Features:
 - [Mac](#mac)
 - [Google Colab](#google-colab)
 - [Launching on public clouds via SkyPilot](#launching-on-public-clouds-via-skypilot)
+- [Launching on public clouds via dstack](#launching-on-public-clouds-via-dstack)
 - [Dataset](#dataset)
 - [Config](#config)
 - [Train](#train)
@@ -292,6 +293,42 @@ HF_TOKEN=xx sky launch axolotl.yaml --env HF_TOKEN
 HF_TOKEN=xx BUCKET=<unique-name> sky spot launch axolotl-spot.yaml --env HF_TOKEN --env BUCKET
 ```
 
+#### Launching on public clouds via dstack
+To launch on GPU instances (both on-demand and spot) on public clouds (GCP, AWS, Azure, Lambda Labs, TensorDock, Vast.ai, and CUDO), you can use [dstack](https://dstack.ai/).
+
+Write a job description in YAML as below:
+
+```yaml
+# dstack.yaml
+type: task
+
+image: winglian/axolotl-cloud:main-20240429-py3.11-cu121-2.2.1
+
+env:
+  - HUGGING_FACE_HUB_TOKEN
+  - WANDB_API_KEY
+
+commands:
+  - accelerate launch -m axolotl.cli.train config.yaml
+
+ports:
+  - 6006
+
+resources:
+  gpu:
+    memory: 24GB..
+    count: 2
+```
+
+Then run the job with the `dstack run` command. Append the `--spot` option if you want a spot instance. The `dstack run` command shows you the cheapest available instance across multiple cloud services:
+
+```bash
+pip install dstack
+HUGGING_FACE_HUB_TOKEN=xxx WANDB_API_KEY=xxx dstack run . -f dstack.yaml # --spot
+```
+
+For further and more fine-grained use cases, please refer to the official [dstack documentation](https://dstack.ai/docs/) and the detailed description of the [axolotl example](https://github.com/dstackai/dstack/tree/master/examples/fine-tuning/axolotl) in the official repository.
+
 ### Dataset
 
 Axolotl supports a variety of dataset formats. It is recommended to use a JSONL. The schema of the JSONL depends upon the task and the prompt template you wish to use. Instead of a JSONL, you can also use a HuggingFace dataset with columns for each JSONL field.
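
Note: the launch step added to the README in this diff could also be scripted. The following is a minimal sketch, not part of the commit, that only assembles the `dstack run` command line and the two environment variables listed under `env:`. The helper names `build_dstack_run` and `build_env` are hypothetical, and the only CLI arguments used are the ones that appear in the diff (`.`, `-f dstack.yaml`, `--spot`).

```python
import os
import shlex


def build_dstack_run(config_file="dstack.yaml", spot=False):
    """Return the argv list for the `dstack run` call shown in the diff.

    Hypothetical helper: it only mirrors the arguments from the README text.
    """
    argv = ["dstack", "run", ".", "-f", config_file]
    if spot:
        # The README says to append --spot to request a spot instance.
        argv.append("--spot")
    return argv


def build_env(hf_token, wandb_key, base_env=None):
    """Copy an environment mapping and add the variables listed under `env:`."""
    env = dict(os.environ if base_env is None else base_env)
    env["HUGGING_FACE_HUB_TOKEN"] = hf_token
    env["WANDB_API_KEY"] = wandb_key
    return env


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_dstack_run(spot=True)))
```

To actually launch the job, the returned list and mapping could be passed to `subprocess.run(argv, env=env, check=True)`, assuming `dstack` is installed and configured as the README describes.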