refactor README; hardcode links to quarto docs; add additional quarto doc pages (#2295)

* refactor README; hardcode links to quarto docs; add additional quarto doc pages

* updates

* review comments

* update

---------

Co-authored-by: Dan Saunders <dan@axolotl.ai>
Dan Saunders
2025-01-30 12:49:21 -05:00
committed by GitHub
parent 6f713226dd
commit 6f294c3d8d
10 changed files with 625 additions and 715 deletions

@@ -6,8 +6,15 @@ order: 3
## Stepwise Supervised
The stepwise supervised format is designed for chain-of-thought (CoT) reasoning
datasets where each example contains multiple completion steps and a preference label
for each step.
### Example
Here's a simple example of a stepwise supervised dataset entry:
```json
{
"prompt": "Which number is larger, 9.8 or 9.11?",
"completions": [
@@ -16,3 +23,4 @@ The stepwise supervised format is designed for chain-of-thought (COT) reasoning
],
"labels": [true, false]
}
```
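
As a rough illustration of how an entry in this format might be checked before training, the sketch below parses one record and pairs each completion step with its label. The helper name `validate_stepwise_entry` and the placeholder step strings are hypothetical and not part of axolotl; the only assumptions taken from the example above are the `prompt`, `completions`, and `labels` keys and the one-label-per-step layout.

```python
import json


def validate_stepwise_entry(entry: dict) -> list[tuple[str, bool]]:
    """Hypothetical helper: validate one entry and pair each step with its label."""
    prompt = entry["prompt"]
    completions = entry["completions"]
    labels = entry["labels"]

    if not isinstance(prompt, str):
        raise ValueError("prompt must be a string")
    # Each completion step needs exactly one preference label.
    if len(completions) != len(labels):
        raise ValueError(
            f"got {len(completions)} completions but {len(labels)} labels"
        )
    if not all(isinstance(label, bool) for label in labels):
        raise ValueError("labels must be booleans")

    return list(zip(completions, labels))


# Placeholder step text: the real completion strings are not shown here.
raw = """
{
  "prompt": "Which number is larger, 9.8 or 9.11?",
  "completions": ["<step 1 text>", "<step 2 text>"],
  "labels": [true, false]
}
"""

entry = json.loads(raw)
pairs = validate_stepwise_entry(entry)
for step, label in pairs:
    print(f"{label!s:>5}  {step}")
```

Checking the lengths up front means a mismatched record fails with a clear message instead of silently dropping steps when the two lists are zipped together.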