refactor README; hardcode links to quarto docs; add additional quarto doc pages (#2295)

* refactor README; hardcode links to quarto docs; add additional quarto doc pages
* updates
* review comments
* update

---------

Co-authored-by: Dan Saunders <dan@axolotl.ai>
2025-01-30 12:49:21 -05:00
parent 6f713226dd
commit 6f294c3d8d
10 changed files with 625 additions and 715 deletions
--- a/docs/dataset-formats/stepwise_supervised.qmd
+++ b/docs/dataset-formats/stepwise_supervised.qmd
@@ -6,8 +6,15 @@ order: 3

 ## Stepwise Supervised

-The stepwise supervised format is designed for chain-of-thought (COT) reasoning datasets where each example contains multiple completion steps and a preference label for each step.
-### ExampleHere's a simple example of a stepwise supervised dataset entry:```json
+The stepwise supervised format is designed for chain-of-thought (COT) reasoning
+datasets where each example contains multiple completion steps and a preference label
+for each step.
+
+### Example
+
+Here's a simple example of a stepwise supervised dataset entry:
+
+```json
 {
  "prompt": "Which number is larger, 9.8 or 9.11?",
  "completions": [
@@ -16,3 +23,4 @@ The stepwise supervised format is designed for chain-of-thought (COT) reasoning
  ],
  "labels": [true, false]
 }
+```
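
Reviewer note: the documented entry pairs each completion step with one label, so a quick sanity check on a dataset row could look like the sketch below. It is only an illustration, assuming that `completions` and `labels` must have equal length and that labels are booleans, as the example suggests. The helper name `pair_steps_with_labels` and the placeholder step strings are hypothetical and are not part of axolotl or of the example in the docs.

```python
import json


def pair_steps_with_labels(entry: dict) -> list[tuple[str, bool]]:
    """Check one stepwise supervised entry and return (step, label) pairs.

    Hypothetical helper for illustration; not an axolotl API.
    """
    prompt = entry.get("prompt")
    completions = entry.get("completions")
    labels = entry.get("labels")

    if not isinstance(prompt, str):
        raise ValueError("'prompt' must be a string")
    if not isinstance(completions, list) or not all(
        isinstance(step, str) for step in completions
    ):
        raise ValueError("'completions' must be a list of strings")
    if not isinstance(labels, list) or not all(
        isinstance(label, bool) for label in labels
    ):
        raise ValueError("'labels' must be a list of booleans")
    # Assumption: one preference label per completion step.
    if len(completions) != len(labels):
        raise ValueError("'completions' and 'labels' must have the same length")

    return list(zip(completions, labels))


# The step texts are placeholders; the real completions are elided in the diff.
raw = """
{
  "prompt": "Which number is larger, 9.8 or 9.11?",
  "completions": ["<step 1 text>", "<step 2 text>"],
  "labels": [true, false]
}
"""
pairs = pair_steps_with_labels(json.loads(raw))
```

Running a check like this over each row before training would surface rows where a step is missing its label, which is otherwise easy to miss in hand-edited JSON.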