Remove redundant formats

NanoCode012
2023-05-25 09:48:18 +09:00
parent 2a1b5728e6
commit e1a91b0918

@@ -81,12 +81,8 @@ Have dataset(s) in one of the following format (JSONL recommended):
 <details>
-<summary>See all formats</summary>
+<summary>See other formats</summary>
-- `alpaca`: instruction; input(optional)
-```json
-{"instruction": "...", "input": "...", "output": "..."}
-```
 - `jeopardy`: question and answer
 ```json
 {"question": "...", "category": "...", "answer": "..."}
@@ -103,14 +99,6 @@ Have dataset(s) in one of the following format (JSONL recommended):
 ```json
 {"instruction": "...", "input": "...", "output": "...", "reflection": "...", "corrected": "..."}
 ```
-- `sharegpt`: conversations
-```json
-{"conversations": [{"from": "...", "value": "..."}]}
-```
-- `completion`: raw corpus
-```json
-{"text": "..."}
-```
 > Have some new format to propose? Check if it's already defined in [data.py](src/axolotl/utils/data.py) in `dev` branch!
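
For reviewers who want to check a dataset file against the record shapes shown in this diff, here is a minimal sketch in Python. It is not part of the commit and does not use any axolotl code. The `REQUIRED_KEYS` mapping and the helpers `missing_keys`, `to_jsonl`, and `from_jsonl` are hypothetical names introduced for illustration. The key sets are an assumption: every key in the README examples is treated as required, except `input`, which the README marks as optional.

```python
import json

# Keys copied from the README examples in the diff above.
# Assumption: all keys shown are required except alpaca's "input".
# This mapping is illustrative only; it is not axolotl's own schema.
REQUIRED_KEYS = {
    "alpaca": {"instruction", "output"},
    "jeopardy": {"question", "category", "answer"},
    "sharegpt": {"conversations"},
    "completion": {"text"},
}


def missing_keys(record, fmt):
    """Return the sorted list of required keys absent from `record`."""
    return sorted(REQUIRED_KEYS[fmt] - set(record))


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def from_jsonl(text):
    """Parse JSONL text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    rows = [{"instruction": "Add 2 and 3.", "input": "", "output": "5"}]
    text = to_jsonl(rows)
    for row in from_jsonl(text):
        print(missing_keys(row, "alpaca"))
```

Because JSONL keeps one object per line, a malformed record can be located by line number without parsing the whole file.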