add support for defined train split (#654)

This commit is contained in:
Wing Lian
2023-09-28 20:14:14 -04:00
committed by GitHub
parent 8662e8ffe8
commit 409ca0f21c
3 changed files with 61 additions and 0 deletions

View File

@@ -250,6 +250,10 @@ Have dataset(s) in one of the following format (JSONL recommended):
```json
{"article": "...", "question": "...", "answer": "..."}
```
- `context_qa.load_v2`: in context question answering (alternate)
```json
{"context": "...", "question": "...", "answer": "..."}
```
- `context_qa.load_404`: in context question answering from an article, with default response for no answer from context
```json
{"article": "...", "unanswerable_question": "..."}
@@ -356,6 +360,12 @@ See [examples](examples) for quick start. It is recommended to duplicate and mod
- path: data.jsonl # or json
ds_type: json # see other options below
type: alpaca
# dataset with splits, but no train split
dataset:
- path: knowrohit07/know_sql
type: context_qa.load_v2
train_on_split: validation
```
- loading