fix(vlm): handle legacy conversation data format and check image in data (#2018) [skip ci]

* fix: handle legacy conversation data format and check image in data

* feat: add test for llama vision

* feat: add max_steps to test

* fix: incorrect indent and return preprocess

* feat: use smaller model and dataset

* chore: add extra config for sharegpt dataset
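
Note: the first item above can be pictured with a minimal sketch. It is not the code from this commit. It assumes that "legacy" means ShareGPT-style `conversations` entries with `from`/`value` keys, that the current format is a `messages` list with `role`/`content` keys, and that images live under an `images` key. The names `ROLE_MAP`, `normalize_example`, and `has_image` are hypothetical.

```python
# Hypothetical helpers for illustration only; none of these names come from the commit.

# Assumed mapping from ShareGPT-style speaker tags to chat roles.
ROLE_MAP = {"human": "user", "gpt": "assistant", "system": "system"}


def normalize_example(example):
    """Return a new dict whose turns are in role/content form.

    Accepts either the assumed current layout (`messages`) or the assumed
    legacy layout (`conversations` with `from`/`value` keys).
    """
    if "messages" in example:
        messages = list(example["messages"])
    elif "conversations" in example:
        messages = [
            {
                # Unknown speaker tags are passed through unchanged.
                "role": ROLE_MAP.get(turn["from"], turn["from"]),
                "content": turn["value"],
            }
            for turn in example["conversations"]
        ]
    else:
        raise ValueError("example has neither 'messages' nor 'conversations'")

    normalized = {"messages": messages}
    # Carry images over only when the example actually contains some.
    if has_image(example):
        normalized["images"] = example["images"]
    return normalized


def has_image(example):
    """True when the example has a non-empty `images` entry (assumed key)."""
    return bool(example.get("images"))
```

Checking for the image before touching it lets text-only rows pass through the same path as image rows instead of failing on a missing key.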
NanoCode012
2024-12-03 12:01:31 +07:00
committed by bursteratom
parent d56260c8d5
commit 2b7b4af81c
5 changed files with 239 additions and 11 deletions

@@ -57,6 +57,7 @@ class TestLoraLlama(unittest.TestCase):
                 "learning_rate": 0.00001,
                 "optimizer": "adamw_torch",
                 "lr_scheduler": "cosine",
+                "max_steps": 20,
             }
         )
         normalize_config(cfg)