fix(vlm): handle legacy conversation data format and check image in data (#2018) [skip ci]

* fix: handle legacy conversation data format and check image in data
* feat: add test for llama vision
* feat: add max_steps to test
* fix: incorrect indent and return preprocess
* feat: use smaller model and dataset
* chore: add extra config for sharegpt dataset
2024-12-03 12:01:31 +07:00
parent d56260c8d5
commit 2b7b4af81c
5 changed files with 239 additions and 11 deletions
--- a/tests/e2e/test_lora_llama.py
+++ b/tests/e2e/test_lora_llama.py
@@ -57,6 +57,7 @@ class TestLoraLlama(unittest.TestCase):
                "learning_rate": 0.00001,
                "optimizer": "adamw_torch",
                "lr_scheduler": "cosine",
+                "max_steps": 20,
            }
        )
        normalize_config(cfg)