gradient accumulation tests, embeddings w pad_token fix, smaller models (#2059)

* add more test cases for gradient accumulation and fix zero3

* swap out for smaller model

* fix missing return

* fix missing pad_token in config (a hedged sketch follows this list)

* support concurrency for multigpu testing

* cast empty deepspeed to empty string for zero3 check (sketch after this list)

* fix temp_dir as fixture so parametrize works properly (usage sketch after the conftest.py diff below)

* fix test file for multigpu evals

* don't use default

* don't use default for fsdp_state_dict_type

* don't use llama tokenizer w smollm

* also automatically cancel multigpu for concurrency
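
A minimal sketch of the pad_token repair, assuming a Hugging Face tokenizer; the model id and the EOS fallback are assumptions for illustration, not taken from this diff:

from transformers import AutoTokenizer

# assumed model id; the commit message only says "smollm"
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")
if tokenizer.pad_token is None:
    # assumption: fall back to EOS when the config defines no pad token
    tokenizer.pad_token = tokenizer.eos_token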
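
A sketch of the zero3 guard; the helper name is hypothetical, and the point is only that casting a missing deepspeed value to "" lets a substring check run without tripping over None:

def uses_zero3(deepspeed: str | None) -> bool:
    # cast a missing deepspeed config path to "" so the
    # substring check never sees None (hypothetical helper)
    return "zero3" in str(deepspeed or "")

assert uses_zero3("deepspeed_configs/zero3.json")
assert not uses_zero3(None)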
Wing Lian authored on 2024-11-14 12:59:00 -05:00; committed by GitHub
parent f3a5d119af
commit 71d4030b79
8 changed files with 118 additions and 71 deletions

tests/e2e/conftest.py (new file, 16 additions)

@@ -0,0 +1,16 @@
"""
shared pytest fixtures
"""
import shutil
import tempfile
import pytest
@pytest.fixture
def temp_dir():
# Create a temporary directory
_temp_dir = tempfile.mkdtemp()
yield _temp_dir
# Clean up the directory after the test
shutil.rmtree(_temp_dir)
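
A hypothetical usage sketch for the fixture above (test name and subdirectories are illustrative): because temp_dir is a real fixture rather than a shared default, each parametrized case gets its own fresh directory that is cleaned up afterward:

import os

import pytest


@pytest.mark.parametrize("subdir", ["run_a", "run_b"])
def test_temp_dir_is_fresh(temp_dir, subdir):
    # every parametrized case starts from an empty temporary directory
    out = os.path.join(temp_dir, subdir)
    os.makedirs(out)
    assert os.listdir(temp_dir) == [subdir]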