update doc and use P2P=LOC for brittle grpo test (#2649)

* update doc and skip brittle grpo test

* fix the path to run the multigpu tests

* increase timeout, use LOC instead of NVL

* typo

* use hf cache from s3 backed cloudfront

* mark grpo as flaky test due to vllm start
This commit is contained in:
Wing Lian
2025-05-12 14:17:25 -04:00
committed by GitHub
parent c7b6790614
commit f34eef546a
6 changed files with 131 additions and 110 deletions

@@ -3,7 +3,7 @@ name: docker-multigpu-tests-biweekly
 on:
   pull_request:
     paths:
-      - 'tests/e2e/multigpu/*.py'
+      - 'tests/e2e/multigpu/**.py'
       - 'requirements.txt'
       - 'setup.py'
       - 'pyproject.toml'
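
The path filter change in this hunk matters because, in GitHub Actions filter patterns, `*` does not match `/` while `**` does, so the old pattern only triggered on files directly inside `tests/e2e/multigpu/`. The sketch below illustrates that difference with a simplified matcher; it handles only `*` and `**` (not the full filter syntax), and the helper names and both example file paths are hypothetical, chosen for illustration rather than taken from the repository.

```python
import re


def workflow_glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a path filter that uses only `*` and `**` into a regex.

    Simplified on purpose: other filter characters (`?`, `+`, `[]`, `!`)
    are treated as literals here.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")       # `**` may cross directory separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")    # `*` stops at a directory separator
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def matches(pattern: str, path: str) -> bool:
    """Return True when the whole path matches the filter pattern."""
    return workflow_glob_to_regex(pattern).fullmatch(path) is not None


old_pattern = "tests/e2e/multigpu/*.py"
new_pattern = "tests/e2e/multigpu/**.py"

# Hypothetical file paths, used only to show the behaviour.
top_level = "tests/e2e/multigpu/test_example.py"
nested = "tests/e2e/multigpu/solo/test_example.py"

for pattern in (old_pattern, new_pattern):
    for path in (top_level, nested):
        print(f"{pattern!r} matches {path!r}: {matches(pattern, path)}")
```

Under these assumptions, only the new pattern matches the nested path, which is consistent with the commit message's "fix the path to run the multigpu tests".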