Various fixes for CI, save_only_model for RL, prevent packing multiprocessing deadlocks (#2661)
* lean mistral ft tests, remove e2e torch 2.4.1 test
* make sure to pass save_only_model for RL
* more tests to make ci leaner, add cleanup to modal ci
* fix module for import in e2e tests
* use mp spawn to prevent deadlocks with packing
* make sure cleanup shell script is executable when cloned out
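The "use mp spawn" item refers to Python's multiprocessing start methods. The following is a minimal sketch of the general technique, not the code changed in this commit; `pack_chunk` and `pack_all` are hypothetical names, and the "packing" here is only a stand-in that sums sequence lengths.

```python
import multiprocessing as mp


def pack_chunk(lengths):
    """Hypothetical stand-in for a packing worker: sums sequence lengths."""
    return sum(lengths)


def pack_all(chunks, workers=2):
    """Run pack_chunk over chunks in a pool that uses the spawn start method."""
    # "spawn" starts each worker in a fresh interpreter instead of forking,
    # so workers do not inherit locks held by threads in the parent process.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=workers) as pool:
        return pool.map(pack_chunk, chunks)


# With spawn, call pack_all() only under an `if __name__ == "__main__":`
# guard, because each worker re-imports the main module on startup.
```

Using `mp.get_context("spawn")` scopes the start method to this one pool, which avoids changing the global default for other code in the same process.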
19
cicd/cleanup.py
Normal file
@@ -0,0 +1,19 @@
"""Modal app to run axolotl GPU cleanup"""

from .single_gpu import VOLUME_CONFIG, app, cicd_image, run_cmd


@app.function(
    image=cicd_image,
    timeout=60 * 60,
    cpu=8.0,
    memory=131072,
    volumes=VOLUME_CONFIG,
)
def cleanup():
    run_cmd("./cicd/cleanup.sh", "/workspace/axolotl")


@app.local_entrypoint()
def main():
    cleanup.remote()