What is dstack?
dstack is a lightweight and open-source command-line tool to provision infrastructure for ML workflows.
- Define your ML workflows declaratively, including their dependencies, environment, and required compute resources
- Run workflows via the dstack CLI. Have infrastructure provisioned automatically in a configured cloud account.
- Save output artifacts, such as data and models, and reuse them in other ML workflows
- Use dstack to process data, train models, host apps, and launch dev environments
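To make the declarative idea in the list above more concrete, here is a minimal Python sketch of how workflows that reuse each other's artifacts could be modeled and ordered. It is an illustration only: the `Workflow` class, its fields, and `run_order` are hypothetical names invented for this example and do not reflect dstack's actual schema or internals.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Workflow:
    """Hypothetical in-memory model of one declared workflow (not dstack's schema)."""
    name: str
    commands: list[str]
    deps: list[str] = field(default_factory=list)       # workflows whose artifacts are reused
    artifacts: list[str] = field(default_factory=list)  # output paths to save
    gpu_count: int = 0                                   # requested compute resources


def run_order(workflows: list[Workflow]) -> list[str]:
    """Return workflow names so that every dependency precedes its consumer."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {w.name: set(w.deps) for w in workflows}
    return list(TopologicalSorter(graph).static_order())


workflows = [
    Workflow("train", ["python train.py"], deps=["prepare"], artifacts=["model"], gpu_count=1),
    Workflow("prepare", ["python prepare.py"], artifacts=["data"]),
]
print(run_order(workflows))  # ['prepare', 'train']
```

Declaring dependencies as data, rather than scripting them, is what allows a tool to decide on its own which artifacts must exist before a workflow starts.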
How does it work?
- Install dstack locally (a simple pip install will do)
- Configure the cloud credentials locally
- Run dstack config to configure the cloud region (to provision infrastructure) and the S3 bucket (to store data)
- Define ML workflows in .dstack/workflows.yaml (within your existing Git repository)
- Run ML workflows via the dstack run CLI command. Other CLI commands show status and manage state, artifacts, and more.
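The command-line parts of the steps above can be sketched as a small dry-run plan in Python. This only builds and prints the commands and does not execute anything. The function name `setup_and_run_plan` and the workflow name `train` are hypothetical, and passing the workflow name as a positional argument to `dstack run` is an assumption, so check the CLI help for the exact syntax. The credential and `.dstack/workflows.yaml` steps are left out because they are not single commands.

```python
import shlex


def setup_and_run_plan(workflow_name: str) -> list[list[str]]:
    """Return the CLI invocations for the steps above, in order, without running them."""
    return [
        ["pip", "install", "dstack"],      # install the CLI locally
        ["dstack", "config"],              # configure the cloud region and the S3 bucket
        ["dstack", "run", workflow_name],  # ASSUMPTION: workflow name is passed positionally
    ]


# Print each command as it would be typed in a shell.
for command in setup_and_run_plan("train"):
    print(shlex.join(command))
```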
When you run an ML workflow via the dstack CLI, it provisions the required compute resources (in a configured cloud
account), sets up the environment (such as Python, Conda, CUDA, etc.), fetches your code, downloads dependencies,
saves artifacts, and tears down the compute resources.
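The lifecycle described above can be sketched as an ordered sequence of steps. This is a simplified Python illustration and not dstack's implementation: `run_workflow` and the step labels are hypothetical, and tearing down compute in a `finally` block (so that it also happens when the workflow fails) is an assumption about sensible behavior rather than something the text above states.

```python
from typing import Callable


def run_workflow(run_commands: Callable[[], None], log: list[str]) -> None:
    """Record each lifecycle step in `log`; always tear down compute at the end."""
    log.append("provision compute")
    try:
        log.append("set up environment")
        log.append("fetch code")
        log.append("download dependencies")
        run_commands()  # the user's workflow commands
        log.append("save artifacts")
    finally:
        # ASSUMPTION: compute is released even if an earlier step raised.
        log.append("tear down compute")


events: list[str] = []
run_workflow(lambda: events.append("run workflow commands"), events)
print("\n".join(events))
```

Releasing compute unconditionally matters for cost: a failed run that leaves instances behind keeps billing until someone notices.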