Reproducible ML workflows for teams¶
Introduction to dstack¶
dstack allows you to define ML workflows as code and run them in a configured cloud via the CLI.
It automatically handles workflow dependencies, provisions cloud infrastructure, and versions
data, models, and environments.
- GitOps-driven: Define your ML workflows via YAML, and run them in a configured cloud using the CLI, either interactively from the IDE or from your CI/CD pipeline.
- Collaborative: Version data, models, and environments, and reuse them easily in other workflows — across different projects and teams.
- Cloud-native: Run workflows locally or in a configured cloud. Configure the resources required by workflows (memory, GPU, etc.) as code.
- Vendor-agnostic: Use any cloud provider, language, framework, tool, or third-party service. No code changes are required.
- Dev environments: For debugging purposes, attach interactive dev environments (e.g. VS Code or JupyterLab) directly to running workflows.
Why use dstack?¶
dstack is the easiest and most flexible way for teams to automate ML workflows.
Are you exploring or preparing data? Training and validating models? Running apps?
Versioning and reusing artifacts? All of that is covered by dstack.
How does it work?¶
- Configure the cloud credentials locally (e.g. via
- Define ML workflows in YAML files inside the .dstack/workflows directory (within your project)
- Run ML workflows via the dstack run CLI command
- Use other dstack CLI commands to manage runs, artifacts, etc.
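To make the project layout from the steps above concrete, here is a minimal Python sketch that lists the workflow files a project contains. It is not part of dstack: the function name find_workflow_files and the accepted .yaml/.yml extensions are assumptions made for illustration. Only the .dstack/workflows directory comes from the text.

```python
from pathlib import Path
from typing import List

# Directory named in the text, relative to the project root.
WORKFLOWS_DIR = Path(".dstack") / "workflows"


def find_workflow_files(project_root: Path) -> List[Path]:
    """Return YAML files under <project_root>/.dstack/workflows, sorted by path.

    Hypothetical helper: the .yaml/.yml extensions are an assumption.
    """
    workflows_dir = project_root / WORKFLOWS_DIR
    if not workflows_dir.is_dir():
        # No workflows directory means no workflows are defined yet.
        return []
    return sorted(
        p for p in workflows_dir.iterdir()
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )


if __name__ == "__main__":
    for path in find_workflow_files(Path.cwd()):
        print(path)
```

Running the script from the project root prints one path per workflow file, or nothing if the directory does not exist yet.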
When you run a workflow via the dstack CLI, it provisions the required compute resources (in a configured cloud account), sets up the environment (Python, Conda, CUDA, etc.), fetches your code, downloads dependencies, saves artifacts, and tears down the compute resources.
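The order of these stages can be sketched in a few lines of Python. This is an illustration of the sequence only, not dstack's implementation: the names provisioned_compute and run_workflow and the stage labels are hypothetical. The sketch also assumes that compute is torn down even when the workflow fails, which the text does not state.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def provisioned_compute(log: List[str]) -> Iterator[None]:
    """Record provisioning on entry and teardown on exit (hypothetical)."""
    log.append("provision compute")
    try:
        yield
    finally:
        # Assumption: teardown runs whether or not the workflow succeeded.
        log.append("tear down compute")


def run_workflow(workflow: Callable[[], None], log: List[str]) -> None:
    """Append each lifecycle stage to `log` in the order the text describes."""
    with provisioned_compute(log):
        log.append("set up environment")
        log.append("fetch code")
        log.append("download dependencies")
        log.append("run workflow")
        workflow()
        # Only reached if the workflow did not raise.
        log.append("save artifacts")


if __name__ == "__main__":
    stages: List[str] = []
    run_workflow(lambda: None, stages)
    print("\n".join(stages))
```

Wrapping provisioning in a context manager keeps teardown paired with provisioning, so a failing workflow cannot leave the recorded sequence without its final stage.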
Get started in 30 min¶
Having your first ML workflows up and running will take less than 30 min.