How dstack works
Typical ML workflows include multiple steps, e.g. pre-processing data, training, fine-tuning, and validation.
With dstack, you can define ML workflows in a simple YAML format and run them over a pool of
your own servers or spot instances in your cloud. All operations with
dstack are done from a developer-friendly CLI.
Workflows are defined in the .dstack/workflows.yaml file inside a project's Git repository. Each workflow may have
a name, a Docker image, commands, the files (in the repository) and other workflows it depends on, and
the paths of the files that must be stored as output artifacts.
Each workflow may have variables with default values. Variables for each workflow are defined in the
.dstack/variables.yaml file inside a project's Git repository. When a workflow is run, any of the variables
can be overridden. Learn more…
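The override behaviour described above can be pictured as a dictionary merge. The following is a minimal Python sketch, not dstack's implementation: the function name `resolve_variables` and the variable names are hypothetical, and rejecting undeclared variables is an assumption of this sketch rather than documented behaviour.

```python
def resolve_variables(defaults, overrides=None):
    """Return the effective variables for one run.

    defaults  -- variable defaults declared for the workflow
    overrides -- values supplied when the workflow is run (may be empty)
    """
    overrides = overrides or {}
    # Assumption of this sketch: only declared variables can be overridden.
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown variables: {sorted(unknown)}")
    # Overrides win over defaults; the inputs are left unmodified.
    return {**defaults, **overrides}


# Hypothetical variable names, for illustration only.
defaults = {"epochs": 10, "learning_rate": 0.001}
effective = resolve_variables(defaults, {"epochs": 50})
print(effective)
```

Building a new dictionary instead of mutating the defaults keeps the declared values intact, so every run starts from the same baseline.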
CLI is an abbreviation of Command Line Interface.
dstack's CLI can be installed via
It can be used from a terminal to invoke any dstack command, be it running workflows, browsing logs,
managing runners, etc. Learn more…
Runs are single instances of running workflows. If you submit a workflow to run (e.g. via the
dstack CLI), the
corresponding run refers to the name of the workflow, the state of the Git repository (remote URL, branch name,
commit hash, and local changes), and the values of any variables that were overridden via the CLI.
If the submitted run refers to a workflow that depends on other workflows, the
dstack server schedules a separate job for every workflow. If the submitted run refers to a workflow that doesn't have dependencies, the
dstack server schedules only one job. Each job refers to a single workflow that must be executed with a particular
set of variables, dependencies, and state of the repository.
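To make the relationship between a run and its jobs concrete, here is a minimal Python sketch of how one submitted workflow could expand into one job per workflow. It is an illustration under stated assumptions, not the dstack server's code: the `Workflow` class, the `plan_jobs` function, and the workflow names are hypothetical, and listing dependencies before the workflows that need them is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    name: str
    depends_on: list = field(default_factory=list)  # names of other workflows


def plan_jobs(workflows, target):
    """Return workflow names, one per job, with dependencies listed first."""
    by_name = {w.name: w for w in workflows}
    ordered = []
    visiting = set()

    def visit(name):
        if name in ordered:
            return  # already planned; one job per workflow
        if name in visiting:
            raise ValueError(f"dependency cycle at {name!r}")
        visiting.add(name)
        for dep in by_name[name].depends_on:
            visit(dep)
        visiting.discard(name)
        ordered.append(name)

    visit(target)
    return ordered


# Hypothetical workflows, for illustration only.
workflows = [
    Workflow("prepare"),
    Workflow("train", depends_on=["prepare"]),
    Workflow("validate", depends_on=["train"]),
]
print(plan_jobs(workflows, "validate"))  # ['prepare', 'train', 'validate']
print(plan_jobs(workflows, "prepare"))   # ['prepare']
```

A workflow without dependencies yields a single job, while a workflow with dependencies yields one job for itself and one for each workflow it depends on, directly or indirectly.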
Generally speaking, jobs are single units of work that can be executed by one machine (aka runner).
Runners are machines (either real or virtual) that host the
dstack-runner daemon. This daemon listens to the
dstack server for scheduled jobs. If a job is assigned to a particular runner, the daemon, based on the provided
information, fetches the repository, downloads the artifacts of the jobs that the given job depends on, and runs
the commands of the given job in a Docker container. While the container is running, the daemon reports
the logs to
dstack's logs storage and uploads the output artifacts of the job to
dstack's artifact storage.
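The step of running a job's commands in a Docker container can be sketched as building a `docker run` invocation. This is only an illustration of the idea, not how the dstack-runner daemon is implemented: the helper `build_docker_command`, the `/workflow` mount point, the image, and the commands are all hypothetical. Only standard `docker run` flags (`--rm`, `-v`, `-w`) are used.

```python
import shlex


def build_docker_command(image, commands, workdir):
    """Build a `docker run` argument list for one job (hypothetical helper)."""
    # Chain the job's commands so that a failing command stops the rest.
    script = " && ".join(commands)
    return [
        "docker", "run", "--rm",
        "-v", f"{workdir}:/workflow",  # mount the fetched repository and artifacts
        "-w", "/workflow",             # run commands from the mounted directory
        image,
        "sh", "-c", script,
    ]


# Hypothetical job values, for illustration only.
argv = build_docker_command(
    image="python:3.9",
    commands=["pip install -r requirements.txt", "python train.py"],
    workdir="/tmp/job-1",
)
print(shlex.join(argv))
```

Returning an argument list rather than a single string means the result can be passed directly to `subprocess.run` without an intermediate shell on the host.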
As a user of
dstack, you can either install the
dstack-runner daemon on your own servers to make a pool of
your own runners, or give
dstack credentials to your own cloud so
dstack can create runners on demand using spot instances.
- You define .dstack/workflows.yaml and .dstack/variables.yaml files inside your project (must be a Git repository).
- You install the dstack CLI.
- You either install the dstack-runner daemon on your servers, or use the dstack aws configure command to authorize dstack to use your own cloud to create runners on demand using spot instances.
- You use the dstack CLI to run workflows and to manage runs, jobs, logs, artifacts, and runners.
- When a workflow is submitted via the CLI (e.g. via dstack run), the request is sent to the dstack server. The dstack server creates jobs for the submitted run and assigns them to available runners (either servers where you've installed dstack-runner or on-demand spot instances in your cloud that you allowed it to create).
- Runners execute assigned jobs, report their logs in real-time, and upload artifacts once the job is finished.