What is dstack?

dstack.ai is a platform for data scientists that helps them track their final or intermediate results and share them with the rest of the team.

The features of dstack.ai include:

  • Managing data science artifacts

  • Building data science applications

  • Automating scientific routines

The core of the dstack.ai platform is dstack, an open-source framework for Python and R for managing data science artifacts and building data science applications.

Managing data science artifacts

The final or intermediate results of a data science process include datasets, visualizations, ML models, and text and binary files.

A dstack server application acts as a storage engine for these artifacts and tracks their revisions and metadata (similar to Git, but tailored to the needs of data scientists). The dstack-py and dstack-r packages serve as clients: they push artifacts to, and pull them from, the corresponding server instance.
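To make the push/pull and revision-tracking idea concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the dstack-py API: the names `ToyArtifactStore`, `Revision`, `push`, and `pull`, the stack name `sales/report`, and the choice to skip a new revision when the content is unchanged are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Revision:
    number: int      # 1-based revision counter within one stack
    digest: str      # SHA-256 of the payload, used to detect unchanged content
    payload: bytes   # the artifact itself (a dataset, a rendered chart, ...)
    metadata: dict   # free-form key/value metadata attached on push


@dataclass
class ToyArtifactStore:
    """Hypothetical in-memory stand-in for a server (not dstack's API)."""

    stacks: dict = field(default_factory=dict)

    def push(self, name: str, payload: bytes, **metadata) -> Revision:
        history = self.stacks.setdefault(name, [])
        digest = hashlib.sha256(payload).hexdigest()
        # Assumed design choice: identical content does not create a new revision.
        if history and history[-1].digest == digest:
            return history[-1]
        revision = Revision(len(history) + 1, digest, payload, dict(metadata))
        history.append(revision)
        return revision

    def pull(self, name: str, revision: Optional[int] = None) -> bytes:
        history = self.stacks[name]
        if revision is None:
            return history[-1].payload  # latest revision by default
        if not 1 <= revision <= len(history):
            raise ValueError(f"no revision {revision} for {name!r}")
        return history[revision - 1].payload


store = ToyArtifactStore()
first = store.push("sales/report", b"region,total\nEU,10\n", author="alice")
second = store.push("sales/report", b"region,total\nEU,12\n", author="bob")
latest = store.pull("sales/report")
original = store.pull("sales/report", revision=1)
```

A real server would persist the payloads and authenticate clients. The sketch only shows how named artifacts can accumulate numbered revisions with metadata and how a client can fetch either the latest or an earlier one.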

Building data science applications

Data science applications have several general use cases:

  1. Interactive reports – a set of data visualizations and interactive widgets, combined using a certain layout

  2. Live dashboards – applications that fetch data from various data sources, turn it into visualizations, and combine them using a certain layout (not supported yet)

  3. Machine learning applications – applications that let users interact with ML models (not supported yet)

Currently, dstack supports only Interactive reports. Support for Live dashboards and Machine learning applications is coming soon.

Check out this quick tutorial on how to build interactive reports with dstack in minutes.

Automating scientific routines

The dstack in-cloud and on-premises versions allow you to execute Python and R jobs, either on demand or automatically on a regular schedule.

The jobs are ideal for processing data regularly and publishing the results, e.g. in the form of data visualizations, datasets, or other artifacts.
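As a rough illustration of running a job on a regular schedule and collecting what it publishes, the sketch below uses Python's standard `sched` module. It does not show how dstack schedules jobs: `run_on_schedule`, `FakeClock`, `summarize`, and the sample `readings` are hypothetical names, and the simulated clock is there only so the example finishes instantly.

```python
import sched


class FakeClock:
    """Simulated clock: sleeping just advances the current time."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


def run_on_schedule(job, interval, runs, timefunc, delayfunc):
    """Run `job` `runs` times, `interval` seconds apart, and return its results."""
    scheduler = sched.scheduler(timefunc, delayfunc)
    results = []
    start = timefunc()
    for i in range(runs):
        # Absolute run times avoid drift from accumulating per-run delays.
        scheduler.enterabs(start + i * interval, 1,
                           lambda: results.append(job()))
    scheduler.run()
    return results


readings = [3, 5, 8]  # made-up input data


def summarize():
    """Hypothetical job: process the data and return a result to publish."""
    return {"count": len(readings), "total": sum(readings)}


clock = FakeClock()
published = run_on_schedule(summarize, interval=3600, runs=3,
                            timefunc=clock.time, delayfunc=clock.sleep)
```

Passing `time.monotonic` and `time.sleep` instead of the fake clock's methods would run the same job against the real clock. An on-demand run is simply a direct call to `summarize()`.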