Welcome to the ZenML API Docs

Pipelines

A ZenML pipeline is a sequence of tasks that execute in a specific order and yield artifacts. The artifacts are stored within the artifact store and indexed via the metadata store. Each individual task within a pipeline is known as a step. The standard pipelines within ZenML offer simple interfaces for adding predefined steps in a predefined order. Other kinds of pipelines can be created from scratch by building on the BasePipeline class.

Pipelines can be written as simple functions. They are created by using decorators appropriate to the specific use case you have. The moment it is run, a pipeline is compiled and passed directly to the orchestrator.
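To make the decorator idea concrete, here is a minimal sketch in plain Python. The `step` and `pipeline` decorators below are hypothetical stand-ins written for this example; they are not ZenML's own decorators, and the sketch calls the steps directly instead of compiling anything or handing it to an orchestrator.

```python
from typing import Any, Callable, List


def step(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical stand-in: tag a plain function as a step."""
    func.is_step = True
    return func


def pipeline(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical stand-in: wrap a function that wires steps together."""

    def run(*steps: Callable[..., Any]) -> Any:
        # Refuse anything that was not declared as a step.
        for candidate in steps:
            if not getattr(candidate, "is_step", False):
                raise TypeError(f"{candidate.__name__} is not a step")
        return func(*steps)

    return run


@step
def load_numbers() -> List[int]:
    return [1, 2, 3]


@step
def total(numbers: List[int]) -> int:
    return sum(numbers)


@pipeline
def sum_pipeline(loader, reducer):
    # The pipeline function only describes how steps connect.
    return reducer(loader())


result = sum_pipeline(load_numbers, total)
```

The point of the sketch is the split of responsibilities: steps hold the logic, while the pipeline function only describes how their inputs and outputs connect.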

Materializers

Materializers are used to convert a ZenML artifact into a specific format. They are most often used to handle the input or output of ZenML steps, and can be extended by building on the BaseMaterializer class.
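As an illustration only, the following sketch shows the kind of read/write pair a materializer provides. `JsonMaterializerSketch` and its `write` and `read` methods are hypothetical names chosen for this example and do not mirror the BaseMaterializer interface.

```python
import json
import os
import tempfile
from typing import Any


class JsonMaterializerSketch:
    """Hypothetical class: stores one artifact as JSON inside a directory."""

    def __init__(self, artifact_dir: str) -> None:
        self.path = os.path.join(artifact_dir, "data.json")

    def write(self, data: Any) -> None:
        # Serialize a step output into the artifact directory.
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(data, handle)

    def read(self) -> Any:
        # Deserialize the artifact so a later step can consume it.
        with open(self.path, encoding="utf-8") as handle:
            return json.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    materializer = JsonMaterializerSketch(tmp)
    materializer.write({"accuracy": 0.9})
    restored = materializer.read()
```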

Runtime Configuration

Model Deployers

Model deployers are stack components responsible for online model serving. Online serving is the process of hosting and loading machine-learning models as part of a managed web service and providing access to the models through an API endpoint that uses a protocol such as HTTP or gRPC. Once deployed, you can send inference requests to the model through the web service's API and receive fast, low-latency responses.

Add a model deployer to your ZenML stack to be able to implement continuous model deployment pipelines that train models and continuously deploy them to a model prediction web service.

When present in a stack, the model deployer also acts as a registry for models that are served with ZenML. You can use the model deployer to list all models that are currently deployed for online inference, to filter that list by a particular pipeline run or step, or to suspend, resume or delete an external model server managed through ZenML.
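The registry role can be pictured with a small in-memory sketch. `ServedModelSketch` and `find_servers` are hypothetical names invented for this example; a real model deployer exposes its own methods and manages real servers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServedModelSketch:
    """Hypothetical record for one model server tracked by the registry."""

    model_name: str
    pipeline_run: str
    step_name: str
    running: bool = True


def find_servers(
    servers: List[ServedModelSketch],
    pipeline_run: Optional[str] = None,
    step_name: Optional[str] = None,
) -> List[ServedModelSketch]:
    """Return the servers that match every filter that was given."""
    return [
        server
        for server in servers
        if (pipeline_run is None or server.pipeline_run == pipeline_run)
        and (step_name is None or server.step_name == step_name)
    ]


registry = [
    ServedModelSketch("churn", "run_1", "deployer"),
    ServedModelSketch("churn", "run_2", "deployer"),
]
matches = find_servers(registry, pipeline_run="run_2")
matches[0].running = False  # "suspend" the matching server in this toy registry
```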

Steps

A step is a single piece or stage of a ZenML pipeline. Think of each step as being one of the nodes of a Directed Acyclic Graph (or DAG). Steps are responsible for one aspect of processing or interacting with the data / artifacts in the pipeline.
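The DAG view can be sketched with the standard library's `graphlib` module (Python 3.9+). The step names are hypothetical, and this shows only how a valid execution order falls out of the dependencies, not how ZenML schedules steps.

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each key maps to the steps it depends on.
dependencies = {
    "importer": set(),
    "trainer": {"importer"},
    "evaluator": {"trainer", "importer"},
}

# A topological order runs every step after all of its dependencies.
execution_order = list(TopologicalSorter(dependencies).static_order())
```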

ZenML currently implements a basic step interface, but there will be other more customized interfaces (layered in a hierarchy) for specialized implementations. Conceptually, a Step is a discrete and independent part of a pipeline that is responsible for one particular aspect of data manipulation inside a ZenML pipeline.

Steps can be subclassed from the BaseStep class, or used via our @step decorator.

Artifact Stores

In ZenML, the inputs and outputs that pass through any step are treated as artifacts and, as its name suggests, an ArtifactStore is a place where these artifacts get stored.

Out of the box, ZenML comes with the BaseArtifactStore and LocalArtifactStore implementations. While the BaseArtifactStore establishes an interface for people who want to extend it to their needs, the LocalArtifactStore is a simple implementation for a local setup.

Moreover, additional artifact stores can be found in specific integrations modules, such as the GCPArtifactStore in the gcp integration and the AzureArtifactStore in the azure integration.
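The split between a base interface and a local implementation might look roughly like the sketch below. `ArtifactStoreSketch` and `LocalStoreSketch` are hypothetical classes for illustration; their method names are assumptions and are not taken from BaseArtifactStore or LocalArtifactStore.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class ArtifactStoreSketch(ABC):
    """Hypothetical interface that concrete stores would extend."""

    @abstractmethod
    def save(self, name: str, payload: bytes) -> str:
        """Persist the payload and return where it was stored."""

    @abstractmethod
    def load(self, name: str) -> bytes:
        """Return the payload previously stored under this name."""


class LocalStoreSketch(ArtifactStoreSketch):
    """Hypothetical local implementation backed by a directory."""

    def __init__(self, root: str) -> None:
        self.root = root

    def save(self, name: str, payload: bytes) -> str:
        path = os.path.join(self.root, name)
        with open(path, "wb") as handle:
            handle.write(payload)
        return path

    def load(self, name: str) -> bytes:
        with open(os.path.join(self.root, name), "rb") as handle:
            return handle.read()


with tempfile.TemporaryDirectory() as root:
    store = LocalStoreSketch(root)
    store.save("model.bin", b"weights")
    loaded = store.load("model.bin")
```

A cloud-backed store would implement the same two methods against a bucket instead of a directory, which is why the interface is kept separate from the local implementation.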

Constants

Config

The config module contains classes and functions that manage user-specific configuration. ZenML's configuration is stored in a file called config.yaml, located in the user's directory for configuration files. (The exact location differs from operating system to operating system.)

The GlobalConfiguration class is the main class in this module. It provides a Pydantic configuration object that is used to store and retrieve configuration. This GlobalConfiguration object handles the serialization and deserialization of the configuration options that are stored in the file in order to persist the configuration across sessions.

The ProfileConfiguration class is used to model the configuration of a Profile. A GlobalConfiguration object can contain multiple ProfileConfiguration instances.

Zen Server

The ZenServer is a simple webserver to let you collaborate on stacks via the network. It can be spun up in a background daemon from the command line using zenml server up and managed from the same command line group.

Using the ZenServer's stacks in your project only requires setting up a profile with the rest store type, pointed at the URL of the server.

Services

A service is a process or set of processes that outlive a pipeline run.

Environment

Post Execution

After executing a pipeline, the user needs to be able to fetch it from history and perform certain tasks. The post_execution submodule provides a set of interfaces with which the user can interact with artifacts, the pipeline, steps, and the post-run pipeline object.

Secrets Managers

Secret Manager

...

Logger

Utils

The utils module contains utility functions handling analytics, reading and writing YAML data as well as other general purpose functions.

Exceptions

ZenML-specific exception definitions.

Secret

A ZenML Secret is a grouping of key-value pairs. These are accessed and administered via the ZenML Secret Manager (a stack component).

Secrets are distinguished by having different schemas. An AWS SecretSchema, for example, has key-value pairs for AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as well as an optional AWS_SESSION_TOKEN. If you don't specify a schema at the point of registration, ZenML will set the schema as ArbitrarySecretSchema, a kind of default schema where things that aren't attached to a grouping can be stored.
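A schema with two required keys and one optional key could be modelled as in the sketch below. `AwsSecretSketch` is a hypothetical dataclass written for this example and is not the schema class that ZenML ships; the key values are placeholders.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class AwsSecretSketch:
    """Hypothetical schema: two required keys and one optional key."""

    name: str
    aws_access_key_id: str
    aws_secret_access_key: str
    aws_session_token: Optional[str] = None

    def content(self) -> Dict[str, str]:
        # Drop the secret's name and any optional value that was not set.
        pairs = asdict(self)
        pairs.pop("name")
        return {
            key.upper(): value
            for key, value in pairs.items()
            if value is not None
        }


secret = AwsSecretSketch(
    name="aws",
    aws_access_key_id="placeholder-id",
    aws_secret_access_key="placeholder-key",
)
keys = sorted(secret.content())
```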

Orchestrators

An orchestrator is a special kind of backend that manages the running of each step of the pipeline. Orchestrators administer the actual pipeline runs. You can think of the orchestrator as the 'root' of any pipeline job that you run during your experimentation.

ZenML supports a local orchestrator out of the box which allows you to run your pipelines in a local environment. We also support using Apache Airflow as the orchestrator to handle the steps of your pipeline.

Console

Step Operators

While an orchestrator defines how and where your entire pipeline runs, a step operator defines how and where an individual step runs. This can be useful in a variety of scenarios. An example could be if one step within a pipeline should run on a separate environment equipped with a GPU (like a trainer step).

Container Registries

A container registry is a store for (Docker) container images. A ZenML workflow involving a container registry would automatically containerize your code so that it can be transported to stacks running remotely. As part of the deployment to the cluster, the ZenML base image would be downloaded (from a cloud container registry) and used as the basis for the deployed 'run'.

For instance, when you are running a local container-based stack, you would therefore have a local container registry which stores the container images you create that bundle up your pipeline code. You could also use a remote container registry like the Elastic Container Registry at AWS in a more production setting.

Repository

Io

The io module handles file operations for the ZenML package. It offers a standard interface for reading, writing and manipulating files and directories. It is heavily inspired by the io module of TFX.

Visualizers

The visualizers module offers a way of constructing and displaying visualizations of steps and pipeline results. The BaseVisualizer class is at the root of all the other visualizers, including options to view the results of pipeline runs, steps and pipelines themselves.

Artifacts

Artifacts are the data that power your experimentation and model training. Steps produce artifacts, which are then stored in the artifact store. Artifacts are declared in the signature of a step like so:

    import torch

    def my_step(first_artifact: int, second_artifact: torch.nn.Module) -> int:
        # first_artifact is an integer
        # second_artifact is a torch.nn.Module
        return 1

Artifacts can be serialized and deserialized (i.e. written to and read from the Artifact Store) in various ways like TFRecords or saved model pickles, depending on what the step produces. The serialization and deserialization logic of artifacts is defined by the appropriate Materializer.

Stack

The stack is essentially all the configuration for the infrastructure of your MLOps platform.

A stack is made up of multiple components. Some examples are:

  • An Artifact Store
  • A Metadata Store
  • An Orchestrator
  • A Step Operator (Optional)
  • A Container Registry (Optional)
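One way to picture the required and optional components is the sketch below. `StackSketch` is a hypothetical dataclass for illustration, and the component values are placeholder names, not registered ZenML components.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class StackSketch:
    """Hypothetical stack: three required components, two optional ones."""

    artifact_store: str
    metadata_store: str
    orchestrator: str
    step_operator: Optional[str] = None
    container_registry: Optional[str] = None

    def configured(self) -> List[str]:
        # List only the component slots that have been filled in.
        return [
            field.name
            for field in fields(self)
            if getattr(self, field.name) is not None
        ]


local_stack = StackSketch(
    artifact_store="local_artifact_store",
    metadata_store="local_metadata_store",
    orchestrator="local_orchestrator",
)
```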

Integrations

The ZenML integrations module contains sub-modules for each integration that we support. This includes orchestrators like Apache Airflow, visualization tools like the facets library, as well as deep learning libraries like PyTorch.

Zen Stores

ZenStores define ways to store ZenML relevant data locally or remotely.

Experiment Trackers

Experiment trackers let you track your ML experiments by logging the parameters and allowing you to compare between runs. In the ZenML world, every pipeline run is considered an experiment, and ZenML facilitates the storage of experiment results through ExperimentTracker stack components. This establishes a clear link between pipeline runs and experiments.
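The idea of logging parameters per run and comparing runs afterwards can be sketched with an in-memory dictionary. The `log_run` and `best_run` helpers are hypothetical and do not correspond to any tracker's API; recording a metric next to the parameters is an assumption made here so that runs have something to be compared on.

```python
from typing import Dict, Tuple

# Hypothetical in-memory tracker: run name -> (parameters, metrics).
tracked: Dict[str, Tuple[Dict[str, float], Dict[str, float]]] = {}


def log_run(
    run_name: str, parameters: Dict[str, float], metrics: Dict[str, float]
) -> None:
    """Record what one pipeline run was configured with and what it scored."""
    tracked[run_name] = (parameters, metrics)


def best_run(metric: str) -> str:
    """Compare all logged runs and return the name with the highest metric."""
    return max(tracked, key=lambda name: tracked[name][1][metric])


log_run("run_1", {"learning_rate": 0.1}, {"accuracy": 0.81})
log_run("run_2", {"learning_rate": 0.01}, {"accuracy": 0.88})
best = best_run("accuracy")
```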

Feature Stores

Feature stores allow data teams to serve data via an offline store and an online low-latency store, with data kept in sync between the two. They also offer a centralized registry where features (and feature schemas) are stored for use within a team or wider organization.

As a data scientist working on training your model, your requirements for how you access your batch / 'offline' data will almost certainly be different from how you access that data as part of a real-time or online inference setting. Feast addresses the problem of train-serve skew, which develops when those two sources of data diverge from each other.

Metadata Stores

The configuration of each pipeline, step and backend, along with the artifacts produced, is tracked within the metadata store. The metadata store is an SQL database, and can be SQLite or MySQL.

Metadata are the pieces of information tracked about the pipelines, experiments and configurations that you are running with ZenML. Metadata are stored inside the metadata store.
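Since the metadata store is an SQL database, the kind of tracking described above can be sketched with Python's built-in `sqlite3` module. The `step_runs` table and its columns are hypothetical and chosen for this example; they are not the schema that ZenML's metadata store uses.

```python
import sqlite3

# In-memory SQLite database standing in for a metadata store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE step_runs ("
    "pipeline TEXT, step TEXT, parameters TEXT, artifact_uri TEXT)"
)

# Record the configuration and produced artifact of one step run.
conn.execute(
    "INSERT INTO step_runs VALUES (?, ?, ?, ?)",
    ("training", "trainer", '{"epochs": 5}', "/artifacts/trainer/model"),
)
conn.commit()

# Look up which artifacts a given pipeline produced.
rows = conn.execute(
    "SELECT step, artifact_uri FROM step_runs WHERE pipeline = ?",
    ("training",),
).fetchall()
conn.close()
```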

Entrypoints

Enums