Prodigy

`zenml.integrations.prodigy`

Initialization of the Prodigy integration.

Attributes

`PRODIGY = 'prodigy'` `module-attribute`

`PRODIGY_ANNOTATOR_FLAVOR = 'prodigy'` `module-attribute`

Classes

`Flavor`

Class for ZenML Flavors.

Attributes

`config_class: Type[StackComponentConfig]` `abstractmethod` `property`

Returns StackComponentConfig config class.

Returns:

Type	Description
`Type[StackComponentConfig]`	The config class.

`config_schema: Dict[str, Any]` `property`

The config schema for a flavor.

Returns:

Type	Description
`Dict[str, Any]`	The config schema.

`docs_url: Optional[str]` `property`

A url to point at docs explaining this flavor.

Returns:

Type	Description
`Optional[str]`	A flavor docs url.

`implementation_class: Type[StackComponent]` `abstractmethod` `property`

Implementation class for this flavor.

Returns:

Type	Description
`Type[StackComponent]`	The implementation class for this flavor.

`logo_url: Optional[str]` `property`

A url to represent the flavor in the dashboard.

Returns:

Type	Description
`Optional[str]`	The flavor logo.

`name: str` `abstractmethod` `property`

The flavor name.

Returns:

Type	Description
`str`	The flavor name.

`sdk_docs_url: Optional[str]` `property`

A url to point at SDK docs explaining this flavor.

Returns:

Type	Description
`Optional[str]`	A flavor SDK docs url.

`service_connector_requirements: Optional[ServiceConnectorRequirements]` `property`

Resource requirements for service connectors compatible with this flavor.

Specifies resource requirements that are used to filter the available service connector types that are compatible with this flavor.

Returns:

Type	Description
`Optional[ServiceConnectorRequirements]`	Requirements for compatible service connectors, if a service connector is required for this flavor.

`type: StackComponentType` `abstractmethod` `property`

The stack component type.

Returns:

Type	Description
`StackComponentType`	The stack component type.

Functions

`from_model(flavor_model: FlavorResponse) -> Flavor` `classmethod`

Loads a flavor from a model.

Parameters:

Name	Type	Description	Default
`flavor_model`	`FlavorResponse`	The model to load from.	required

Raises:

Type	Description
`CustomFlavorImportError`	If the custom flavor can't be imported.
`ImportError`	If the flavor can't be imported.

Returns:

Type	Description
`Flavor`	The loaded flavor.

Source code in src/zenml/stack/flavor.py

@classmethod
def from_model(cls, flavor_model: FlavorResponse) -> "Flavor":
    """Loads a flavor from a model.

    Args:
        flavor_model: The model to load from.

    Raises:
        CustomFlavorImportError: If the custom flavor can't be imported.
        ImportError: If the flavor can't be imported.

    Returns:
        The loaded flavor.
    """
    try:
        flavor = source_utils.load(flavor_model.source)()
    except (ModuleNotFoundError, ImportError, NotImplementedError) as err:
        if flavor_model.is_custom:
            flavor_module, _ = flavor_model.source.rsplit(".", maxsplit=1)
            expected_file_path = os.path.join(
                source_utils.get_source_root(),
                flavor_module.replace(".", os.path.sep),
            )
            raise CustomFlavorImportError(
                f"Couldn't import custom flavor {flavor_model.name}: "
                f"{err}. Make sure the custom flavor class "
                f"`{flavor_model.source}` is importable. If it is part of "
                "a library, make sure it is installed. If "
                "it is a local code file, make sure it exists at "
                f"`{expected_file_path}.py`."
            )
        else:
            raise ImportError(
                f"Couldn't import flavor {flavor_model.name}: {err}"
            )
    return cast(Flavor, flavor)
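
The path hint in the `CustomFlavorImportError` message comes from splitting the class name off the flavor source and mapping the remaining dotted module path onto the source root. The following is a minimal standalone sketch of that step only; `expected_flavor_file` and the sample source string are hypothetical names used for illustration, not part of ZenML.

```python
import os


def expected_flavor_file(source: str, source_root: str) -> str:
    """Return the file path the error message would point the user at."""
    # Drop the class name: "pkg.module.ClassName" -> "pkg.module"
    flavor_module, _ = source.rsplit(".", maxsplit=1)
    # Map the dotted module path onto the file system below the source root
    module_path = flavor_module.replace(".", os.path.sep)
    return os.path.join(source_root, module_path) + ".py"


# Hypothetical custom flavor living in my_flavors/annotator.py
hint = expected_flavor_file("my_flavors.annotator.MyAnnotatorFlavor", "repo")
print(hint)
```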

`generate_default_docs_url() -> str`

Generate the docs URL for built-in and integration flavors.

This method is not useful for custom flavors, which have no pages in the main ZenML docs.

Returns:

Type	Description
`str`	The complete URL to the ZenML documentation.

Source code in src/zenml/stack/flavor.py

def generate_default_docs_url(self) -> str:
    """Generate the doc urls for all inbuilt and integration flavors.

    Note that this method is not going to be useful for custom flavors,
    which do not have any docs in the main zenml docs.

    Returns:
        The complete url to the zenml documentation
    """
    from zenml import __version__

    component_type = self.type.plural.replace("_", "-")
    name = self.name.replace("_", "-")

    try:
        is_latest = is_latest_zenml_version()
    except RuntimeError:
        # We assume in error cases that we are on the latest version
        is_latest = True

    if is_latest:
        base = "https://docs.zenml.io"
    else:
        base = f"https://zenml-io.gitbook.io/zenml-legacy-documentation/v/{__version__}"
    return f"{base}/stack-components/{component_type}/{name}"
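
To make the URL construction concrete, here is a standalone sketch of the same string logic with the version check and flavor attributes passed in as plain arguments. `build_docs_url` is a hypothetical helper, and the plural component type `"annotators"` used in the example is an assumption about what `self.type.plural` yields for an annotator flavor.

```python
def build_docs_url(
    component_type_plural: str,
    flavor_name: str,
    is_latest: bool,
    version: str,
) -> str:
    """Mirror the URL assembly shown in the source above."""
    # Underscores become hyphens in both URL segments
    component_type = component_type_plural.replace("_", "-")
    name = flavor_name.replace("_", "-")

    if is_latest:
        base = "https://docs.zenml.io"
    else:
        # Older releases point at the versioned legacy docs
        base = (
            "https://zenml-io.gitbook.io/zenml-legacy-documentation"
            f"/v/{version}"
        )
    return f"{base}/stack-components/{component_type}/{name}"


latest_url = build_docs_url("annotators", "prodigy", True, "0.0.0")
print(latest_url)
```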

`generate_default_sdk_docs_url() -> str`

Generate SDK docs url for a flavor.

Returns:

Type	Description
`str`	The complete url to the zenml SDK docs

Source code in src/zenml/stack/flavor.py

def generate_default_sdk_docs_url(self) -> str:
    """Generate SDK docs url for a flavor.

    Returns:
        The complete url to the zenml SDK docs
    """
    from zenml import __version__

    base = f"https://sdkdocs.zenml.io/{__version__}"

    component_type = self.type.plural

    if "zenml.integrations" in self.__module__:
        # Get integration name out of module path which will look something
        #  like this "zenml.integrations.<integration>....
        integration = self.__module__.split(
            "zenml.integrations.", maxsplit=1
        )[1].split(".")[0]

        return (
            f"{base}/integration_code_docs"
            f"/integrations-{integration}/#{self.__module__}"
        )

    else:
        return (
            f"{base}/core_code_docs/core-{component_type}/"
            f"#{self.__module__}"
        )
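
The integration name is taken from the first path segment after `zenml.integrations.` in the flavor's module path. A minimal sketch of that branching, written as a free function so it runs without ZenML installed; `build_sdk_docs_url` is a hypothetical name and the version string is a placeholder.

```python
def build_sdk_docs_url(
    module: str, component_type_plural: str, version: str
) -> str:
    """Mirror the SDK docs URL assembly shown in the source above."""
    base = f"https://sdkdocs.zenml.io/{version}"

    if "zenml.integrations" in module:
        # "zenml.integrations.<integration>.<...>" -> "<integration>"
        integration = module.split("zenml.integrations.", maxsplit=1)[
            1
        ].split(".")[0]
        return (
            f"{base}/integration_code_docs"
            f"/integrations-{integration}/#{module}"
        )

    # Core flavors are grouped by component type instead
    return f"{base}/core_code_docs/core-{component_type_plural}/#{module}"


integration_url = build_sdk_docs_url(
    "zenml.integrations.prodigy.flavors", "annotators", "1.0.0"
)
print(integration_url)
```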

`to_model(integration: Optional[str] = None, is_custom: bool = True) -> FlavorRequest`

Converts a flavor to a model.

Parameters:

Name	Type	Description	Default
`integration`	`Optional[str]`	The integration to use for the model.	`None`
`is_custom`	`bool`	Whether the flavor is a custom flavor.	`True`

Returns:

Type	Description
`FlavorRequest`	The model.

Source code in src/zenml/stack/flavor.py

def to_model(
    self,
    integration: Optional[str] = None,
    is_custom: bool = True,
) -> FlavorRequest:
    """Converts a flavor to a model.

    Args:
        integration: The integration to use for the model.
        is_custom: Whether the flavor is a custom flavor.

    Returns:
        The model.
    """
    connector_requirements = self.service_connector_requirements
    connector_type = (
        connector_requirements.connector_type
        if connector_requirements
        else None
    )
    resource_type = (
        connector_requirements.resource_type
        if connector_requirements
        else None
    )
    resource_id_attr = (
        connector_requirements.resource_id_attr
        if connector_requirements
        else None
    )

    model = FlavorRequest(
        name=self.name,
        type=self.type,
        source=source_utils.resolve(self.__class__).import_path,
        config_schema=self.config_schema,
        connector_type=connector_type,
        connector_resource_type=resource_type,
        connector_resource_id_attr=resource_id_attr,
        integration=integration,
        logo_url=self.logo_url,
        docs_url=self.docs_url,
        sdk_docs_url=self.sdk_docs_url,
        is_custom=is_custom,
    )
    return model

`Integration`

Base class for integrations in ZenML.

Functions

`activate() -> None` `classmethod`

Abstract method to activate the integration.

Source code in src/zenml/integrations/integration.py

@classmethod
def activate(cls) -> None:
    """Abstract method to activate the integration."""

`check_installation() -> bool` `classmethod`

Method to check whether the required packages are installed.

Returns:

Type	Description
`bool`	True if all required packages are installed, False otherwise.

Source code in src/zenml/integrations/integration.py

@classmethod
def check_installation(cls) -> bool:
    """Method to check whether the required packages are installed.

    Returns:
        True if all required packages are installed, False otherwise.
    """
    for r in cls.get_requirements():
        try:
            # First check if the base package is installed
            dist = pkg_resources.get_distribution(r)

            # Next, check if the dependencies (including extras) are
            # installed
            deps: List[Requirement] = []

            _, extras = parse_requirement(r)
            if extras:
                extra_list = extras[1:-1].split(",")
                for extra in extra_list:
                    try:
                        requirements = dist.requires(extras=[extra])  # type: ignore[arg-type]
                    except pkg_resources.UnknownExtra as e:
                        logger.debug(f"Unknown extra: {str(e)}")
                        return False
                    deps.extend(requirements)
            else:
                deps = dist.requires()

            for ri in deps:
                try:
                    # Remove the "extra == ..." part from the requirement string
                    cleaned_req = re.sub(
                        r"; extra == \"\w+\"", "", str(ri)
                    )
                    pkg_resources.get_distribution(cleaned_req)
                except pkg_resources.DistributionNotFound as e:
                    logger.debug(
                        f"Unable to find required dependency "
                        f"'{e.req}' for requirement '{r}' "
                        f"necessary for integration '{cls.NAME}'."
                    )
                    return False
                except pkg_resources.VersionConflict as e:
                    logger.debug(
                        f"Package version '{e.dist}' does not match "
                        f"version '{e.req}' required by '{r}' "
                        f"necessary for integration '{cls.NAME}'."
                    )
                    return False

        except pkg_resources.DistributionNotFound as e:
            logger.debug(
                f"Unable to find required package '{e.req}' for "
                f"integration {cls.NAME}."
            )
            return False
        except pkg_resources.VersionConflict as e:
            logger.debug(
                f"Package version '{e.dist}' does not match version "
                f"'{e.req}' necessary for integration {cls.NAME}."
            )
            return False

    logger.debug(
        f"Integration {cls.NAME} is installed correctly with "
        f"requirements {cls.get_requirements()}."
    )
    return True
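
Two small string operations in the source above are easy to misread: slicing the extras string and stripping the `extra == "..."` marker before looking a dependency up. The sketch below isolates both with the standard library only. The helper names are hypothetical, and the bracketed `"[a,b]"` form of the extras string is an assumption inferred from the `extras[1:-1]` slice.

```python
import re
from typing import List


def split_extras(extras: str) -> List[str]:
    """Turn a bracketed extras string such as "[a,b]" into ["a", "b"]."""
    # Drop the surrounding brackets, then split on commas
    return extras[1:-1].split(",")


def strip_extra_marker(requirement: str) -> str:
    """Remove a trailing `; extra == "name"` marker from a requirement."""
    return re.sub(r"; extra == \"\w+\"", "", requirement)


print(split_extras("[server,dev]"))
print(strip_extra_marker('requests>=2.0; extra == "server"'))
```

Because `\w+` matches only letters, digits, and underscores, a marker whose extra name contains a hyphen is left in place by this pattern.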

`flavors() -> List[Type[Flavor]]` `classmethod`

Abstract method to declare new stack component flavors.

Returns:

Type	Description
`List[Type[Flavor]]`	A list of new stack component flavors.

Source code in src/zenml/integrations/integration.py

@classmethod
def flavors(cls) -> List[Type[Flavor]]:
    """Abstract method to declare new stack component flavors.

    Returns:
        A list of new stack component flavors.
    """
    return []

`get_requirements(target_os: Optional[str] = None, python_version: Optional[str] = None) -> List[str]` `classmethod`

Method to get the requirements for the integration.

Parameters:

Name	Type	Description	Default
`target_os`	`Optional[str]`	The target operating system to get the requirements for.	`None`
`python_version`	`Optional[str]`	The Python version to use for the requirements.	`None`

Returns:

Type	Description
`List[str]`	A list of requirements.

Source code in src/zenml/integrations/integration.py

@classmethod
def get_requirements(
    cls,
    target_os: Optional[str] = None,
    python_version: Optional[str] = None,
) -> List[str]:
    """Method to get the requirements for the integration.

    Args:
        target_os: The target operating system to get the requirements for.
        python_version: The Python version to use for the requirements.

    Returns:
        A list of requirements.
    """
    return cls.REQUIREMENTS

`get_uninstall_requirements(target_os: Optional[str] = None) -> List[str]` `classmethod`

Method to get the uninstall requirements for the integration.

Parameters:

Name	Type	Description	Default
`target_os`	`Optional[str]`	The target operating system to get the requirements for.	`None`

Returns:

Type	Description
`List[str]`	A list of requirements.

Source code in src/zenml/integrations/integration.py

@classmethod
def get_uninstall_requirements(
    cls, target_os: Optional[str] = None
) -> List[str]:
    """Method to get the uninstall requirements for the integration.

    Args:
        target_os: The target operating system to get the requirements for.

    Returns:
        A list of requirements.
    """
    ret = []
    for each in cls.get_requirements(target_os=target_os):
        is_ignored = False
        for ignored in cls.REQUIREMENTS_IGNORED_ON_UNINSTALL:
            if each.startswith(ignored):
                is_ignored = True
                break
        if not is_ignored:
            ret.append(each)
    return ret
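
The uninstall filter is a prefix match against `REQUIREMENTS_IGNORED_ON_UNINSTALL`. A compact sketch of the same behavior as a free function follows; `uninstall_requirements` and the sample requirement strings are illustrative, not values taken from any real integration.

```python
from typing import List, Sequence


def uninstall_requirements(
    requirements: Sequence[str], ignored_prefixes: Sequence[str]
) -> List[str]:
    """Keep every requirement that does not start with an ignored prefix."""
    return [
        req
        for req in requirements
        if not any(req.startswith(prefix) for prefix in ignored_prefixes)
    ]


remaining = uninstall_requirements(
    ["prodigy", "urllib3<2", "pandas>=1.0"], ["urllib3"]
)
print(remaining)
```

Since the check uses `startswith`, an ignored entry also matches any package whose name merely begins with it, so ignoring `torch` would skip `torchvision` as well.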

`plugin_flavors() -> List[Type[BasePluginFlavor]]` `classmethod`

Abstract method to declare new plugin flavors.

Returns:

Type	Description
`List[Type[BasePluginFlavor]]`	A list of new plugin flavors.

Source code in src/zenml/integrations/integration.py

@classmethod
def plugin_flavors(cls) -> List[Type["BasePluginFlavor"]]:
    """Abstract method to declare new plugin flavors.

    Returns:
        A list of new plugin flavors.
    """
    return []

`ProdigyIntegration`

Bases: Integration

Definition of Prodigy integration for ZenML.

Functions

`flavors() -> List[Type[Flavor]]` `classmethod`

Declare the stack component flavors for the Prodigy integration.

Returns:

Type	Description
`List[Type[Flavor]]`	List of stack component flavors for this integration.

Source code in src/zenml/integrations/prodigy/__init__.py

@classmethod
def flavors(cls) -> List[Type[Flavor]]:
    """Declare the stack component flavors for the Prodigy integration.

    Returns:
        List of stack component flavors for this integration.
    """
    from zenml.integrations.prodigy.flavors import (
        ProdigyAnnotatorFlavor,
    )

    return [ProdigyAnnotatorFlavor]

Modules

`annotators`

Initialization of the Prodigy annotators submodule.

Classes

`ProdigyAnnotator(name: str, id: UUID, config: StackComponentConfig, flavor: str, type: StackComponentType, user: Optional[UUID], created: datetime, updated: datetime, labels: Optional[Dict[str, Any]] = None, connector_requirements: Optional[ServiceConnectorRequirements] = None, connector: Optional[UUID] = None, connector_resource_id: Optional[str] = None, *args: Any, **kwargs: Any)`

Bases: BaseAnnotator, AuthenticationMixin

Class to interact with the Prodigy annotation interface.

Source code in src/zenml/stack/stack_component.py

def __init__(
    self,
    name: str,
    id: UUID,
    config: StackComponentConfig,
    flavor: str,
    type: StackComponentType,
    user: Optional[UUID],
    created: datetime,
    updated: datetime,
    labels: Optional[Dict[str, Any]] = None,
    connector_requirements: Optional[ServiceConnectorRequirements] = None,
    connector: Optional[UUID] = None,
    connector_resource_id: Optional[str] = None,
    *args: Any,
    **kwargs: Any,
):
    """Initializes a StackComponent.

    Args:
        name: The name of the component.
        id: The unique ID of the component.
        config: The config of the component.
        flavor: The flavor of the component.
        type: The type of the component.
        user: The ID of the user who created the component.
        created: The creation time of the component.
        updated: The last update time of the component.
        labels: The labels of the component.
        connector_requirements: The requirements for the connector.
        connector: The ID of a connector linked to the component.
        connector_resource_id: The custom resource ID to access through
            the connector.
        *args: Additional positional arguments.
        **kwargs: Additional keyword arguments.

    Raises:
        ValueError: If a secret reference is passed as name.
    """
    if secret_utils.is_secret_reference(name):
        raise ValueError(
            "Passing the `name` attribute of a stack component as a "
            "secret reference is not allowed."
        )

    self.id = id
    self.name = name
    self._config = config
    self.flavor = flavor
    self.type = type
    self.user = user
    self.created = created
    self.updated = updated
    self.labels = labels
    self.connector_requirements = connector_requirements
    self.connector = connector
    self.connector_resource_id = connector_resource_id
    self._connector_instance: Optional[ServiceConnector] = None

Attributes

`config: ProdigyAnnotatorConfig` `property`

Returns the ProdigyAnnotatorConfig config.

Returns:

Type	Description
`ProdigyAnnotatorConfig`	The configuration.

Functions

`add_dataset(**kwargs: Any) -> Any`

Registers a dataset for annotation.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Additional keyword arguments to pass to the Prodigy client.	`{}`

Returns:

Type	Description
`Any`	A Prodigy list representing the dataset.

Raises:

Type	Description
`ValueError`	If `dataset_name` isn't provided.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def add_dataset(self, **kwargs: Any) -> Any:
    """Registers a dataset for annotation.

    Args:
        **kwargs: Additional keyword arguments to pass to the Prodigy client.

    Returns:
        A Prodigy list representing the dataset.

    Raises:
        ValueError: If `dataset_name` isn't provided.
    """
    db = self._get_db()
    dataset_kwargs = {"dataset_name": kwargs.get("dataset_name")}
    if not dataset_kwargs["dataset_name"]:
        raise ValueError("`dataset_name` keyword argument is required.")

    if kwargs.get("dataset_meta"):
        dataset_kwargs["dataset_meta"] = kwargs.get("dataset_meta")
    return db.add_dataset(**dataset_kwargs)
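
Only two keys are read from `**kwargs` here: `dataset_name` (required) and `dataset_meta` (optional); anything else is ignored. The sketch below reproduces just that filtering step without a Prodigy database. `build_dataset_kwargs` is a hypothetical helper introduced for illustration.

```python
from typing import Any, Dict


def build_dataset_kwargs(**kwargs: Any) -> Dict[str, Any]:
    """Select the keyword arguments that would be forwarded to the database."""
    dataset_name = kwargs.get("dataset_name")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")

    dataset_kwargs: Dict[str, Any] = {"dataset_name": dataset_name}
    # Metadata is forwarded only when it is present and non-empty
    if kwargs.get("dataset_meta"):
        dataset_kwargs["dataset_meta"] = kwargs["dataset_meta"]
    return dataset_kwargs


forwarded = build_dataset_kwargs(
    dataset_name="reviews", dataset_meta={"owner": "ml-team"}, unused=1
)
print(forwarded)
```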

`delete_dataset(**kwargs: Any) -> None`

Deletes a dataset from the annotation interface.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Additional keyword arguments to pass to the Prodigy client.	`{}`

Raises:

Type	Description
`ValueError`	If the dataset name is not provided or if the dataset does not exist.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def delete_dataset(self, **kwargs: Any) -> None:
    """Deletes a dataset from the annotation interface.

    Args:
        **kwargs: Additional keyword arguments to pass to the Prodigy
            client.

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    db = self._get_db()
    if not (dataset_name := kwargs.get("dataset_name")):
        raise ValueError("`dataset_name` keyword argument is required.")
    try:
        db.drop_dataset(name=dataset_name)
    except ProdigyError as e:
        # see https://support.prodi.gy/t/how-to-import-datasetdoesnotexist-error/7205
        if type(e).__name__ == "DatasetNotFound":
            raise ValueError(
                f"Dataset name '{dataset_name}' does not exist."
            ) from e
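
The source above identifies the "dataset not found" case by comparing the exception's class name instead of importing the class. The sketch below shows that pattern with stand-in exception classes, since Prodigy itself is not imported here; every name in it (`FakeProdigyError`, `DatabaseLocked`, `drop_dataset`) is hypothetical. It also makes visible that, as written, any other error of the base type is not re-raised.

```python
from typing import Callable


class FakeProdigyError(Exception):
    """Stand-in for the base error type caught in the source."""


class DatasetNotFound(FakeProdigyError):
    """Stand-in whose class name matches the string checked in the source."""


class DatabaseLocked(FakeProdigyError):
    """Stand-in for any other error of the same base type."""


def drop_dataset(dataset_name: str, drop_fn: Callable[[str], None]) -> None:
    try:
        drop_fn(dataset_name)
    except FakeProdigyError as e:
        # Match on the class name so the class need not be imported
        if type(e).__name__ == "DatasetNotFound":
            raise ValueError(
                f"Dataset name '{dataset_name}' does not exist."
            ) from e
        # Other errors of this type fall through without being re-raised


def missing(name: str) -> None:
    raise DatasetNotFound(name)


def locked(name: str) -> None:
    raise DatabaseLocked(name)
```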

`get_dataset(**kwargs: Any) -> Any`

Gets the dataset metadata for the given name.

If you would like the labeled data, use `get_labeled_data` instead.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Additional keyword arguments to pass to the Prodigy client.	`{}`

Returns:

Type	Description
`Any`	The metadata associated with a Prodigy dataset

Raises:

Type	Description
`ValueError`	If the dataset does not exist. As implemented, a missing `dataset_name` does not raise; the method returns `None`.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_dataset(self, **kwargs: Any) -> Any:
    """Gets the dataset metadata for the given name.

    If you would like the labeled data, use `get_labeled_data` instead.

    Args:
        **kwargs: Additional keyword arguments to pass to the Prodigy client.

    Returns:
        The metadata associated with a Prodigy dataset

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    db = self._get_db()
    if dataset_name := kwargs.get("dataset_name"):
        try:
            return db.get_meta(name=dataset_name)
        except Exception as e:
            raise ValueError(
                f"Dataset name '{dataset_name}' does not exist."
            ) from e

`get_dataset_names() -> List[str]`

Gets the names of the datasets.

Returns:

Type	Description
`List[str]`	A list of dataset names.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_dataset_names(self) -> List[str]:
    """Gets the names of the datasets.

    Returns:
        A list of dataset names.
    """
    return self.get_datasets()

`get_dataset_stats(dataset_name: str) -> Tuple[int, int]`

Gets the statistics of the given dataset.

Parameters:

Name	Type	Description	Default
`dataset_name`	`str`	The name of the dataset.	required

Returns:

Type	Description
`Tuple[int, int]`	A tuple `(labeled_task_count, unlabeled_task_count)` for the dataset. As implemented, the unlabeled count is always `0`.

Raises:

Type	Description
`IndexError`	If the dataset does not exist.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_dataset_stats(self, dataset_name: str) -> Tuple[int, int]:
    """Gets the statistics of the given dataset.

    Args:
        dataset_name: The name of the dataset.

    Returns:
        A tuple containing (labeled_task_count, unlabeled_task_count) for
            the dataset.

    Raises:
        IndexError: If the dataset does not exist.
    """
    db = self._get_db()
    try:
        labeled_data_count = db.count_dataset(name=dataset_name)
    except ValueError as e:
        raise IndexError(
            f"Dataset {dataset_name} does not exist. Please use `zenml "
            f"annotator dataset list` to list the available datasets."
        ) from e
    return (labeled_data_count, 0)
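
A short sketch of the same control flow, with the database call replaced by an injected counting function so it runs on its own. `dataset_stats` and the in-memory `counts` mapping are hypothetical stand-ins, not the Prodigy database API.

```python
from typing import Callable, Dict, Tuple


def dataset_stats(
    dataset_name: str, count_fn: Callable[[str], int]
) -> Tuple[int, int]:
    """Return (labeled, unlabeled) counts; unlabeled is fixed at 0."""
    try:
        labeled = count_fn(dataset_name)
    except ValueError as e:
        # A lookup failure is surfaced to the caller as IndexError
        raise IndexError(f"Dataset {dataset_name} does not exist.") from e
    return (labeled, 0)


counts: Dict[str, int] = {"reviews": 42}


def fake_count(name: str) -> int:
    if name not in counts:
        raise ValueError(name)
    return counts[name]


print(dataset_stats("reviews", fake_count))
```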

`get_datasets() -> List[Any]`

Gets the datasets currently available for annotation.

Returns:

Type	Description
`List[Any]`	A list of dataset names (strings).

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_datasets(self) -> List[Any]:
    """Gets the datasets currently available for annotation.

    Returns:
        A list of datasets (str).
    """
    datasets = self._get_db().datasets
    return cast(List[Any], datasets)

`get_labeled_data(**kwargs: Any) -> Any`

Gets the labeled data for the given dataset.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Additional keyword arguments to pass to the Prodigy client.	`{}`

Returns:

Type	Description
`Any`	A list of all examples in the dataset serialized to the Prodigy Task format.

Raises:

Type	Description
`ValueError`	If the dataset name is not provided or if the dataset does not exist.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_labeled_data(self, **kwargs: Any) -> Any:
    """Gets the labeled data for the given dataset.

    Args:
        **kwargs: Additional keyword arguments to pass to the Prodigy client.

    Returns:
        A list of all examples in the dataset serialized to the
            Prodigy Task format.

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    if dataset_name := kwargs.get("dataset_name"):
        return self._get_db().get_dataset_examples(dataset_name)
    else:
        raise ValueError("`dataset_name` keyword argument is required.")

`get_unlabeled_data(**kwargs: str) -> Any`

Gets the unlabeled data for the given dataset.

Parameters:

Name	Type	Description	Default
`**kwargs`	`str`	Additional keyword arguments to pass to the Prodigy client.	`{}`

Raises:

Type	Description
`NotImplementedError`	Prodigy doesn't allow fetching unlabeled data.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_unlabeled_data(self, **kwargs: str) -> Any:
    """Gets the unlabeled data for the given dataset.

    Args:
        **kwargs: Additional keyword arguments to pass to the Prodigy client.

    Raises:
        NotImplementedError: Prodigy doesn't allow fetching unlabeled data.
    """
    raise NotImplementedError(
        "Prodigy doesn't allow fetching unlabeled data."
    )

`get_url() -> str`

Gets the top-level URL of the annotation interface.

Returns:

Type	Description
`str`	The URL of the annotation interface.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_url(self) -> str:
    """Gets the top-level URL of the annotation interface.

    Returns:
        The URL of the annotation interface.
    """
    instance_url = DEFAULT_LOCAL_INSTANCE_HOST
    port = DEFAULT_LOCAL_PRODIGY_PORT
    if self.config.custom_config_path:
        with open(self.config.custom_config_path, "r") as f:
            config = json.load(f)
        instance_url = config.get("instance_url", instance_url)
        port = config.get("port", port)
    return f"http://{instance_url}:{port}"
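
The URL falls back to default host and port values unless a custom JSON config file overrides `instance_url` or `port`. The sketch below reproduces that lookup with the standard library. `resolve_url` is a hypothetical helper, and the `"localhost"` and `8080` defaults are assumed placeholders; the actual values of `DEFAULT_LOCAL_INSTANCE_HOST` and `DEFAULT_LOCAL_PRODIGY_PORT` are not shown on this page.

```python
import json
import os
import tempfile
from typing import Optional

# Assumed placeholder defaults, standing in for the integration's constants
DEFAULT_HOST = "localhost"
DEFAULT_PORT = 8080


def resolve_url(custom_config_path: Optional[str] = None) -> str:
    """Build the interface URL, letting a JSON config override defaults."""
    host, port = DEFAULT_HOST, DEFAULT_PORT
    if custom_config_path:
        with open(custom_config_path, "r") as f:
            config = json.load(f)
        # Keys missing from the file keep their default values
        host = config.get("instance_url", host)
        port = config.get("port", port)
    return f"http://{host}:{port}"


default_url = resolve_url()

# Write a throwaway config file that overrides only the port
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "prodigy.json")
    with open(path, "w") as f:
        json.dump({"port": 9000}, f)
    custom_url = resolve_url(path)

print(default_url, custom_url)
```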

`get_url_for_dataset(dataset_name: str) -> str`

Gets the URL of the annotation interface for the given dataset.

Prodigy does not support dataset-specific URLs, so this method returns the top-level URL since that's what will be served for the user.

Parameters:

Name	Type	Description	Default
`dataset_name`	`str`	The name of the dataset. (Unused)	required

Returns:

Type	Description
`str`	The URL of the annotation interface.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_url_for_dataset(self, dataset_name: str) -> str:
    """Gets the URL of the annotation interface for the given dataset.

    Prodigy does not support dataset-specific URLs, so this method returns
    the top-level URL since that's what will be served for the user.

    Args:
        dataset_name: The name of the dataset. (Unused)

    Returns:
        The URL of the annotation interface.
    """
    return self.get_url()

`launch(**kwargs: Any) -> None`

Launches the annotation interface.

This method extracts the 'command' and additional config parameters from kwargs.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Should include `command`, the full recipe command without the leading "prodigy". Any other keyword arguments are treated as config parameters that override the project-specific, global, and recipe config.	`{}`

Raises:

Type	Description
`ValueError`	If the 'command' keyword argument is not provided.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def launch(self, **kwargs: Any) -> None:
    """Launches the annotation interface.

    This method extracts the 'command' and additional config
        parameters from kwargs.

    Args:
        **kwargs: Should include:
            - command: The full recipe command without "prodigy".
            - Any additional config parameters to overwrite the
                project-specific, global, and recipe config.

    Raises:
        ValueError: If the 'command' keyword argument is not provided.
    """
    command = kwargs.get("command")
    if not command:
        raise ValueError(
            "The 'command' keyword argument is required for launching Prodigy."
        )

    # Remove 'command' from kwargs to pass the rest as config parameters
    config = {
        key: value for key, value in kwargs.items() if key != "command"
    }
    prodigy.serve(command=command, **config)
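
The keyword arguments are split into the required `command` and a dictionary of remaining config overrides before being handed to `prodigy.serve`. The sketch below isolates that split so it can be run without Prodigy. `split_launch_kwargs` is a hypothetical helper, and the recipe command and `port` override in the usage lines are illustrative values only.

```python
from typing import Any, Dict, Tuple


def split_launch_kwargs(**kwargs: Any) -> Tuple[str, Dict[str, Any]]:
    """Separate the recipe command from the config overrides."""
    command = kwargs.get("command")
    if not command:
        raise ValueError(
            "The 'command' keyword argument is required for launching Prodigy."
        )
    # Everything except 'command' is treated as a config override
    config = {key: value for key, value in kwargs.items() if key != "command"}
    return command, config


command, config = split_launch_kwargs(
    command="textcat.manual news_topics ./news.jsonl --label A,B",
    port=8081,
)
print(command, config)
```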

Modules

`prodigy_annotator`

Implementation of the Prodigy annotation integration.

Classes

`ProdigyAnnotator(name: str, id: UUID, config: StackComponentConfig, flavor: str, type: StackComponentType, user: Optional[UUID], created: datetime, updated: datetime, labels: Optional[Dict[str, Any]] = None, connector_requirements: Optional[ServiceConnectorRequirements] = None, connector: Optional[UUID] = None, connector_resource_id: Optional[str] = None, *args: Any, **kwargs: Any)`

Bases: BaseAnnotator, AuthenticationMixin

Class to interact with the Prodigy annotation interface.

Source code in src/zenml/stack/stack_component.py

def __init__(
    self,
    name: str,
    id: UUID,
    config: StackComponentConfig,
    flavor: str,
    type: StackComponentType,
    user: Optional[UUID],
    created: datetime,
    updated: datetime,
    labels: Optional[Dict[str, Any]] = None,
    connector_requirements: Optional[ServiceConnectorRequirements] = None,
    connector: Optional[UUID] = None,
    connector_resource_id: Optional[str] = None,
    *args: Any,
    **kwargs: Any,
):
    """Initializes a StackComponent.

    Args:
        name: The name of the component.
        id: The unique ID of the component.
        config: The config of the component.
        flavor: The flavor of the component.
        type: The type of the component.
        user: The ID of the user who created the component.
        created: The creation time of the component.
        updated: The last update time of the component.
        labels: The labels of the component.
        connector_requirements: The requirements for the connector.
        connector: The ID of a connector linked to the component.
        connector_resource_id: The custom resource ID to access through
            the connector.
        *args: Additional positional arguments.
        **kwargs: Additional keyword arguments.

    Raises:
        ValueError: If a secret reference is passed as name.
    """
    if secret_utils.is_secret_reference(name):
        raise ValueError(
            "Passing the `name` attribute of a stack component as a "
            "secret reference is not allowed."
        )

    self.id = id
    self.name = name
    self._config = config
    self.flavor = flavor
    self.type = type
    self.user = user
    self.created = created
    self.updated = updated
    self.labels = labels
    self.connector_requirements = connector_requirements
    self.connector = connector
    self.connector_resource_id = connector_resource_id
    self._connector_instance: Optional[ServiceConnector] = None

Attributes

`config: ProdigyAnnotatorConfig` `property`

Returns the ProdigyAnnotatorConfig config.

Returns:

Type	Description
`ProdigyAnnotatorConfig`	The configuration.

Functions

`add_dataset(**kwargs: Any) -> Any`

Registers a dataset for annotation.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Additional keyword arguments to pass to the Prodigy client.	`{}`

Returns:

Type	Description
`Any`	A Prodigy list representing the dataset.

Raises:

Type	Description
`ValueError`	If `dataset_name` isn't provided.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def add_dataset(self, **kwargs: Any) -> Any:
    """Registers a dataset for annotation.

    Args:
        **kwargs: Additional keyword arguments to pass to the Prodigy client.

    Returns:
        A Prodigy list representing the dataset.

    Raises:
        ValueError: If `dataset_name` isn't provided.
    """
    db = self._get_db()
    dataset_kwargs = {"dataset_name": kwargs.get("dataset_name")}
    if not dataset_kwargs["dataset_name"]:
        raise ValueError("`dataset_name` keyword argument is required.")

    if kwargs.get("dataset_meta"):
        dataset_kwargs["dataset_meta"] = kwargs.get("dataset_meta")
    return db.add_dataset(**dataset_kwargs)

delete_dataset(**kwargs: Any) -> None

Deletes a dataset from the annotation interface.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Additional keyword arguments to pass to the Prodigy client.	`{}`

Raises:

Type	Description
`ValueError`	If the dataset name is not provided or if the dataset does not exist.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def delete_dataset(self, **kwargs: Any) -> None:
    """Deletes a dataset from the annotation interface.

    Args:
        **kwargs: Additional keyword arguments to pass to the Prodigy
            client.

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    db = self._get_db()
    if not (dataset_name := kwargs.get("dataset_name")):
        raise ValueError("`dataset_name` keyword argument is required.")
    try:
        db.drop_dataset(name=dataset_name)
    except ProdigyError as e:
        # see https://support.prodi.gy/t/how-to-import-datasetdoesnotexist-error/7205
        if type(e).__name__ == "DatasetNotFound":
            raise ValueError(
                f"Dataset name '{dataset_name}' does not exist."
            ) from e
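
The error translation above can be sketched in isolation. The classes below are hypothetical stand-ins (not Prodigy's real classes) that only reproduce the shape the source relies on: a base `ProdigyError` and a subclass whose class name is `DatasetNotFound`.

```python
from typing import Any, Iterable, Optional, Set


class ProdigyError(Exception):
    """Hypothetical stand-in for Prodigy's base error class."""


class DatasetNotFound(ProdigyError):
    """Hypothetical stand-in; the code above matches on this class name."""


class FakeDB:
    """Hypothetical in-memory database holding only dataset names."""

    def __init__(self, names: Iterable[str]) -> None:
        self.names: Set[str] = set(names)

    def drop_dataset(self, name: str) -> None:
        if name not in self.names:
            raise DatasetNotFound(name)
        self.names.remove(name)


def delete_dataset(db: FakeDB, **kwargs: Any) -> None:
    """Mirrors the control flow of the source shown above."""
    if not (dataset_name := kwargs.get("dataset_name")):
        raise ValueError("`dataset_name` keyword argument is required.")
    try:
        db.drop_dataset(name=dataset_name)
    except ProdigyError as e:
        # Only the "not found" case is translated into a ValueError.
        if type(e).__name__ == "DatasetNotFound":
            raise ValueError(
                f"Dataset name '{dataset_name}' does not exist."
            ) from e


db = FakeDB(["reviews"])
delete_dataset(db, dataset_name="reviews")  # succeeds, dataset removed

missing_error: Optional[ValueError] = None
try:
    delete_dataset(db, dataset_name="reviews")  # already gone
except ValueError as err:
    missing_error = err
```

As the source is written, a `ProdigyError` of any other type is caught and not re-raised, so callers only see an exception for the missing-name and missing-dataset cases.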

get_dataset(**kwargs: Any) -> Any

Gets the dataset metadata for the given name.

If you would like the labeled data, use get_labeled_data instead.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Additional keyword arguments to pass to the Prodigy client.	`{}`

Returns:

Type	Description
`Any`	The metadata associated with a Prodigy dataset.

Raises:

Type	Description
`ValueError`	If the dataset does not exist. Note that, in the source shown below, omitting `dataset_name` returns `None` rather than raising.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_dataset(self, **kwargs: Any) -> Any:
    """Gets the dataset metadata for the given name.

    If you would like the labeled data, use `get_labeled_data` instead.

    Args:
        **kwargs: Additional keyword arguments to pass to the Prodigy client.

    Returns:
        The metadata associated with a Prodigy dataset

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    db = self._get_db()
    if dataset_name := kwargs.get("dataset_name"):
        try:
            return db.get_meta(name=dataset_name)
        except Exception as e:
            raise ValueError(
                f"Dataset name '{dataset_name}' does not exist."
            ) from e

get_dataset_names() -> List[str]

Gets the names of the datasets.

Returns:

Type	Description
`List[str]`	A list of dataset names.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_dataset_names(self) -> List[str]:
    """Gets the names of the datasets.

    Returns:
        A list of dataset names.
    """
    return self.get_datasets()

get_dataset_stats(dataset_name: str) -> Tuple[int, int]

Gets the statistics of the given dataset.

Parameters:

Name	Type	Description	Default
`dataset_name`	`str`	The name of the dataset.	required

Returns:

Type	Description
`Tuple[int, int]`	A tuple containing (labeled_task_count, unlabeled_task_count) for the dataset.

Raises:

Type	Description
`IndexError`	If the dataset does not exist.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_dataset_stats(self, dataset_name: str) -> Tuple[int, int]:
    """Gets the statistics of the given dataset.

    Args:
        dataset_name: The name of the dataset.

    Returns:
        A tuple containing (labeled_task_count, unlabeled_task_count) for
            the dataset.

    Raises:
        IndexError: If the dataset does not exist.
    """
    db = self._get_db()
    try:
        labeled_data_count = db.count_dataset(name=dataset_name)
    except ValueError as e:
        raise IndexError(
            f"Dataset {dataset_name} does not exist. Please use `zenml "
            f"annotator dataset list` to list the available datasets."
        ) from e
    return (labeled_data_count, 0)

get_datasets() -> List[Any]

Gets the datasets currently available for annotation.

Returns:

Type	Description
`List[Any]`	A list of datasets (str).

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_datasets(self) -> List[Any]:
    """Gets the datasets currently available for annotation.

    Returns:
        A list of datasets (str).
    """
    datasets = self._get_db().datasets
    return cast(List[Any], datasets)

get_labeled_data(**kwargs: Any) -> Any

Gets the labeled data for the given dataset.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Additional keyword arguments to pass to the Prodigy client.	`{}`

Returns:

Type	Description
`Any`	A list of all examples in the dataset serialized to the Prodigy Task format.

Raises:

Type	Description
`ValueError`	If the dataset name is not provided or if the dataset does not exist.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_labeled_data(self, **kwargs: Any) -> Any:
    """Gets the labeled data for the given dataset.

    Args:
        **kwargs: Additional keyword arguments to pass to the Prodigy client.

    Returns:
        A list of all examples in the dataset serialized to the
            Prodigy Task format.

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    if dataset_name := kwargs.get("dataset_name"):
        return self._get_db().get_dataset_examples(dataset_name)
    else:
        raise ValueError("`dataset_name` keyword argument is required.")
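
Once the labeled examples are retrieved, they are typically filtered by the annotator's decision. The sketch below assumes each example is a dictionary carrying an `answer` key (`accept`, `reject`, or `ignore`), as in Prodigy's task format; the sample records and the `accepted_examples` helper are hypothetical and stand in for the return value of `get_labeled_data(dataset_name=...)`.

```python
from collections import Counter
from typing import Any, Dict, List

# Hypothetical sample standing in for the list returned by get_labeled_data.
examples: List[Dict[str, Any]] = [
    {"text": "Great product", "label": "POSITIVE", "answer": "accept"},
    {"text": "Arrived late", "label": "POSITIVE", "answer": "reject"},
    {"text": "???", "label": "POSITIVE", "answer": "ignore"},
    {"text": "Works as described", "label": "POSITIVE", "answer": "accept"},
]


def accepted_examples(tasks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keeps only the examples the annotator accepted."""
    return [task for task in tasks if task.get("answer") == "accept"]


accepted = accepted_examples(examples)
answer_counts = Counter(task.get("answer") for task in examples)
```

Counting answers first is a cheap sanity check before building a training set from the accepted examples.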

get_unlabeled_data(**kwargs: str) -> Any

Gets the unlabeled data for the given dataset.

Parameters:

Name	Type	Description	Default
`**kwargs`	`str`	Additional keyword arguments to pass to the Prodigy client.	`{}`

Raises:

Type	Description
`NotImplementedError`	Prodigy doesn't allow fetching unlabeled data.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_unlabeled_data(self, **kwargs: str) -> Any:
    """Gets the unlabeled data for the given dataset.

    Args:
        **kwargs: Additional keyword arguments to pass to the Prodigy client.

    Raises:
        NotImplementedError: Prodigy doesn't allow fetching unlabeled data.
    """
    raise NotImplementedError(
        "Prodigy doesn't allow fetching unlabeled data."
    )

get_url() -> str

Gets the top-level URL of the annotation interface.

Returns:

Type	Description
`str`	The URL of the annotation interface.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_url(self) -> str:
    """Gets the top-level URL of the annotation interface.

    Returns:
        The URL of the annotation interface.
    """
    instance_url = DEFAULT_LOCAL_INSTANCE_HOST
    port = DEFAULT_LOCAL_PRODIGY_PORT
    if self.config.custom_config_path:
        with open(self.config.custom_config_path, "r") as f:
            config = json.load(f)
        instance_url = config.get("instance_url", instance_url)
        port = config.get("port", port)
    return f"http://{instance_url}:{port}"
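
The URL resolution above can be tried without a ZenML stack. This is a sketch under stated assumptions: the two default values below are guesses (the real constants are defined in the ZenML source), and `resolve_url` is a hypothetical free function that copies the logic of `get_url`, including the `instance_url` and `port` keys it reads from the JSON file.

```python
import json
import os
import tempfile
from typing import Optional

# Assumed defaults for illustration; check the ZenML source for real values.
DEFAULT_LOCAL_INSTANCE_HOST = "localhost"
DEFAULT_LOCAL_PRODIGY_PORT = 8080


def resolve_url(custom_config_path: Optional[str] = None) -> str:
    """Copies the lookup order used by get_url above."""
    instance_url = DEFAULT_LOCAL_INSTANCE_HOST
    port = DEFAULT_LOCAL_PRODIGY_PORT
    if custom_config_path:
        with open(custom_config_path, "r") as f:
            config = json.load(f)
        # Keys missing from the file fall back to the defaults.
        instance_url = config.get("instance_url", instance_url)
        port = config.get("port", port)
    return f"http://{instance_url}:{port}"


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "prodigy.json")
    with open(path, "w") as f:
        json.dump({"instance_url": "annotate.internal", "port": 9090}, f)
    custom_url = resolve_url(path)

default_url = resolve_url()
```

The scheme is always `http` in the source shown, so a config file cannot switch the returned URL to `https`.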

get_url_for_dataset(dataset_name: str) -> str

Gets the URL of the annotation interface for the given dataset.

Prodigy does not support dataset-specific URLs, so this method returns the top-level URL since that's what will be served for the user.

Parameters:

Name	Type	Description	Default
`dataset_name`	`str`	The name of the dataset. (Unused)	required

Returns:

Type	Description
`str`	The URL of the annotation interface.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def get_url_for_dataset(self, dataset_name: str) -> str:
    """Gets the URL of the annotation interface for the given dataset.

    Prodigy does not support dataset-specific URLs, so this method returns
    the top-level URL since that's what will be served for the user.

    Args:
        dataset_name: The name of the dataset. (Unused)

    Returns:
        The URL of the annotation interface.
    """
    return self.get_url()

launch(**kwargs: Any) -> None

Launches the annotation interface.

This method extracts the 'command' and additional config parameters from kwargs.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Should include `command` (the full recipe command without the leading "prodigy"), plus any additional config parameters that override the project-specific, global, and recipe config.	`{}`

Raises:

Type	Description
`ValueError`	If the 'command' keyword argument is not provided.

Source code in src/zenml/integrations/prodigy/annotators/prodigy_annotator.py

def launch(self, **kwargs: Any) -> None:
    """Launches the annotation interface.

    This method extracts the 'command' and additional config
        parameters from kwargs.

    Args:
        **kwargs: Should include:
            - command: The full recipe command without "prodigy".
            - Any additional config parameters to overwrite the
                project-specific, global, and recipe config.

    Raises:
        ValueError: If the 'command' keyword argument is not provided.
    """
    command = kwargs.get("command")
    if not command:
        raise ValueError(
            "The 'command' keyword argument is required for launching Prodigy."
        )

    # Remove 'command' from kwargs to pass the rest as config parameters
    config = {
        key: value for key, value in kwargs.items() if key != "command"
    }
    prodigy.serve(command=command, **config)
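
To illustrate how `launch` separates the recipe command from the config overrides, here is a small sketch. `split_launch_kwargs` is a hypothetical helper that reproduces only the keyword handling, so nothing is actually served; the dataset name and file path in the command are made up for the example.

```python
from typing import Any, Dict, Tuple


def split_launch_kwargs(**kwargs: Any) -> Tuple[str, Dict[str, Any]]:
    """Reproduces the keyword handling of launch without starting a server."""
    command = kwargs.get("command")
    if not command:
        raise ValueError(
            "The 'command' keyword argument is required for launching Prodigy."
        )
    # Everything except 'command' is treated as a config override.
    config = {key: value for key, value in kwargs.items() if key != "command"}
    return command, config


command, config = split_launch_kwargs(
    command="textcat.manual news_topics ./news.jsonl --label POLITICS,SPORTS",
    port=8081,
)
```

In the real method the two values are then passed on as `prodigy.serve(command=command, **config)`, as shown in the source above.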

Functions

`flavors`

Prodigy integration flavors.

Classes

`ProdigyAnnotatorConfig(warn_about_plain_text_secrets: bool = False, **kwargs: Any)`

Bases: BaseAnnotatorConfig, AuthenticationConfigMixin

Config for the Prodigy annotator.

This config allows you to override the default Prodigy config. See https://prodi.gy/docs/install#config for more on custom config files.

Attributes:

Name	Type	Description
`custom_config_path`	`Optional[str]`	The path to a custom config file for Prodigy.

Source code in src/zenml/stack/stack_component.py

def __init__(
    self, warn_about_plain_text_secrets: bool = False, **kwargs: Any
) -> None:
    """Ensures that secret references don't clash with pydantic validation.

    StackComponents allow the specification of all their string attributes
    using secret references of the form `{{secret_name.key}}`. This however
    is only possible when the stack component does not perform any explicit
    validation of this attribute using pydantic validators. If this were
    the case, the validation would run on the secret reference and would
    fail or in the worst case, modify the secret reference and lead to
    unexpected behavior. This method ensures that no attributes that require
    custom pydantic validation are set as secret references.

    Args:
        warn_about_plain_text_secrets: If true, then warns about using
            plain-text secrets.
        **kwargs: Arguments to initialize this stack component.

    Raises:
        ValueError: If an attribute that requires custom pydantic validation
            is passed as a secret reference, or if the `name` attribute
            was passed as a secret reference.
    """
    for key, value in kwargs.items():
        try:
            field = self.__class__.model_fields[key]
        except KeyError:
            # Value for a private attribute or non-existing field, this
            # will fail during the upcoming pydantic validation
            continue

        if value is None:
            continue

        if not secret_utils.is_secret_reference(value):
            if (
                secret_utils.is_secret_field(field)
                and warn_about_plain_text_secrets
            ):
                logger.warning(
                    "You specified a plain-text value for the sensitive "
                    f"attribute `{key}` for a `{self.__class__.__name__}` "
                    "stack component. This is currently only a warning, "
                    "but future versions of ZenML will require you to pass "
                    "in sensitive information as secrets. Check out the "
                    "documentation on how to configure your stack "
                    "components with secrets here: "
                    "https://docs.zenml.io/getting-started/deploying-zenml/secret-management"
                )
            continue

        if pydantic_utils.has_validators(
            pydantic_class=self.__class__, field_name=key
        ):
            raise ValueError(
                f"Passing the stack component attribute `{key}` as a "
                "secret reference is not allowed as additional validation "
                "is required for this attribute."
            )

    super().__init__(**kwargs)
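
The docstring above describes secret references of the form `{{secret_name.key}}`. The sketch below shows one way such a value could be recognized. The regular expression and the `looks_like_secret_reference` helper are assumptions made for illustration; ZenML's actual check is `secret_utils.is_secret_reference`, whose implementation is not shown here.

```python
import re
from typing import Any

# Hypothetical pattern: "{{", a secret name, a dot, a key, "}}".
_SECRET_REFERENCE = re.compile(r"\{\{\s*([^.\s{}]+)\.([^\s{}]+)\s*\}\}")


def looks_like_secret_reference(value: Any) -> bool:
    """Returns True if the whole value has the {{secret_name.key}} shape."""
    return isinstance(value, str) and bool(_SECRET_REFERENCE.fullmatch(value))


checks = {
    "{{prodigy_creds.token}}": looks_like_secret_reference(
        "{{prodigy_creds.token}}"
    ),
    "plain-text-token": looks_like_secret_reference("plain-text-token"),
    "{{no_key}}": looks_like_secret_reference("{{no_key}}"),
}
none_is_reference = looks_like_secret_reference(None)
```

Values that match are passed through unvalidated, which is why the constructor rejects them for any field that has its own pydantic validators.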

`ProdigyAnnotatorFlavor`

Bases: BaseAnnotatorFlavor

Prodigy annotator flavor.

Attributes

config_class: Type[ProdigyAnnotatorConfig] property

Returns ProdigyAnnotatorConfig config class.

Returns:

Type	Description
`Type[ProdigyAnnotatorConfig]`	The config class.

docs_url: Optional[str] property

A url to point at docs explaining this flavor.

Returns:

Type	Description
`Optional[str]`	A flavor docs url.

implementation_class: Type[ProdigyAnnotator] property

Implementation class for this flavor.

Returns:

Type	Description
`Type[ProdigyAnnotator]`	The implementation class.

logo_url: str property

A url to represent the flavor in the dashboard.

Returns:

Type	Description
`str`	The flavor logo.

name: str property

Name of the flavor.

Returns:

Type	Description
`str`	The name of the flavor.

sdk_docs_url: Optional[str] property

A url to point at SDK docs explaining this flavor.

Returns:

Type	Description
`Optional[str]`	A flavor SDK docs url.

Modules

`prodigy_annotator_flavor`

Prodigy annotator flavor.

Classes

ProdigyAnnotatorConfig(warn_about_plain_text_secrets: bool = False, **kwargs: Any)

Bases: BaseAnnotatorConfig, AuthenticationConfigMixin

Config for the Prodigy annotator.

This config allows you to override the default Prodigy config. See https://prodi.gy/docs/install#config for more on custom config files.

Attributes:

Name	Type	Description
`custom_config_path`	`Optional[str]`	The path to a custom config file for Prodigy.

Source code in src/zenml/stack/stack_component.py

def __init__(
    self, warn_about_plain_text_secrets: bool = False, **kwargs: Any
) -> None:
    """Ensures that secret references don't clash with pydantic validation.

    StackComponents allow the specification of all their string attributes
    using secret references of the form `{{secret_name.key}}`. This however
    is only possible when the stack component does not perform any explicit
    validation of this attribute using pydantic validators. If this were
    the case, the validation would run on the secret reference and would
    fail or in the worst case, modify the secret reference and lead to
    unexpected behavior. This method ensures that no attributes that require
    custom pydantic validation are set as secret references.

    Args:
        warn_about_plain_text_secrets: If true, then warns about using
            plain-text secrets.
        **kwargs: Arguments to initialize this stack component.

    Raises:
        ValueError: If an attribute that requires custom pydantic validation
            is passed as a secret reference, or if the `name` attribute
            was passed as a secret reference.
    """
    for key, value in kwargs.items():
        try:
            field = self.__class__.model_fields[key]
        except KeyError:
            # Value for a private attribute or non-existing field, this
            # will fail during the upcoming pydantic validation
            continue

        if value is None:
            continue

        if not secret_utils.is_secret_reference(value):
            if (
                secret_utils.is_secret_field(field)
                and warn_about_plain_text_secrets
            ):
                logger.warning(
                    "You specified a plain-text value for the sensitive "
                    f"attribute `{key}` for a `{self.__class__.__name__}` "
                    "stack component. This is currently only a warning, "
                    "but future versions of ZenML will require you to pass "
                    "in sensitive information as secrets. Check out the "
                    "documentation on how to configure your stack "
                    "components with secrets here: "
                    "https://docs.zenml.io/getting-started/deploying-zenml/secret-management"
                )
            continue

        if pydantic_utils.has_validators(
            pydantic_class=self.__class__, field_name=key
        ):
            raise ValueError(
                f"Passing the stack component attribute `{key}` as a "
                "secret reference is not allowed as additional validation "
                "is required for this attribute."
            )

    super().__init__(**kwargs)

ProdigyAnnotatorFlavor

Bases: BaseAnnotatorFlavor

Prodigy annotator flavor.

Attributes

config_class: Type[ProdigyAnnotatorConfig] property

Returns ProdigyAnnotatorConfig config class.

Returns:

Type	Description
`Type[ProdigyAnnotatorConfig]`	The config class.

docs_url: Optional[str] property

A url to point at docs explaining this flavor.

Returns:

Type	Description
`Optional[str]`	A flavor docs url.

implementation_class: Type[ProdigyAnnotator] property

Implementation class for this flavor.

Returns:

Type	Description
`Type[ProdigyAnnotator]`	The implementation class.

logo_url: str property

A url to represent the flavor in the dashboard.

Returns:

Type	Description
`str`	The flavor logo.

name: str property

Name of the flavor.

Returns:

Type	Description
`str`	The name of the flavor.

sdk_docs_url: Optional[str] property

A url to point at SDK docs explaining this flavor.

Returns:

Type	Description
`Optional[str]`	A flavor SDK docs url.
