Label Studio

zenml.integrations.label_studio

Initialization of the Label Studio integration.

Attributes

LABEL_STUDIO = 'label_studio' module-attribute

LABEL_STUDIO_ANNOTATOR_FLAVOR = 'label_studio' module-attribute

Classes

Flavor

Class for ZenML Flavors.

Attributes
config_class: Type[StackComponentConfig] abstractmethod property

Returns StackComponentConfig config class.

Returns:

Type Description
Type[StackComponentConfig]

The config class.

config_schema: Dict[str, Any] property

The config schema for a flavor.

Returns:

Type Description
Dict[str, Any]

The config schema.

docs_url: Optional[str] property

A url to point at docs explaining this flavor.

Returns:

Type Description
Optional[str]

A flavor docs url.

implementation_class: Type[StackComponent] abstractmethod property

Implementation class for this flavor.

Returns:

Type Description
Type[StackComponent]

The implementation class for this flavor.

logo_url: Optional[str] property

A url to represent the flavor in the dashboard.

Returns:

Type Description
Optional[str]

The flavor logo.

name: str abstractmethod property

The flavor name.

Returns:

Type Description
str

The flavor name.

sdk_docs_url: Optional[str] property

A url to point at SDK docs explaining this flavor.

Returns:

Type Description
Optional[str]

A flavor SDK docs url.

service_connector_requirements: Optional[ServiceConnectorRequirements] property

Service connector resource requirements for this flavor.

Specifies resource requirements that are used to filter the available service connector types that are compatible with this flavor.

Returns:

Type Description
Optional[ServiceConnectorRequirements]

Requirements for compatible service connectors, if a service connector is required for this flavor.

type: StackComponentType abstractmethod property

The stack component type.

Returns:

Type Description
StackComponentType

The stack component type.

Functions
from_model(flavor_model: FlavorResponse) -> Flavor classmethod

Loads a flavor from a model.

Parameters:

Name Type Description Default
flavor_model FlavorResponse

The model to load from.

required

Raises:

Type Description
CustomFlavorImportError

If the custom flavor can't be imported.

ImportError

If the flavor can't be imported.

Returns:

Type Description
Flavor

The loaded flavor.

Source code in src/zenml/stack/flavor.py
@classmethod
def from_model(cls, flavor_model: FlavorResponse) -> "Flavor":
    """Loads a flavor from a model.

    Args:
        flavor_model: The model to load from.

    Raises:
        CustomFlavorImportError: If the custom flavor can't be imported.
        ImportError: If the flavor can't be imported.

    Returns:
        The loaded flavor.
    """
    try:
        flavor = source_utils.load(flavor_model.source)()
    except (ModuleNotFoundError, ImportError, NotImplementedError) as err:
        if flavor_model.is_custom:
            flavor_module, _ = flavor_model.source.rsplit(".", maxsplit=1)
            expected_file_path = os.path.join(
                source_utils.get_source_root(),
                flavor_module.replace(".", os.path.sep),
            )
            raise CustomFlavorImportError(
                f"Couldn't import custom flavor {flavor_model.name}: "
                f"{err}. Make sure the custom flavor class "
                f"`{flavor_model.source}` is importable. If it is part of "
                "a library, make sure it is installed. If "
                "it is a local code file, make sure it exists at "
                f"`{expected_file_path}.py`."
            )
        else:
            raise ImportError(
                f"Couldn't import flavor {flavor_model.name}: {err}"
            )
    return cast(Flavor, flavor)
generate_default_docs_url() -> str

Generate the doc urls for all inbuilt and integration flavors.

Note that this method is not going to be useful for custom flavors, which do not have any docs in the main zenml docs.

Returns:

Type Description
str

The complete url to the zenml documentation

Source code in src/zenml/stack/flavor.py
def generate_default_docs_url(self) -> str:
    """Generate the doc urls for all inbuilt and integration flavors.

    Note that this method is not going to be useful for custom flavors,
    which do not have any docs in the main zenml docs.

    Returns:
        The complete url to the zenml documentation
    """
    from zenml import __version__

    component_type = self.type.plural.replace("_", "-")
    name = self.name.replace("_", "-")

    try:
        is_latest = is_latest_zenml_version()
    except RuntimeError:
        # We assume in error cases that we are on the latest version
        is_latest = True

    if is_latest:
        base = "https://docs.zenml.io"
    else:
        base = f"https://zenml-io.gitbook.io/zenml-legacy-documentation/v/{__version__}"
    return f"{base}/stack-components/{component_type}/{name}"
generate_default_sdk_docs_url() -> str

Generate SDK docs url for a flavor.

Returns:

Type Description
str

The complete url to the zenml SDK docs

Source code in src/zenml/stack/flavor.py
def generate_default_sdk_docs_url(self) -> str:
    """Generate SDK docs url for a flavor.

    Returns:
        The complete url to the zenml SDK docs
    """
    from zenml import __version__

    base = f"https://sdkdocs.zenml.io/{__version__}"

    component_type = self.type.plural

    if "zenml.integrations" in self.__module__:
        # Get integration name out of module path which will look something
        #  like this "zenml.integrations.<integration>....
        integration = self.__module__.split(
            "zenml.integrations.", maxsplit=1
        )[1].split(".")[0]

        return (
            f"{base}/integration_code_docs"
            f"/integrations-{integration}/#{self.__module__}"
        )

    else:
        return (
            f"{base}/core_code_docs/core-{component_type}/"
            f"#{self.__module__}"
        )
to_model(integration: Optional[str] = None, is_custom: bool = True) -> FlavorRequest

Converts a flavor to a model.

Parameters:

Name Type Description Default
integration Optional[str]

The integration to use for the model.

None
is_custom bool

Whether the flavor is a custom flavor. Custom flavors are then scoped by user and workspace

True

Returns:

Type Description
FlavorRequest

The model.

Source code in src/zenml/stack/flavor.py
def to_model(
    self,
    integration: Optional[str] = None,
    is_custom: bool = True,
) -> FlavorRequest:
    """Converts a flavor to a model.

    Args:
        integration: The integration to use for the model.
        is_custom: Whether the flavor is a custom flavor. Custom flavors
            are then scoped by user and workspace

    Returns:
        The model.
    """
    connector_requirements = self.service_connector_requirements
    connector_type = (
        connector_requirements.connector_type
        if connector_requirements
        else None
    )
    resource_type = (
        connector_requirements.resource_type
        if connector_requirements
        else None
    )
    resource_id_attr = (
        connector_requirements.resource_id_attr
        if connector_requirements
        else None
    )
    user = None
    workspace = None
    if is_custom:
        user = Client().active_user.id
        workspace = Client().active_workspace.id

    model_class = FlavorRequest if is_custom else InternalFlavorRequest
    model = model_class(
        user=user,
        workspace=workspace,
        name=self.name,
        type=self.type,
        source=source_utils.resolve(self.__class__).import_path,
        config_schema=self.config_schema,
        connector_type=connector_type,
        connector_resource_type=resource_type,
        connector_resource_id_attr=resource_id_attr,
        integration=integration,
        logo_url=self.logo_url,
        docs_url=self.docs_url,
        sdk_docs_url=self.sdk_docs_url,
        is_custom=is_custom,
    )
    return model
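A minimal sketch of serializing a flavor, assuming the Label Studio integration is installed; with is_custom=False the user/workspace lookup through the client is skipped:

from zenml.integrations.label_studio.flavors import LabelStudioAnnotatorFlavor

flavor = LabelStudioAnnotatorFlavor()
# Build a (non-custom) flavor request model, e.g. to inspect the resolved
# source path and config schema that would be sent to the server.
request = flavor.to_model(integration="label_studio", is_custom=False)
print(request.name, request.type, request.source)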

Integration

Base class for integration in ZenML.

Functions
activate() -> None classmethod

Abstract method to activate the integration.

Source code in src/zenml/integrations/integration.py
@classmethod
def activate(cls) -> None:
    """Abstract method to activate the integration."""
check_installation() -> bool classmethod

Method to check whether the required packages are installed.

Returns:

Type Description
bool

True if all required packages are installed, False otherwise.

Source code in src/zenml/integrations/integration.py
@classmethod
def check_installation(cls) -> bool:
    """Method to check whether the required packages are installed.

    Returns:
        True if all required packages are installed, False otherwise.
    """
    for r in cls.get_requirements():
        try:
            # First check if the base package is installed
            dist = pkg_resources.get_distribution(r)

            # Next, check if the dependencies (including extras) are
            # installed
            deps: List[Requirement] = []

            _, extras = parse_requirement(r)
            if extras:
                extra_list = extras[1:-1].split(",")
                for extra in extra_list:
                    try:
                        requirements = dist.requires(extras=[extra])  # type: ignore[arg-type]
                    except pkg_resources.UnknownExtra as e:
                        logger.debug(f"Unknown extra: {str(e)}")
                        return False
                    deps.extend(requirements)
            else:
                deps = dist.requires()

            for ri in deps:
                try:
                    # Remove the "extra == ..." part from the requirement string
                    cleaned_req = re.sub(
                        r"; extra == \"\w+\"", "", str(ri)
                    )
                    pkg_resources.get_distribution(cleaned_req)
                except pkg_resources.DistributionNotFound as e:
                    logger.debug(
                        f"Unable to find required dependency "
                        f"'{e.req}' for requirement '{r}' "
                        f"necessary for integration '{cls.NAME}'."
                    )
                    return False
                except pkg_resources.VersionConflict as e:
                    logger.debug(
                        f"Package version '{e.dist}' does not match "
                        f"version '{e.req}' required by '{r}' "
                        f"necessary for integration '{cls.NAME}'."
                    )
                    return False

        except pkg_resources.DistributionNotFound as e:
            logger.debug(
                f"Unable to find required package '{e.req}' for "
                f"integration {cls.NAME}."
            )
            return False
        except pkg_resources.VersionConflict as e:
            logger.debug(
                f"Package version '{e.dist}' does not match version "
                f"'{e.req}' necessary for integration {cls.NAME}."
            )
            return False

    logger.debug(
        f"Integration {cls.NAME} is installed correctly with "
        f"requirements {cls.get_requirements()}."
    )
    return True
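A minimal sketch of using this check before relying on the integration; the debug logs name the missing or conflicting requirement:

from zenml.integrations.label_studio import LabelStudioIntegration

# Returns False (rather than raising) if any requirement is missing.
if not LabelStudioIntegration.check_installation():
    print("Missing requirements:", LabelStudioIntegration.get_requirements())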
flavors() -> List[Type[Flavor]] classmethod

Abstract method to declare new stack component flavors.

Returns:

Type Description
List[Type[Flavor]]

A list of new stack component flavors.

Source code in src/zenml/integrations/integration.py
@classmethod
def flavors(cls) -> List[Type[Flavor]]:
    """Abstract method to declare new stack component flavors.

    Returns:
        A list of new stack component flavors.
    """
    return []
get_requirements(target_os: Optional[str] = None) -> List[str] classmethod

Method to get the requirements for the integration.

Parameters:

Name Type Description Default
target_os Optional[str]

The target operating system to get the requirements for.

None

Returns:

Type Description
List[str]

A list of requirements.

Source code in src/zenml/integrations/integration.py
@classmethod
def get_requirements(cls, target_os: Optional[str] = None) -> List[str]:
    """Method to get the requirements for the integration.

    Args:
        target_os: The target operating system to get the requirements for.

    Returns:
        A list of requirements.
    """
    return cls.REQUIREMENTS
get_uninstall_requirements(target_os: Optional[str] = None) -> List[str] classmethod

Method to get the uninstall requirements for the integration.

Parameters:

Name Type Description Default
target_os Optional[str]

The target operating system to get the requirements for.

None

Returns:

Type Description
List[str]

A list of requirements.

Source code in src/zenml/integrations/integration.py
@classmethod
def get_uninstall_requirements(
    cls, target_os: Optional[str] = None
) -> List[str]:
    """Method to get the uninstall requirements for the integration.

    Args:
        target_os: The target operating system to get the requirements for.

    Returns:
        A list of requirements.
    """
    ret = []
    for each in cls.get_requirements(target_os=target_os):
        is_ignored = False
        for ignored in cls.REQUIREMENTS_IGNORED_ON_UNINSTALL:
            if each.startswith(ignored):
                is_ignored = True
                break
        if not is_ignored:
            ret.append(each)
    return ret
plugin_flavors() -> List[Type[BasePluginFlavor]] classmethod

Abstract method to declare new plugin flavors.

Returns:

Type Description
List[Type[BasePluginFlavor]]

A list of new plugin flavors.

Source code in src/zenml/integrations/integration.py
@classmethod
def plugin_flavors(cls) -> List[Type["BasePluginFlavor"]]:
    """Abstract method to declare new plugin flavors.

    Returns:
        A list of new plugin flavors.
    """
    return []

LabelStudioIntegration

Bases: Integration

Definition of Label Studio integration for ZenML.

Functions
flavors() -> List[Type[Flavor]] classmethod

Declare the stack component flavors for the Label Studio integration.

Returns:

Type Description
List[Type[Flavor]]

List of stack component flavors for this integration.

Source code in src/zenml/integrations/label_studio/__init__.py
@classmethod
def flavors(cls) -> List[Type[Flavor]]:
    """Declare the stack component flavors for the Label Studio integration.

    Returns:
        List of stack component flavors for this integration.
    """
    from zenml.integrations.label_studio.flavors import (
        LabelStudioAnnotatorFlavor,
    )

    return [LabelStudioAnnotatorFlavor]
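A minimal sketch that enumerates the flavors contributed by this integration; each entry is a Flavor subclass, here the Label Studio annotator flavor:

from zenml.integrations.label_studio import LabelStudioIntegration

for flavor_class in LabelStudioIntegration.flavors():
    flavor = flavor_class()
    # e.g. the "label_studio" flavor for the annotator stack component type.
    print(flavor.name, flavor.type)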

Modules

annotators

Initialization of the Label Studio annotators submodule.

Classes
LabelStudioAnnotator(name: str, id: UUID, config: StackComponentConfig, flavor: str, type: StackComponentType, user: Optional[UUID], workspace: UUID, created: datetime, updated: datetime, labels: Optional[Dict[str, Any]] = None, connector_requirements: Optional[ServiceConnectorRequirements] = None, connector: Optional[UUID] = None, connector_resource_id: Optional[str] = None, *args: Any, **kwargs: Any)

Bases: BaseAnnotator, AuthenticationMixin

Class to interact with the Label Studio annotation interface.

Source code in src/zenml/stack/stack_component.py
def __init__(
    self,
    name: str,
    id: UUID,
    config: StackComponentConfig,
    flavor: str,
    type: StackComponentType,
    user: Optional[UUID],
    workspace: UUID,
    created: datetime,
    updated: datetime,
    labels: Optional[Dict[str, Any]] = None,
    connector_requirements: Optional[ServiceConnectorRequirements] = None,
    connector: Optional[UUID] = None,
    connector_resource_id: Optional[str] = None,
    *args: Any,
    **kwargs: Any,
):
    """Initializes a StackComponent.

    Args:
        name: The name of the component.
        id: The unique ID of the component.
        config: The config of the component.
        flavor: The flavor of the component.
        type: The type of the component.
        user: The ID of the user who created the component.
        workspace: The ID of the workspace the component belongs to.
        created: The creation time of the component.
        updated: The last update time of the component.
        labels: The labels of the component.
        connector_requirements: The requirements for the connector.
        connector: The ID of a connector linked to the component.
        connector_resource_id: The custom resource ID to access through
            the connector.
        *args: Additional positional arguments.
        **kwargs: Additional keyword arguments.

    Raises:
        ValueError: If a secret reference is passed as name.
    """
    if secret_utils.is_secret_reference(name):
        raise ValueError(
            "Passing the `name` attribute of a stack component as a "
            "secret reference is not allowed."
        )

    self.id = id
    self.name = name
    self._config = config
    self.flavor = flavor
    self.type = type
    self.user = user
    self.workspace = workspace
    self.created = created
    self.updated = updated
    self.labels = labels
    self.connector_requirements = connector_requirements
    self.connector = connector
    self.connector_resource_id = connector_resource_id
    self._connector_instance: Optional[ServiceConnector] = None
Attributes
config: LabelStudioAnnotatorConfig property

Returns the LabelStudioAnnotatorConfig config.

Returns:

Type Description
LabelStudioAnnotatorConfig

The configuration.

settings_class: Type[LabelStudioAnnotatorSettings] property

Settings class for the Label Studio annotator.

Returns:

Type Description
Type[LabelStudioAnnotatorSettings]

The settings class.

Functions
add_dataset(**kwargs: Any) -> Any

Registers a dataset for annotation.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments to pass to the Label Studio client.

{}

Returns:

Type Description
Any

A Label Studio Project object.

Raises:

Type Description
ValueError

If 'dataset_name' or 'label_config' isn't provided.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def add_dataset(self, **kwargs: Any) -> Any:
    """Registers a dataset for annotation.

    Args:
        **kwargs: Additional keyword arguments to pass to the Label Studio client.

    Returns:
        A Label Studio Project object.

    Raises:
        ValueError: If 'dataset_name' or 'label_config' isn't provided.
    """
    dataset_name = kwargs.get("dataset_name")
    label_config = kwargs.get("label_config")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")
    elif not label_config:
        raise ValueError("`label_config` keyword argument is required.")

    return self._get_client().start_project(
        title=dataset_name,
        label_config=label_config,
    )
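A minimal usage sketch, assuming a Label Studio annotator is registered in the active stack and the instance is reachable; the label config is a standard Label Studio XML labeling interface:

from zenml.client import Client

annotator = Client().active_stack.annotator

label_config = """
<View>
  <Image name="image" value="$image"/>
  <Choices name="choice" toName="image">
    <Choice value="cat"/>
    <Choice value="dog"/>
  </Choices>
</View>
"""

# Creates a new Label Studio project ("dataset") for annotation.
project = annotator.add_dataset(dataset_name="pets", label_config=label_config)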
connect_and_sync_external_storage(uri: str, params: LabelStudioDatasetSyncParameters, dataset: Project) -> Optional[Dict[str, Any]]

Syncs the external storage for the given project.

Parameters:

Name Type Description Default
uri str

URI of the storage source.

required
params LabelStudioDatasetSyncParameters

Parameters for the dataset.

required
dataset Project

Label Studio dataset.

required

Returns:

Type Description
Optional[Dict[str, Any]]

A dictionary containing the sync result.

Raises:

Type Description
ValueError

If the storage type is not supported.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def connect_and_sync_external_storage(
    self,
    uri: str,
    params: LabelStudioDatasetSyncParameters,
    dataset: Project,
) -> Optional[Dict[str, Any]]:
    """Syncs the external storage for the given project.

    Args:
        uri: URI of the storage source.
        params: Parameters for the dataset.
        dataset: Label Studio dataset.

    Returns:
        A dictionary containing the sync result.

    Raises:
        ValueError: If the storage type is not supported.
    """
    # TODO: check if proposed storage source has differing / new data
    # if self._storage_source_already_exists(uri, config, dataset):
    #     return None

    storage_connection_args = {
        "prefix": params.prefix,
        "regex_filter": params.regex_filter,
        "use_blob_urls": params.use_blob_urls,
        "presign": params.presign,
        "presign_ttl": params.presign_ttl,
        "title": dataset.get_params()["title"],
        "description": params.description,
    }
    if params.storage_type == "azure":
        if not params.azure_account_name or not params.azure_account_key:
            logger.warning(
                "Authentication credentials for Azure aren't fully "
                "provided. Please update the storage synchronization "
                "settings in the Label Studio web UI as per your needs."
            )
        storage = dataset.connect_azure_import_storage(
            container=uri,
            account_name=params.azure_account_name,
            account_key=params.azure_account_key,
            **storage_connection_args,
        )
    elif params.storage_type == "gcs":
        if not params.google_application_credentials:
            logger.warning(
                "Authentication credentials for Google Cloud Storage "
                "aren't fully provided. Please update the storage "
                "synchronization settings in the Label Studio web UI as "
                "per your needs."
            )
        storage = dataset.connect_google_import_storage(
            bucket=uri,
            google_application_credentials=params.google_application_credentials,
            **storage_connection_args,
        )
    elif params.storage_type == "s3":
        if (
            not params.aws_access_key_id
            or not params.aws_secret_access_key
        ):
            logger.warning(
                "Authentication credentials for S3 aren't fully provided."
                "Please update the storage synchronization settings in the "
                "Label Studio web UI as per your needs."
            )

        # temporary fix using client method until LS supports
        # recursive_scan in their SDK
        # (https://github.com/heartexlabs/label-studio-sdk/pull/130)
        ls_client = self._get_client()
        payload = {
            "bucket": uri,
            "prefix": params.prefix,
            "regex_filter": params.regex_filter,
            "use_blob_urls": params.use_blob_urls,
            "aws_access_key_id": params.aws_access_key_id,
            "aws_secret_access_key": params.aws_secret_access_key,
            "aws_session_token": params.aws_session_token,
            "region_name": params.s3_region_name,
            "s3_endpoint": params.s3_endpoint,
            "presign": params.presign,
            "presign_ttl": params.presign_ttl,
            "title": dataset.get_params()["title"],
            "description": params.description,
            "project": dataset.id,
            "recursive_scan": True,
        }
        response = ls_client.make_request(
            "POST", "/api/storages/s3", json=payload
        )
        storage = response.json()

    elif params.storage_type == "local":
        if not params.prefix:
            raise ValueError(
                "The 'prefix' parameter is required for local storage "
                "synchronization."
            )

        # Drop arguments that are not used by the local storage
        storage_connection_args.pop("presign")
        storage_connection_args.pop("presign_ttl")
        storage_connection_args.pop("prefix")

        prefix = params.prefix
        if not prefix.startswith("/"):
            prefix = f"/{prefix}"
        root_path = Path(prefix).parent

        # Set the environment variables required by Label Studio
        # to allow local file serving (see https://labelstud.io/guide/storage.html#Prerequisites-2)
        os.environ["LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED"] = "true"
        os.environ["LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT"] = str(
            root_path
        )

        storage = dataset.connect_local_import_storage(
            local_store_path=prefix,
            **storage_connection_args,
        )

        del os.environ["LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED"]
        del os.environ["LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT"]
    else:
        raise ValueError(
            f"Invalid storage type. '{params.storage_type}' is not "
            "supported by ZenML's Label Studio integration. Please choose "
            "between 'azure', 'gcs', 'aws' or 'local'."
        )

    synced_storage = self._get_client().sync_storage(
        storage_id=storage["id"], storage_type=storage["type"]
    )
    return cast(Dict[str, Any], synced_storage)
delete_dataset(**kwargs: Any) -> None

Deletes a dataset from the annotation interface.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments to pass to the Label Studio client.

{}

Raises:

Type Description
ValueError

If the dataset name is not provided or if the dataset does not exist.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def delete_dataset(self, **kwargs: Any) -> None:
    """Deletes a dataset from the annotation interface.

    Args:
        **kwargs: Additional keyword arguments to pass to the Label Studio
            client.

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    ls = self._get_client()
    dataset_name = kwargs.get("dataset_name")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")

    dataset_id = self.get_id_from_name(dataset_name)
    if not dataset_id:
        raise ValueError(
            f"Dataset name '{dataset_name}' has no corresponding `dataset_id` in Label Studio."
        )
    ls.delete_project(dataset_id)
get_converted_dataset(dataset_name: str, output_format: str) -> Dict[Any, Any]

Extract annotated tasks in a specific converted format.

Parameters:

Name Type Description Default
dataset_name str

Name of the dataset.

required
output_format str

Output format.

required

Returns:

Type Description
Dict[Any, Any]

A dictionary containing the converted dataset.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_converted_dataset(
    self, dataset_name: str, output_format: str
) -> Dict[Any, Any]:
    """Extract annotated tasks in a specific converted format.

    Args:
        dataset_name: Name of the dataset.
        output_format: Output format.

    Returns:
        A dictionary containing the converted dataset.
    """
    project = self.get_dataset(dataset_name=dataset_name)
    return project.export_tasks(export_type=output_format)  # type: ignore[no-any-return]
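A minimal sketch of exporting annotations, assuming the "pets" dataset from the earlier sketch exists; "JSON" is one of the export formats accepted by Label Studio's export_tasks:

from zenml.client import Client

annotator = Client().active_stack.annotator

# Export every annotated task of the dataset as Label Studio JSON.
annotations = annotator.get_converted_dataset(
    dataset_name="pets", output_format="JSON"
)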
get_dataset(**kwargs: Any) -> Any

Gets the dataset with the given name.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments to pass to the Label Studio client.

{}

Returns:

Type Description
Any

The LabelStudio Dataset object (a 'Project') for the given name.

Raises:

Type Description
ValueError

If the dataset name is not provided or if the dataset does not exist.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_dataset(self, **kwargs: Any) -> Any:
    """Gets the dataset with the given name.

    Args:
        **kwargs: Additional keyword arguments to pass to the Label Studio client.

    Returns:
        The LabelStudio Dataset object (a 'Project') for the given name.

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    # TODO: check for and raise error if client unavailable
    dataset_name = kwargs.get("dataset_name")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")

    dataset_id = self.get_id_from_name(dataset_name)
    if not dataset_id:
        raise ValueError(
            f"Dataset name '{dataset_name}' has no corresponding `dataset_id` in Label Studio."
        )
    return self._get_client().get_project(dataset_id)
get_dataset_names() -> List[str]

Gets the names of the datasets.

Returns:

Type Description
List[str]

A list of dataset names.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_dataset_names(self) -> List[str]:
    """Gets the names of the datasets.

    Returns:
        A list of dataset names.
    """
    return [
        dataset.get_params()["title"] for dataset in self.get_datasets()
    ]
get_dataset_stats(dataset_name: str) -> Tuple[int, int]

Gets the statistics of the given dataset.

Parameters:

Name Type Description Default
dataset_name str

The name of the dataset.

required

Returns:

Type Description
Tuple[int, int]

A tuple containing (labeled_task_count, unlabeled_task_count) for the dataset.

Raises:

Type Description
IndexError

If the dataset does not exist.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_dataset_stats(self, dataset_name: str) -> Tuple[int, int]:
    """Gets the statistics of the given dataset.

    Args:
        dataset_name: The name of the dataset.

    Returns:
        A tuple containing (labeled_task_count, unlabeled_task_count) for
            the dataset.

    Raises:
        IndexError: If the dataset does not exist.
    """
    for project in self.get_datasets():
        if dataset_name in project.get_params()["title"]:
            labeled_task_count = len(project.get_labeled_tasks())
            unlabeled_task_count = len(project.get_unlabeled_tasks())
            return (labeled_task_count, unlabeled_task_count)
    raise IndexError(
        f"Dataset {dataset_name} not found. Please use "
        f"`zenml annotator dataset list` to list all available datasets."
    )
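A minimal sketch of reporting annotation progress, under the same assumptions as the sketches above:

from zenml.client import Client

annotator = Client().active_stack.annotator

labeled, unlabeled = annotator.get_dataset_stats("pets")
print(f"{labeled} labeled / {unlabeled} unlabeled tasks")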
get_datasets() -> List[Any]

Gets the datasets currently available for annotation.

Returns:

Type Description
List[Any]

A list of datasets.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_datasets(self) -> List[Any]:
    """Gets the datasets currently available for annotation.

    Returns:
        A list of datasets.
    """
    datasets = self._get_client().get_projects()
    return cast(List[Any], datasets)
get_id_from_name(dataset_name: str) -> Optional[int]

Gets the ID of the given dataset.

Parameters:

Name Type Description Default
dataset_name str

The name of the dataset.

required

Returns:

Type Description
Optional[int]

The ID of the dataset.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_id_from_name(self, dataset_name: str) -> Optional[int]:
    """Gets the ID of the given dataset.

    Args:
        dataset_name: The name of the dataset.

    Returns:
        The ID of the dataset.
    """
    projects = self.get_datasets()
    for project in projects:
        if project.get_params()["title"] == dataset_name:
            return cast(int, project.get_params()["id"])
    return None
get_labeled_data(**kwargs: Any) -> Any

Gets the labeled data for the given dataset.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments to pass to the Label Studio client.

{}

Returns:

Type Description
Any

The labeled data.

Raises:

Type Description
ValueError

If the dataset name is not provided or if the dataset does not exist.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_labeled_data(self, **kwargs: Any) -> Any:
    """Gets the labeled data for the given dataset.

    Args:
        **kwargs: Additional keyword arguments to pass to the Label Studio client.

    Returns:
        The labeled data.

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    dataset_name = kwargs.get("dataset_name")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")

    dataset_id = self.get_id_from_name(dataset_name)
    if not dataset_id:
        raise ValueError(
            f"Dataset name '{dataset_name}' has no corresponding `dataset_id` in Label Studio."
        )
    return self._get_client().get_project(dataset_id).get_labeled_tasks()
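A minimal sketch that fetches only the already-annotated tasks of a dataset; each entry is a Label Studio task carrying its annotations:

from zenml.client import Client

annotator = Client().active_stack.annotator

labeled_tasks = annotator.get_labeled_data(dataset_name="pets")
print(f"Fetched {len(labeled_tasks)} labeled tasks")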
get_parsed_label_config(dataset_id: int) -> Dict[str, Any]

Returns the parsed Label Studio label config for a dataset.

Parameters:

Name Type Description Default
dataset_id int

Id of the dataset.

required

Returns:

Type Description
Dict[str, Any]

A dictionary containing the parsed label config.

Raises:

Type Description
ValueError

If no dataset is found for the given id.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_parsed_label_config(self, dataset_id: int) -> Dict[str, Any]:
    """Returns the parsed Label Studio label config for a dataset.

    Args:
        dataset_id: Id of the dataset.

    Returns:
        A dictionary containing the parsed label config.

    Raises:
        ValueError: If no dataset is found for the given id.
    """
    # TODO: check if client actually is connected etc
    dataset = self._get_client().get_project(dataset_id)
    if dataset:
        return cast(Dict[str, Any], dataset.parsed_label_config)
    raise ValueError("No dataset found for the given id.")
get_unlabeled_data(**kwargs: str) -> Any

Gets the unlabeled data for the given dataset.

Parameters:

Name Type Description Default
**kwargs str

Additional keyword arguments to pass to the Label Studio client.

{}

Returns:

Type Description
Any

The unlabeled data.

Raises:

Type Description
ValueError

If the dataset name is not provided.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_unlabeled_data(self, **kwargs: str) -> Any:
    """Gets the unlabeled data for the given dataset.

    Args:
        **kwargs: Additional keyword arguments to pass to the Label Studio client.

    Returns:
        The unlabeled data.

    Raises:
        ValueError: If the dataset name is not provided.
    """
    dataset_name = kwargs.get("dataset_name")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")

    dataset_id = self.get_id_from_name(dataset_name)
    if not dataset_id:
        raise ValueError(
            f"Dataset name '{dataset_name}' has no corresponding `dataset_id` in Label Studio."
        )
    return self._get_client().get_project(dataset_id).get_unlabeled_tasks()
get_url() -> str

Gets the top-level URL of the annotation interface.

Returns:

Type Description
str

The URL of the annotation interface.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_url(self) -> str:
    """Gets the top-level URL of the annotation interface.

    Returns:
        The URL of the annotation interface.
    """
    return (
        f"{self.config.instance_url}:{self.config.port}"
        if self.config.port
        else self.config.instance_url
    )
get_url_for_dataset(dataset_name: str) -> str

Gets the URL of the annotation interface for the given dataset.

Parameters:

Name Type Description Default
dataset_name str

The name of the dataset.

required

Returns:

Type Description
str

The URL of the annotation interface.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_url_for_dataset(self, dataset_name: str) -> str:
    """Gets the URL of the annotation interface for the given dataset.

    Args:
        dataset_name: The name of the dataset.

    Returns:
        The URL of the annotation interface.
    """
    project_id = self.get_id_from_name(dataset_name)
    return f"{self.get_url()}/projects/{project_id}/"
launch(**kwargs: Any) -> None

Launches the annotation interface.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments to pass to the annotation client.

{}
Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def launch(self, **kwargs: Any) -> None:
    """Launches the annotation interface.

    Args:
        **kwargs: Additional keyword arguments to pass to the
            annotation client.
    """
    url = kwargs.get("url") or self.get_url()
    if self._connection_available():
        webbrowser.open(url, new=1, autoraise=True)
    else:
        logger.warning(
            "Could not launch annotation interface"
            "because the connection could not be established."
        )
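A minimal sketch, assuming the configured Label Studio instance is reachable from the local machine; this opens the web UI in the default browser:

from zenml.client import Client

annotator = Client().active_stack.annotator

# Opens the top-level Label Studio UI; pass url=... to open a specific page.
annotator.launch()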
populate_artifact_store_parameters(params: LabelStudioDatasetSyncParameters, artifact_store: BaseArtifactStore) -> None

Populate the dataset sync parameters with the artifact store credentials.

Parameters:

Name Type Description Default
params LabelStudioDatasetSyncParameters

The dataset sync parameters.

required
artifact_store BaseArtifactStore

The active artifact store.

required

Raises:

Type Description
RuntimeError

if the artifact store credentials cannot be fetched.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def populate_artifact_store_parameters(
    self,
    params: LabelStudioDatasetSyncParameters,
    artifact_store: BaseArtifactStore,
) -> None:
    """Populate the dataset sync parameters with the artifact store credentials.

    Args:
        params: The dataset sync parameters.
        artifact_store: The active artifact store.

    Raises:
        RuntimeError: if the artifact store credentials cannot be fetched.
    """
    if artifact_store.flavor == "s3":
        from zenml.integrations.s3.artifact_stores import S3ArtifactStore

        assert isinstance(artifact_store, S3ArtifactStore)

        params.storage_type = "s3"

        (
            aws_access_key_id,
            aws_secret_access_key,
            aws_session_token,
            _,
        ) = artifact_store.get_credentials()

        if aws_access_key_id and aws_secret_access_key:
            # Convert the credentials into the format expected by Label
            # Studio
            params.aws_access_key_id = aws_access_key_id
            params.aws_secret_access_key = aws_secret_access_key
            params.aws_session_token = aws_session_token

            if artifact_store.config.client_kwargs:
                if "endpoint_url" in artifact_store.config.client_kwargs:
                    params.s3_endpoint = (
                        artifact_store.config.client_kwargs["endpoint_url"]
                    )
                if "region_name" in artifact_store.config.client_kwargs:
                    params.s3_region_name = str(
                        artifact_store.config.client_kwargs["region_name"]
                    )

            return

        raise RuntimeError(
            "No credentials are configured for the active S3 artifact "
            "store. The Label Studio annotator needs explicit credentials "
            "to be configured for your artifact store to sync data "
            "artifacts."
        )

    elif artifact_store.flavor == "gcp":
        from zenml.integrations.gcp.artifact_stores import GCPArtifactStore

        assert isinstance(artifact_store, GCPArtifactStore)

        params.storage_type = "gcs"

        gcp_credentials = artifact_store.get_credentials()

        if gcp_credentials:
            # Save the credentials to a file in secure location, because
            # Label Studio will need to read it from a file
            secret_folder = Path(
                GlobalConfiguration().config_directory,
                "label-studio",
                str(self.id),
            )
            fileio.makedirs(str(secret_folder))
            file_path = Path(
                secret_folder, "google_application_credentials.json"
            )
            with os.fdopen(
                os.open(
                    file_path, flags=os.O_RDWR | os.O_CREAT, mode=0o600
                ),
                "w",
            ) as f:
                f.write(json.dumps(gcp_credentials))

            params.google_application_credentials = str(file_path)

            return

        raise RuntimeError(
            "No credentials are configured for the active GCS artifact "
            "store. The Label Studio annotator needs explicit credentials "
            "to be configured for your artifact store to sync data "
            "artifacts."
        )

    elif artifact_store.flavor == "azure":
        from zenml.integrations.azure.artifact_stores import (
            AzureArtifactStore,
        )

        assert isinstance(artifact_store, AzureArtifactStore)

        params.storage_type = "azure"

        azure_credentials = artifact_store.get_credentials()

        if azure_credentials:
            # Convert the credentials into the format expected by Label
            # Studio
            if azure_credentials.connection_string is not None:
                try:
                    # We need to extract the account name and key from the
                    # connection string
                    tokens = azure_credentials.connection_string.split(";")
                    token_dict = dict(
                        [token.split("=", maxsplit=1) for token in tokens]
                    )
                    params.azure_account_name = token_dict["AccountName"]
                    params.azure_account_key = token_dict["AccountKey"]
                except (KeyError, ValueError) as e:
                    raise RuntimeError(
                        "The Azure connection string configured for the "
                        "artifact store expected format."
                    ) from e

                return

            if (
                azure_credentials.account_name is not None
                and azure_credentials.account_key is not None
            ):
                params.azure_account_name = azure_credentials.account_name
                params.azure_account_key = azure_credentials.account_key

                return

            raise RuntimeError(
                "The Label Studio annotator could not use the "
                "credentials currently configured in the active Azure "
                "artifact store because it only supports Azure storage "
                "account credentials. "
                "Please use Azure storage account credentials for your "
                "artifact store."
            )

        raise RuntimeError(
            "No credentials are configured for the active Azure artifact "
            "store. The Label Studio annotator needs explicit credentials "
            "to be configured for your artifact store to sync data "
            "artifacts."
        )

    elif artifact_store.flavor == "local":
        from zenml.artifact_stores.local_artifact_store import (
            LocalArtifactStore,
        )

        assert isinstance(artifact_store, LocalArtifactStore)

        params.storage_type = "local"
        if params.prefix is None:
            params.prefix = artifact_store.path
        elif not params.prefix.startswith(artifact_store.path.lstrip("/")):
            raise RuntimeError(
                "The prefix for the local storage must be a subdirectory "
                "of the local artifact store path."
            )
        return

    raise RuntimeError(
        f"The active artifact store type '{artifact_store.flavor}' is not "
        "supported by ZenML's Label Studio integration. "
        "Please use one of the supported artifact stores (S3, GCP, "
        "Azure or local)."
    )
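A minimal sketch of wiring the active artifact store's credentials into the sync parameters before connecting a storage source. The import path and default constructor of LabelStudioDatasetSyncParameters are assumptions here and may differ between ZenML versions:

from zenml.client import Client

# Assumed import path; adjust to wherever LabelStudioDatasetSyncParameters
# is defined in your ZenML version.
from zenml.integrations.label_studio.label_studio_utils import (
    LabelStudioDatasetSyncParameters,
)

stack = Client().active_stack
annotator = stack.annotator

# Assumed to be constructible with defaults; set prefix, regex_filter, etc.
# as needed for your data layout.
params = LabelStudioDatasetSyncParameters()
# Fills in storage_type and credentials from the active artifact store
# (S3, GCS, Azure or local); raises RuntimeError if none are configured.
annotator.populate_artifact_store_parameters(
    params=params, artifact_store=stack.artifact_store
)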
register_dataset_for_annotation(label_config: str, dataset_name: str) -> Any

Registers a dataset for annotation.

Parameters:

Name Type Description Default
label_config str

The label config to use for the annotation interface.

required
dataset_name str

Name of the dataset to register.

required

Returns:

Type Description
Any

A Label Studio Project object.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def register_dataset_for_annotation(
    self,
    label_config: str,
    dataset_name: str,
) -> Any:
    """Registers a dataset for annotation.

    Args:
        label_config: The label config to use for the annotation interface.
        dataset_name: Name of the dataset to register.

    Returns:
        A Label Studio Project object.
    """
    project_id = self.get_id_from_name(dataset_name)
    if project_id:
        dataset = self._get_client().get_project(project_id)
    else:
        dataset = self.add_dataset(
            dataset_name=dataset_name,
            label_config=label_config,
        )

    return dataset
Modules
label_studio_annotator

Implementation of the Label Studio annotation integration.

Classes
LabelStudioAnnotator(name: str, id: UUID, config: StackComponentConfig, flavor: str, type: StackComponentType, user: Optional[UUID], workspace: UUID, created: datetime, updated: datetime, labels: Optional[Dict[str, Any]] = None, connector_requirements: Optional[ServiceConnectorRequirements] = None, connector: Optional[UUID] = None, connector_resource_id: Optional[str] = None, *args: Any, **kwargs: Any)

Bases: BaseAnnotator, AuthenticationMixin

Class to interact with the Label Studio annotation interface.

Source code in src/zenml/stack/stack_component.py
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
def __init__(
    self,
    name: str,
    id: UUID,
    config: StackComponentConfig,
    flavor: str,
    type: StackComponentType,
    user: Optional[UUID],
    workspace: UUID,
    created: datetime,
    updated: datetime,
    labels: Optional[Dict[str, Any]] = None,
    connector_requirements: Optional[ServiceConnectorRequirements] = None,
    connector: Optional[UUID] = None,
    connector_resource_id: Optional[str] = None,
    *args: Any,
    **kwargs: Any,
):
    """Initializes a StackComponent.

    Args:
        name: The name of the component.
        id: The unique ID of the component.
        config: The config of the component.
        flavor: The flavor of the component.
        type: The type of the component.
        user: The ID of the user who created the component.
        workspace: The ID of the workspace the component belongs to.
        created: The creation time of the component.
        updated: The last update time of the component.
        labels: The labels of the component.
        connector_requirements: The requirements for the connector.
        connector: The ID of a connector linked to the component.
        connector_resource_id: The custom resource ID to access through
            the connector.
        *args: Additional positional arguments.
        **kwargs: Additional keyword arguments.

    Raises:
        ValueError: If a secret reference is passed as name.
    """
    if secret_utils.is_secret_reference(name):
        raise ValueError(
            "Passing the `name` attribute of a stack component as a "
            "secret reference is not allowed."
        )

    self.id = id
    self.name = name
    self._config = config
    self.flavor = flavor
    self.type = type
    self.user = user
    self.workspace = workspace
    self.created = created
    self.updated = updated
    self.labels = labels
    self.connector_requirements = connector_requirements
    self.connector = connector
    self.connector_resource_id = connector_resource_id
    self._connector_instance: Optional[ServiceConnector] = None
Attributes
config: LabelStudioAnnotatorConfig property

Returns the LabelStudioAnnotatorConfig config.

Returns:

Type Description
LabelStudioAnnotatorConfig

The configuration.

settings_class: Type[LabelStudioAnnotatorSettings] property

Settings class for the Label Studio annotator.

Returns:

Type Description
Type[LabelStudioAnnotatorSettings]

The settings class.

Functions
add_dataset(**kwargs: Any) -> Any

Registers a dataset for annotation.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments to pass to the Label Studio client.

{}

Returns:

Type Description
Any

A Label Studio Project object.

Raises:

Type Description
ValueError

if 'dataset_name' and 'label_config' aren't provided.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
def add_dataset(self, **kwargs: Any) -> Any:
    """Registers a dataset for annotation.

    Args:
        **kwargs: Additional keyword arguments to pass to the Label Studio client.

    Returns:
        A Label Studio Project object.

    Raises:
        ValueError: if 'dataset_name' and 'label_config' aren't provided.
    """
    dataset_name = kwargs.get("dataset_name")
    label_config = kwargs.get("label_config")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")
    elif not label_config:
        raise ValueError("`label_config` keyword argument is required.")

    return self._get_client().start_project(
        title=dataset_name,
        label_config=label_config,
    )
connect_and_sync_external_storage(uri: str, params: LabelStudioDatasetSyncParameters, dataset: Project) -> Optional[Dict[str, Any]]

Syncs the external storage for the given project.

Parameters:

Name Type Description Default
uri str

URI of the storage source.

required
params LabelStudioDatasetSyncParameters

Parameters for the dataset.

required
dataset Project

Label Studio dataset.

required

Returns:

Type Description
Optional[Dict[str, Any]]

A dictionary containing the sync result.

Raises:

Type Description
ValueError

If the storage type is not supported.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
def connect_and_sync_external_storage(
    self,
    uri: str,
    params: LabelStudioDatasetSyncParameters,
    dataset: Project,
) -> Optional[Dict[str, Any]]:
    """Syncs the external storage for the given project.

    Args:
        uri: URI of the storage source.
        params: Parameters for the dataset.
        dataset: Label Studio dataset.

    Returns:
        A dictionary containing the sync result.

    Raises:
        ValueError: If the storage type is not supported.
    """
    # TODO: check if proposed storage source has differing / new data
    # if self._storage_source_already_exists(uri, config, dataset):
    #     return None

    storage_connection_args = {
        "prefix": params.prefix,
        "regex_filter": params.regex_filter,
        "use_blob_urls": params.use_blob_urls,
        "presign": params.presign,
        "presign_ttl": params.presign_ttl,
        "title": dataset.get_params()["title"],
        "description": params.description,
    }
    if params.storage_type == "azure":
        if not params.azure_account_name or not params.azure_account_key:
            logger.warning(
                "Authentication credentials for Azure aren't fully "
                "provided. Please update the storage synchronization "
                "settings in the Label Studio web UI as per your needs."
            )
        storage = dataset.connect_azure_import_storage(
            container=uri,
            account_name=params.azure_account_name,
            account_key=params.azure_account_key,
            **storage_connection_args,
        )
    elif params.storage_type == "gcs":
        if not params.google_application_credentials:
            logger.warning(
                "Authentication credentials for Google Cloud Storage "
                "aren't fully provided. Please update the storage "
                "synchronization settings in the Label Studio web UI as "
                "per your needs."
            )
        storage = dataset.connect_google_import_storage(
            bucket=uri,
            google_application_credentials=params.google_application_credentials,
            **storage_connection_args,
        )
    elif params.storage_type == "s3":
        if (
            not params.aws_access_key_id
            or not params.aws_secret_access_key
        ):
            logger.warning(
                "Authentication credentials for S3 aren't fully provided."
                "Please update the storage synchronization settings in the "
                "Label Studio web UI as per your needs."
            )

        # temporary fix using client method until LS supports
        # recursive_scan in their SDK
        # (https://github.com/heartexlabs/label-studio-sdk/pull/130)
        ls_client = self._get_client()
        payload = {
            "bucket": uri,
            "prefix": params.prefix,
            "regex_filter": params.regex_filter,
            "use_blob_urls": params.use_blob_urls,
            "aws_access_key_id": params.aws_access_key_id,
            "aws_secret_access_key": params.aws_secret_access_key,
            "aws_session_token": params.aws_session_token,
            "region_name": params.s3_region_name,
            "s3_endpoint": params.s3_endpoint,
            "presign": params.presign,
            "presign_ttl": params.presign_ttl,
            "title": dataset.get_params()["title"],
            "description": params.description,
            "project": dataset.id,
            "recursive_scan": True,
        }
        response = ls_client.make_request(
            "POST", "/api/storages/s3", json=payload
        )
        storage = response.json()

    elif params.storage_type == "local":
        if not params.prefix:
            raise ValueError(
                "The 'prefix' parameter is required for local storage "
                "synchronization."
            )

        # Drop arguments that are not used by the local storage
        storage_connection_args.pop("presign")
        storage_connection_args.pop("presign_ttl")
        storage_connection_args.pop("prefix")

        prefix = params.prefix
        if not prefix.startswith("/"):
            prefix = f"/{prefix}"
        root_path = Path(prefix).parent

        # Set the environment variables required by Label Studio
        # to allow local file serving (see https://labelstud.io/guide/storage.html#Prerequisites-2)
        os.environ["LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED"] = "true"
        os.environ["LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT"] = str(
            root_path
        )

        storage = dataset.connect_local_import_storage(
            local_store_path=prefix,
            **storage_connection_args,
        )

        del os.environ["LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED"]
        del os.environ["LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT"]
    else:
        raise ValueError(
            f"Invalid storage type. '{params.storage_type}' is not "
            "supported by ZenML's Label Studio integration. Please choose "
            "between 'azure', 'gcs', 'aws' or 'local'."
        )

    synced_storage = self._get_client().sync_storage(
        storage_id=storage["id"], storage_type=storage["type"]
    )
    return cast(Dict[str, Any], synced_storage)
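
For illustration only (the dataset name, bucket and label config type below are assumptions), a sync call could be wired up roughly like this:

from zenml.client import Client
from zenml.integrations.label_studio.steps.label_studio_standard_steps import (
    LabelStudioDatasetSyncParameters,
)

stack = Client().active_stack
annotator = stack.annotator  # assumed to be a LabelStudioAnnotator
dataset = annotator.get_dataset(dataset_name="my_dataset")  # placeholder name
params = LabelStudioDatasetSyncParameters(
    storage_type="s3",
    label_config_type="image_classification",  # assumed value
    prefix="label_studio/images",
)
annotator.populate_artifact_store_parameters(
    params=params, artifact_store=stack.artifact_store
)
sync_result = annotator.connect_and_sync_external_storage(
    uri="my-bucket", params=params, dataset=dataset
)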
delete_dataset(**kwargs: Any) -> None

Deletes a dataset from the annotation interface.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments to pass to the Label Studio client.

{}

Raises:

Type Description
ValueError

If the dataset name is not provided or if the dataset does not exist.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def delete_dataset(self, **kwargs: Any) -> None:
    """Deletes a dataset from the annotation interface.

    Args:
        **kwargs: Additional keyword arguments to pass to the Label Studio
            client.

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    ls = self._get_client()
    dataset_name = kwargs.get("dataset_name")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")

    dataset_id = self.get_id_from_name(dataset_name)
    if not dataset_id:
        raise ValueError(
            f"Dataset name '{dataset_name}' has no corresponding `dataset_id` in Label Studio."
        )
    ls.delete_project(dataset_id)
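
A minimal usage sketch (the dataset name is a placeholder):

from zenml.client import Client

annotator = Client().active_stack.annotator
annotator.delete_dataset(dataset_name="my_dataset")  # placeholder name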
get_converted_dataset(dataset_name: str, output_format: str) -> Dict[Any, Any]

Extract annotated tasks in a specific converted format.

Parameters:

Name Type Description Default
dataset_name str

Name of the dataset.

required
output_format str

Output format.

required

Returns:

Type Description
Dict[Any, Any]

A dictionary containing the converted dataset.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_converted_dataset(
    self, dataset_name: str, output_format: str
) -> Dict[Any, Any]:
    """Extract annotated tasks in a specific converted format.

    Args:
        dataset_name: Name of the dataset.
        output_format: Output format.

    Returns:
        A dictionary containing the converted dataset.
    """
    project = self.get_dataset(dataset_name=dataset_name)
    return project.export_tasks(export_type=output_format)  # type: ignore[no-any-return]
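
Illustrative usage; the output format is passed straight to Project.export_tasks, so any export type supported by your Label Studio version (e.g. "COCO" or "JSON", both assumptions here) should work:

from zenml.client import Client

annotator = Client().active_stack.annotator
exported = annotator.get_converted_dataset(
    dataset_name="my_dataset", output_format="COCO"
)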
get_dataset(**kwargs: Any) -> Any

Gets the dataset with the given name.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments to pass to the Label Studio client.

{}

Returns:

Type Description
Any

The LabelStudio Dataset object (a 'Project') for the given name.

Raises:

Type Description
ValueError

If the dataset name is not provided or if the dataset does not exist.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_dataset(self, **kwargs: Any) -> Any:
    """Gets the dataset with the given name.

    Args:
        **kwargs: Additional keyword arguments to pass to the Label Studio client.

    Returns:
        The LabelStudio Dataset object (a 'Project') for the given name.

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    # TODO: check for and raise error if client unavailable
    dataset_name = kwargs.get("dataset_name")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")

    dataset_id = self.get_id_from_name(dataset_name)
    if not dataset_id:
        raise ValueError(
            f"Dataset name '{dataset_name}' has no corresponding `dataset_id` in Label Studio."
        )
    return self._get_client().get_project(dataset_id)
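
A quick sketch of fetching a project by name (placeholder name):

from zenml.client import Client

annotator = Client().active_stack.annotator
project = annotator.get_dataset(dataset_name="my_dataset")
print(project.get_params()["title"])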
get_dataset_names() -> List[str]

Gets the names of the datasets.

Returns:

Type Description
List[str]

A list of dataset names.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_dataset_names(self) -> List[str]:
    """Gets the names of the datasets.

    Returns:
        A list of dataset names.
    """
    return [
        dataset.get_params()["title"] for dataset in self.get_datasets()
    ]
get_dataset_stats(dataset_name: str) -> Tuple[int, int]

Gets the statistics of the given dataset.

Parameters:

Name Type Description Default
dataset_name str

The name of the dataset.

required

Returns:

Type Description
Tuple[int, int]

A tuple containing (labeled_task_count, unlabeled_task_count) for the dataset.

Raises:

Type Description
IndexError

If the dataset does not exist.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_dataset_stats(self, dataset_name: str) -> Tuple[int, int]:
    """Gets the statistics of the given dataset.

    Args:
        dataset_name: The name of the dataset.

    Returns:
        A tuple containing (labeled_task_count, unlabeled_task_count) for
            the dataset.

    Raises:
        IndexError: If the dataset does not exist.
    """
    for project in self.get_datasets():
        if dataset_name in project.get_params()["title"]:
            labeled_task_count = len(project.get_labeled_tasks())
            unlabeled_task_count = len(project.get_unlabeled_tasks())
            return (labeled_task_count, unlabeled_task_count)
    raise IndexError(
        f"Dataset {dataset_name} not found. Please use "
        f"`zenml annotator dataset list` to list all available datasets."
    )
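
Illustrative usage (placeholder dataset name):

from zenml.client import Client

annotator = Client().active_stack.annotator
labeled, unlabeled = annotator.get_dataset_stats("my_dataset")
print(f"{labeled} labeled / {unlabeled} unlabeled tasks")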
get_datasets() -> List[Any]

Gets the datasets currently available for annotation.

Returns:

Type Description
List[Any]

A list of datasets.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_datasets(self) -> List[Any]:
    """Gets the datasets currently available for annotation.

    Returns:
        A list of datasets.
    """
    datasets = self._get_client().get_projects()
    return cast(List[Any], datasets)
get_id_from_name(dataset_name: str) -> Optional[int]

Gets the ID of the given dataset.

Parameters:

Name Type Description Default
dataset_name str

The name of the dataset.

required

Returns:

Type Description
Optional[int]

The ID of the dataset.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_id_from_name(self, dataset_name: str) -> Optional[int]:
    """Gets the ID of the given dataset.

    Args:
        dataset_name: The name of the dataset.

    Returns:
        The ID of the dataset.
    """
    projects = self.get_datasets()
    for project in projects:
        if project.get_params()["title"] == dataset_name:
            return cast(int, project.get_params()["id"])
    return None
get_labeled_data(**kwargs: Any) -> Any

Gets the labeled data for the given dataset.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments to pass to the Label Studio client.

{}

Returns:

Type Description
Any

The labeled data.

Raises:

Type Description
ValueError

If the dataset name is not provided or if the dataset does not exist.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_labeled_data(self, **kwargs: Any) -> Any:
    """Gets the labeled data for the given dataset.

    Args:
        **kwargs: Additional keyword arguments to pass to the Label Studio client.

    Returns:
        The labeled data.

    Raises:
        ValueError: If the dataset name is not provided or if the dataset
            does not exist.
    """
    dataset_name = kwargs.get("dataset_name")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")

    dataset_id = self.get_id_from_name(dataset_name)
    if not dataset_id:
        raise ValueError(
            f"Dataset name '{dataset_name}' has no corresponding `dataset_id` in Label Studio."
        )
    return self._get_client().get_project(dataset_id).get_labeled_tasks()
get_parsed_label_config(dataset_id: int) -> Dict[str, Any]

Returns the parsed Label Studio label config for a dataset.

Parameters:

Name Type Description Default
dataset_id int

Id of the dataset.

required

Returns:

Type Description
Dict[str, Any]

A dictionary containing the parsed label config.

Raises:

Type Description
ValueError

If no dataset is found for the given id.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_parsed_label_config(self, dataset_id: int) -> Dict[str, Any]:
    """Returns the parsed Label Studio label config for a dataset.

    Args:
        dataset_id: Id of the dataset.

    Returns:
        A dictionary containing the parsed label config.

    Raises:
        ValueError: If no dataset is found for the given id.
    """
    # TODO: check if client actually is connected etc
    dataset = self._get_client().get_project(dataset_id)
    if dataset:
        return cast(Dict[str, Any], dataset.parsed_label_config)
    raise ValueError("No dataset found for the given id.")
get_unlabeled_data(**kwargs: str) -> Any

Gets the unlabeled data for the given dataset.

Parameters:

Name Type Description Default
**kwargs str

Additional keyword arguments to pass to the Label Studio client.

{}

Returns:

Type Description
Any

The unlabeled data.

Raises:

Type Description
ValueError

If the dataset name is not provided.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_unlabeled_data(self, **kwargs: str) -> Any:
    """Gets the unlabeled data for the given dataset.

    Args:
        **kwargs: Additional keyword arguments to pass to the Label Studio client.

    Returns:
        The unlabeled data.

    Raises:
        ValueError: If the dataset name is not provided.
    """
    dataset_name = kwargs.get("dataset_name")
    if not dataset_name:
        raise ValueError("`dataset_name` keyword argument is required.")

    dataset_id = self.get_id_from_name(dataset_name)
    if not dataset_id:
        raise ValueError(
            f"Dataset name '{dataset_name}' has no corresponding `dataset_id` in Label Studio."
        )
    return self._get_client().get_project(dataset_id).get_unlabeled_tasks()
get_url() -> str

Gets the top-level URL of the annotation interface.

Returns:

Type Description
str

The URL of the annotation interface.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_url(self) -> str:
    """Gets the top-level URL of the annotation interface.

    Returns:
        The URL of the annotation interface.
    """
    return (
        f"{self.config.instance_url}:{self.config.port}"
        if self.config.port
        else self.config.instance_url
    )
get_url_for_dataset(dataset_name: str) -> str

Gets the URL of the annotation interface for the given dataset.

Parameters:

Name Type Description Default
dataset_name str

The name of the dataset.

required

Returns:

Type Description
str

The URL of the annotation interface.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def get_url_for_dataset(self, dataset_name: str) -> str:
    """Gets the URL of the annotation interface for the given dataset.

    Args:
        dataset_name: The name of the dataset.

    Returns:
        The URL of the annotation interface.
    """
    project_id = self.get_id_from_name(dataset_name)
    return f"{self.get_url()}/projects/{project_id}/"
launch(**kwargs: Any) -> None

Launches the annotation interface.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments to pass to the annotation client.

{}
Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def launch(self, **kwargs: Any) -> None:
    """Launches the annotation interface.

    Args:
        **kwargs: Additional keyword arguments to pass to the
            annotation client.
    """
    url = kwargs.get("url") or self.get_url()
    if self._connection_available():
        webbrowser.open(url, new=1, autoraise=True)
    else:
        logger.warning(
            "Could not launch annotation interface"
            "because the connection could not be established."
        )
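
A minimal sketch that opens the annotation UI from Python:

from zenml.client import Client

annotator = Client().active_stack.annotator
annotator.launch()  # opens self.get_url() in a browser if the connection check passes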
populate_artifact_store_parameters(params: LabelStudioDatasetSyncParameters, artifact_store: BaseArtifactStore) -> None

Populate the dataset sync parameters with the artifact store credentials.

Parameters:

Name Type Description Default
params LabelStudioDatasetSyncParameters

The dataset sync parameters.

required
artifact_store BaseArtifactStore

The active artifact store.

required

Raises:

Type Description
RuntimeError

if the artifact store credentials cannot be fetched.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def populate_artifact_store_parameters(
    self,
    params: LabelStudioDatasetSyncParameters,
    artifact_store: BaseArtifactStore,
) -> None:
    """Populate the dataset sync parameters with the artifact store credentials.

    Args:
        params: The dataset sync parameters.
        artifact_store: The active artifact store.

    Raises:
        RuntimeError: if the artifact store credentials cannot be fetched.
    """
    if artifact_store.flavor == "s3":
        from zenml.integrations.s3.artifact_stores import S3ArtifactStore

        assert isinstance(artifact_store, S3ArtifactStore)

        params.storage_type = "s3"

        (
            aws_access_key_id,
            aws_secret_access_key,
            aws_session_token,
            _,
        ) = artifact_store.get_credentials()

        if aws_access_key_id and aws_secret_access_key:
            # Convert the credentials into the format expected by Label
            # Studio
            params.aws_access_key_id = aws_access_key_id
            params.aws_secret_access_key = aws_secret_access_key
            params.aws_session_token = aws_session_token

            if artifact_store.config.client_kwargs:
                if "endpoint_url" in artifact_store.config.client_kwargs:
                    params.s3_endpoint = (
                        artifact_store.config.client_kwargs["endpoint_url"]
                    )
                if "region_name" in artifact_store.config.client_kwargs:
                    params.s3_region_name = str(
                        artifact_store.config.client_kwargs["region_name"]
                    )

            return

        raise RuntimeError(
            "No credentials are configured for the active S3 artifact "
            "store. The Label Studio annotator needs explicit credentials "
            "to be configured for your artifact store to sync data "
            "artifacts."
        )

    elif artifact_store.flavor == "gcp":
        from zenml.integrations.gcp.artifact_stores import GCPArtifactStore

        assert isinstance(artifact_store, GCPArtifactStore)

        params.storage_type = "gcs"

        gcp_credentials = artifact_store.get_credentials()

        if gcp_credentials:
            # Save the credentials to a file in secure location, because
            # Label Studio will need to read it from a file
            secret_folder = Path(
                GlobalConfiguration().config_directory,
                "label-studio",
                str(self.id),
            )
            fileio.makedirs(str(secret_folder))
            file_path = Path(
                secret_folder, "google_application_credentials.json"
            )
            with os.fdopen(
                os.open(
                    file_path, flags=os.O_RDWR | os.O_CREAT, mode=0o600
                ),
                "w",
            ) as f:
                f.write(json.dumps(gcp_credentials))

            params.google_application_credentials = str(file_path)

            return

        raise RuntimeError(
            "No credentials are configured for the active GCS artifact "
            "store. The Label Studio annotator needs explicit credentials "
            "to be configured for your artifact store to sync data "
            "artifacts."
        )

    elif artifact_store.flavor == "azure":
        from zenml.integrations.azure.artifact_stores import (
            AzureArtifactStore,
        )

        assert isinstance(artifact_store, AzureArtifactStore)

        params.storage_type = "azure"

        azure_credentials = artifact_store.get_credentials()

        if azure_credentials:
            # Convert the credentials into the format expected by Label
            # Studio
            if azure_credentials.connection_string is not None:
                try:
                    # We need to extract the account name and key from the
                    # connection string
                    tokens = azure_credentials.connection_string.split(";")
                    token_dict = dict(
                        [token.split("=", maxsplit=1) for token in tokens]
                    )
                    params.azure_account_name = token_dict["AccountName"]
                    params.azure_account_key = token_dict["AccountKey"]
                except (KeyError, ValueError) as e:
                    raise RuntimeError(
                        "The Azure connection string configured for the "
                        "artifact store expected format."
                    ) from e

                return

            if (
                azure_credentials.account_name is not None
                and azure_credentials.account_key is not None
            ):
                params.azure_account_name = azure_credentials.account_name
                params.azure_account_key = azure_credentials.account_key

                return

            raise RuntimeError(
                "The Label Studio annotator could not use the "
                "credentials currently configured in the active Azure "
                "artifact store because it only supports Azure storage "
                "account credentials. "
                "Please use Azure storage account credentials for your "
                "artifact store."
            )

        raise RuntimeError(
            "No credentials are configured for the active Azure artifact "
            "store. The Label Studio annotator needs explicit credentials "
            "to be configured for your artifact store to sync data "
            "artifacts."
        )

    elif artifact_store.flavor == "local":
        from zenml.artifact_stores.local_artifact_store import (
            LocalArtifactStore,
        )

        assert isinstance(artifact_store, LocalArtifactStore)

        params.storage_type = "local"
        if params.prefix is None:
            params.prefix = artifact_store.path
        elif not params.prefix.startswith(artifact_store.path.lstrip("/")):
            raise RuntimeError(
                "The prefix for the local storage must be a subdirectory "
                "of the local artifact store path."
            )
        return

    raise RuntimeError(
        f"The active artifact store type '{artifact_store.flavor}' is not "
        "supported by ZenML's Label Studio integration. "
        "Please use one of the supported artifact stores (S3, GCP, "
        "Azure or local)."
    )
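
A sketch of populating sync parameters from the active artifact store (the label_config_type value is an assumption):

from zenml.client import Client
from zenml.integrations.label_studio.steps.label_studio_standard_steps import (
    LabelStudioDatasetSyncParameters,
)

stack = Client().active_stack
params = LabelStudioDatasetSyncParameters(
    label_config_type="image_classification",  # assumed value
)
# Sets params.storage_type and the matching credentials for the active artifact store.
stack.annotator.populate_artifact_store_parameters(
    params=params, artifact_store=stack.artifact_store
)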
register_dataset_for_annotation(label_config: str, dataset_name: str) -> Any

Registers a dataset for annotation.

Parameters:

Name Type Description Default
label_config str

The label config to use for the annotation interface.

required
dataset_name str

Name of the dataset to register.

required

Returns:

Type Description
Any

A Label Studio Project object.

Source code in src/zenml/integrations/label_studio/annotators/label_studio_annotator.py
def register_dataset_for_annotation(
    self,
    label_config: str,
    dataset_name: str,
) -> Any:
    """Registers a dataset for annotation.

    Args:
        label_config: The label config to use for the annotation interface.
        dataset_name: Name of the dataset to register.

    Returns:
        A Label Studio Project object.
    """
    project_id = self.get_id_from_name(dataset_name)
    if project_id:
        dataset = self._get_client().get_project(project_id)
    else:
        dataset = self.add_dataset(
            dataset_name=dataset_name,
            label_config=label_config,
        )

    return dataset
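
Illustrative registration using one of the label config generators documented below (the labels and dataset name are placeholders):

from zenml.client import Client
from zenml.integrations.label_studio.label_config_generators import (
    generate_image_classification_label_config,
)

label_config, _ = generate_image_classification_label_config(["cat", "dog"])
annotator = Client().active_stack.annotator
dataset = annotator.register_dataset_for_annotation(
    label_config=label_config, dataset_name="my_dataset"
)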
Functions
Modules

flavors

Label Studio integration flavors.

Classes
LabelStudioAnnotatorConfig(warn_about_plain_text_secrets: bool = False, **kwargs: Any)

Bases: BaseAnnotatorConfig, LabelStudioAnnotatorSettings, AuthenticationConfigMixin

Config for the Label Studio annotator.

This class combines settings and authentication configurations for Label Studio into a single, usable configuration object without adding additional functionality.

Source code in src/zenml/config/secret_reference_mixin.py
def __init__(
    self, warn_about_plain_text_secrets: bool = False, **kwargs: Any
) -> None:
    """Ensures that secret references are only passed for valid fields.

    This method ensures that secret references are not passed for fields
    that explicitly prevent them or require pydantic validation.

    Args:
        warn_about_plain_text_secrets: If true, then warns about using plain-text secrets.
        **kwargs: Arguments to initialize this object.

    Raises:
        ValueError: If an attribute that requires custom pydantic validation
            or an attribute which explicitly disallows secret references
            is passed as a secret reference.
    """
    for key, value in kwargs.items():
        try:
            field = self.__class__.model_fields[key]
        except KeyError:
            # Value for a private attribute or non-existing field, this
            # will fail during the upcoming pydantic validation
            continue

        if value is None:
            continue

        if not secret_utils.is_secret_reference(value):
            if (
                secret_utils.is_secret_field(field)
                and warn_about_plain_text_secrets
            ):
                logger.warning(
                    "You specified a plain-text value for the sensitive "
                    f"attribute `{key}`. This is currently only a warning, "
                    "but future versions of ZenML will require you to pass "
                    "in sensitive information as secrets. Check out the "
                    "documentation on how to configure values with secrets "
                    "here: https://docs.zenml.io/getting-started/deploying-zenml/secret-management"
                )
            continue

        if secret_utils.is_clear_text_field(field):
            raise ValueError(
                f"Passing the `{key}` attribute as a secret reference is "
                "not allowed."
            )

        requires_validation = has_validators(
            pydantic_class=self.__class__, field_name=key
        )
        if requires_validation:
            raise ValueError(
                f"Passing the attribute `{key}` as a secret reference is "
                "not allowed as additional validation is required for "
                "this attribute."
            )

    super().__init__(**kwargs)
LabelStudioAnnotatorFlavor

Bases: BaseAnnotatorFlavor

Label Studio annotator flavor.

Attributes
config_class: Type[LabelStudioAnnotatorConfig] property

Returns LabelStudioAnnotatorConfig config class.

Returns:

Type Description
Type[LabelStudioAnnotatorConfig]

The config class.

docs_url: Optional[str] property

A url to point at docs explaining this flavor.

Returns:

Type Description
Optional[str]

A flavor docs url.

implementation_class: Type[LabelStudioAnnotator] property

Implementation class for this flavor.

Returns:

Type Description
Type[LabelStudioAnnotator]

The implementation class.

logo_url: str property

A url to represent the flavor in the dashboard.

Returns:

Type Description
str

The flavor logo.

name: str property

Name of the flavor.

Returns:

Type Description
str

The name of the flavor.

sdk_docs_url: Optional[str] property

A url to point at SDK docs explaining this flavor.

Returns:

Type Description
Optional[str]

A flavor SDK docs url.

LabelStudioAnnotatorSettings(warn_about_plain_text_secrets: bool = False, **kwargs: Any)

Bases: BaseSettings

Label Studio annotator settings.

Attributes:

Name Type Description
instance_url str

URL of the Label Studio instance.

port Optional[int]

The port to use for the annotation interface.

api_key Optional[str]

The API key for Label Studio.

Source code in src/zenml/config/secret_reference_mixin.py
def __init__(
    self, warn_about_plain_text_secrets: bool = False, **kwargs: Any
) -> None:
    """Ensures that secret references are only passed for valid fields.

    This method ensures that secret references are not passed for fields
    that explicitly prevent them or require pydantic validation.

    Args:
        warn_about_plain_text_secrets: If true, then warns about using plain-text secrets.
        **kwargs: Arguments to initialize this object.

    Raises:
        ValueError: If an attribute that requires custom pydantic validation
            or an attribute which explicitly disallows secret references
            is passed as a secret reference.
    """
    for key, value in kwargs.items():
        try:
            field = self.__class__.model_fields[key]
        except KeyError:
            # Value for a private attribute or non-existing field, this
            # will fail during the upcoming pydantic validation
            continue

        if value is None:
            continue

        if not secret_utils.is_secret_reference(value):
            if (
                secret_utils.is_secret_field(field)
                and warn_about_plain_text_secrets
            ):
                logger.warning(
                    "You specified a plain-text value for the sensitive "
                    f"attribute `{key}`. This is currently only a warning, "
                    "but future versions of ZenML will require you to pass "
                    "in sensitive information as secrets. Check out the "
                    "documentation on how to configure values with secrets "
                    "here: https://docs.zenml.io/getting-started/deploying-zenml/secret-management"
                )
            continue

        if secret_utils.is_clear_text_field(field):
            raise ValueError(
                f"Passing the `{key}` attribute as a secret reference is "
                "not allowed."
            )

        requires_validation = has_validators(
            pydantic_class=self.__class__, field_name=key
        )
        if requires_validation:
            raise ValueError(
                f"Passing the attribute `{key}` as a secret reference is "
                "not allowed as additional validation is required for "
                "this attribute."
            )

    super().__init__(**kwargs)
Modules
label_studio_annotator_flavor

Label Studio annotator flavor.

Classes
LabelStudioAnnotatorConfig(warn_about_plain_text_secrets: bool = False, **kwargs: Any)

Bases: BaseAnnotatorConfig, LabelStudioAnnotatorSettings, AuthenticationConfigMixin

Config for the Label Studio annotator.

This class combines settings and authentication configurations for Label Studio into a single, usable configuration object without adding additional functionality.

Source code in src/zenml/config/secret_reference_mixin.py
def __init__(
    self, warn_about_plain_text_secrets: bool = False, **kwargs: Any
) -> None:
    """Ensures that secret references are only passed for valid fields.

    This method ensures that secret references are not passed for fields
    that explicitly prevent them or require pydantic validation.

    Args:
        warn_about_plain_text_secrets: If true, then warns about using plain-text secrets.
        **kwargs: Arguments to initialize this object.

    Raises:
        ValueError: If an attribute that requires custom pydantic validation
            or an attribute which explicitly disallows secret references
            is passed as a secret reference.
    """
    for key, value in kwargs.items():
        try:
            field = self.__class__.model_fields[key]
        except KeyError:
            # Value for a private attribute or non-existing field, this
            # will fail during the upcoming pydantic validation
            continue

        if value is None:
            continue

        if not secret_utils.is_secret_reference(value):
            if (
                secret_utils.is_secret_field(field)
                and warn_about_plain_text_secrets
            ):
                logger.warning(
                    "You specified a plain-text value for the sensitive "
                    f"attribute `{key}`. This is currently only a warning, "
                    "but future versions of ZenML will require you to pass "
                    "in sensitive information as secrets. Check out the "
                    "documentation on how to configure values with secrets "
                    "here: https://docs.zenml.io/getting-started/deploying-zenml/secret-management"
                )
            continue

        if secret_utils.is_clear_text_field(field):
            raise ValueError(
                f"Passing the `{key}` attribute as a secret reference is "
                "not allowed."
            )

        requires_validation = has_validators(
            pydantic_class=self.__class__, field_name=key
        )
        if requires_validation:
            raise ValueError(
                f"Passing the attribute `{key}` as a secret reference is "
                "not allowed as additional validation is required for "
                "this attribute."
            )

    super().__init__(**kwargs)
LabelStudioAnnotatorFlavor

Bases: BaseAnnotatorFlavor

Label Studio annotator flavor.

Attributes
config_class: Type[LabelStudioAnnotatorConfig] property

Returns LabelStudioAnnotatorConfig config class.

Returns:

Type Description
Type[LabelStudioAnnotatorConfig]

The config class.

docs_url: Optional[str] property

A url to point at docs explaining this flavor.

Returns:

Type Description
Optional[str]

A flavor docs url.

implementation_class: Type[LabelStudioAnnotator] property

Implementation class for this flavor.

Returns:

Type Description
Type[LabelStudioAnnotator]

The implementation class.

logo_url: str property

A url to represent the flavor in the dashboard.

Returns:

Type Description
str

The flavor logo.

name: str property

Name of the flavor.

Returns:

Type Description
str

The name of the flavor.

sdk_docs_url: Optional[str] property

A url to point at SDK docs explaining this flavor.

Returns:

Type Description
Optional[str]

A flavor SDK docs url.

LabelStudioAnnotatorSettings(warn_about_plain_text_secrets: bool = False, **kwargs: Any)

Bases: BaseSettings

Label Studio annotator settings.

Attributes:

Name Type Description
instance_url str

URL of the Label Studio instance.

port Optional[int]

The port to use for the annotation interface.

api_key Optional[str]

The api_key for label studio.

Source code in src/zenml/config/secret_reference_mixin.py
def __init__(
    self, warn_about_plain_text_secrets: bool = False, **kwargs: Any
) -> None:
    """Ensures that secret references are only passed for valid fields.

    This method ensures that secret references are not passed for fields
    that explicitly prevent them or require pydantic validation.

    Args:
        warn_about_plain_text_secrets: If true, then warns about using plain-text secrets.
        **kwargs: Arguments to initialize this object.

    Raises:
        ValueError: If an attribute that requires custom pydantic validation
            or an attribute which explicitly disallows secret references
            is passed as a secret reference.
    """
    for key, value in kwargs.items():
        try:
            field = self.__class__.model_fields[key]
        except KeyError:
            # Value for a private attribute or non-existing field, this
            # will fail during the upcoming pydantic validation
            continue

        if value is None:
            continue

        if not secret_utils.is_secret_reference(value):
            if (
                secret_utils.is_secret_field(field)
                and warn_about_plain_text_secrets
            ):
                logger.warning(
                    "You specified a plain-text value for the sensitive "
                    f"attribute `{key}`. This is currently only a warning, "
                    "but future versions of ZenML will require you to pass "
                    "in sensitive information as secrets. Check out the "
                    "documentation on how to configure values with secrets "
                    "here: https://docs.zenml.io/getting-started/deploying-zenml/secret-management"
                )
            continue

        if secret_utils.is_clear_text_field(field):
            raise ValueError(
                f"Passing the `{key}` attribute as a secret reference is "
                "not allowed."
            )

        requires_validation = has_validators(
            pydantic_class=self.__class__, field_name=key
        )
        if requires_validation:
            raise ValueError(
                f"Passing the attribute `{key}` as a secret reference is "
                "not allowed as additional validation is required for "
                "this attribute."
            )

    super().__init__(**kwargs)
Functions

label_config_generators

Initialization of the Label Studio config generators submodule.

Functions
generate_image_classification_label_config(labels: List[str]) -> Tuple[str, str]

Generates a Label Studio label config for image classification.

This is based on the basic config example shown at https://labelstud.io/templates/image_classification.html.

Parameters:

Name Type Description Default
labels List[str]

A list of labels to be used in the label config.

required

Returns:

Type Description
Tuple[str, str]

A tuple of the generated label config and the label config type.

Raises:

Type Description
ValueError

If no labels are provided.

Source code in src/zenml/integrations/label_studio/label_config_generators/label_config_generators.py
def generate_image_classification_label_config(
    labels: List[str],
) -> Tuple[str, str]:
    """Generates a Label Studio label config for image classification.

    This is based on the basic config example shown at
    https://labelstud.io/templates/image_classification.html.

    Args:
        labels: A list of labels to be used in the label config.

    Returns:
        A tuple of the generated label config and the label config type.

    Raises:
        ValueError: If no labels are provided.
    """
    if not labels:
        raise ValueError("No labels provided")

    label_config_type = AnnotationTasks.IMAGE_CLASSIFICATION

    label_config_start = """<View>
    <Image name="image" value="$image"/>
    <Choices name="choice" toName="image">
    """
    label_config_choices = "".join(
        f"<Choice value='{label}' />\n" for label in labels
    )
    label_config_end = "</Choices>\n</View>"

    label_config = label_config_start + label_config_choices + label_config_end
    return (
        label_config,
        label_config_type,
    )
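
Illustrative call (the labels are placeholders):

from zenml.integrations.label_studio.label_config_generators import (
    generate_image_classification_label_config,
)

label_config, task_type = generate_image_classification_label_config(
    ["cat", "dog", "other"]
)
print(task_type)     # annotation task identifier (AnnotationTasks.IMAGE_CLASSIFICATION)
print(label_config)  # XML config with one <Choice> element per label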
generate_text_classification_label_config(labels: List[str]) -> Tuple[str, str]

Generates a Label Studio label config for text classification.

This is based on the basic config example shown at https://labelstud.io/templates/sentiment_analysis.html.

Parameters:

Name Type Description Default
labels List[str]

A list of labels to be used in the label config.

required

Returns:

Type Description
Tuple[str, str]

A tuple of the generated label config and the label config type.

Raises:

Type Description
ValueError

If no labels are provided.

Source code in src/zenml/integrations/label_studio/label_config_generators/label_config_generators.py
def generate_text_classification_label_config(
    labels: List[str],
) -> Tuple[str, str]:
    """Generates a Label Studio label config for text classification.

    This is based on the basic config example shown at
    https://labelstud.io/templates/sentiment_analysis.html.

    Args:
        labels: A list of labels to be used in the label config.

    Returns:
        A tuple of the generated label config and the label config type.

    Raises:
        ValueError: If no labels are provided.
    """
    if not labels:
        raise ValueError("No labels provided")

    label_config_type = AnnotationTasks.TEXT_CLASSIFICATION

    label_config_start = """<View>
    <Header value="Choose text class:"/>
    <Text name="text" value="$text"/>
    <Choices name="class" toName="text" choice="single" showInline="true">
    """
    label_config_choices = "".join(
        f"<Choice value='{label}' />\n" for label in labels
    )
    label_config_end = "</Choices>\n</View>"

    label_config = label_config_start + label_config_choices + label_config_end
    return (
        label_config,
        label_config_type,
    )
Modules
label_config_generators

Implementation of label config generators for Label Studio.

Classes
Functions
generate_basic_object_detection_bounding_boxes_label_config(labels: List[str]) -> Tuple[str, str]

Generates a Label Studio config for object detection with bounding boxes.

This is based on the basic config example shown at https://labelstud.io/templates/image_bbox.html.

Parameters:

Name Type Description Default
labels List[str]

A list of labels to be used in the label config.

required

Returns:

Type Description
Tuple[str, str]

A tuple of the generated label config and the label config type.

Raises:

Type Description
ValueError

If no labels are provided.

Source code in src/zenml/integrations/label_studio/label_config_generators/label_config_generators.py
def generate_basic_object_detection_bounding_boxes_label_config(
    labels: List[str],
) -> Tuple[str, str]:
    """Generates a Label Studio config for object detection with bounding boxes.

    This is based on the basic config example shown at
    https://labelstud.io/templates/image_bbox.html.

    Args:
        labels: A list of labels to be used in the label config.

    Returns:
        A tuple of the generated label config and the label config type.

    Raises:
        ValueError: If no labels are provided.
    """
    if not labels:
        raise ValueError("No labels provided")

    label_config_type = AnnotationTasks.OBJECT_DETECTION_BOUNDING_BOXES

    label_config_start = """<View>
    <Image name="image" value="$image"/>
    <RectangleLabels name="label" toName="image">
    """
    label_config_choices = "".join(
        f"<Label value='{label}' />\n" for label in labels
    )
    label_config_end = "</RectangleLabels>\n</View>"
    label_config = label_config_start + label_config_choices + label_config_end

    return (
        label_config,
        label_config_type,
    )
generate_basic_ocr_label_config(labels: List[str]) -> Tuple[str, str]

Generates a Label Studio config for an optical character recognition (OCR) labeling task.

This is based on the basic config example shown at https://labelstud.io/templates/optical_character_recognition.html

Parameters:

Name Type Description Default
labels List[str]

A list of labels to be used in the label config.

required

Returns:

Type Description
Tuple[str, str]

A tuple of the generated label config and the label config type.

Raises:

Type Description
ValueError

If no labels are provided.

Source code in src/zenml/integrations/label_studio/label_config_generators/label_config_generators.py
def generate_basic_ocr_label_config(
    labels: List[str],
) -> Tuple[str, str]:
    """Generates a Label Studio config for optical character recognition (OCR) labeling task.

    This is based on the basic config example shown at
    https://labelstud.io/templates/optical_character_recognition.html

    Args:
        labels: A list of labels to be used in the label config.

    Returns:
        A tuple of the generated label config and the label config type.

    Raises:
        ValueError: If no labels are provided.
    """
    if not labels:
        raise ValueError("No labels provided")

    label_config_type = AnnotationTasks.OCR

    label_config_start = """
    <View>
    <Image name="image" value="$ocr" zoom="true" zoomControl="true" rotateControl="true"/>
    <View>
    <Filter toName="label" minlength="0" name="filter"/>
    <Labels name="label" toName="image">
    """
    label_config_choices = "".join(
        f"<Label value='{label}' />\n" for label in labels
    )

    label_config_end = """
    </Labels>
    </View>
    <Rectangle name="bbox" toName="image" strokeWidth="3"/>
    <Polygon name="poly" toName="image" strokeWidth="3"/>
    <TextArea name="transcription" toName="image" editable="true" perRegion="true" required="true" maxSubmissions="1" rows="5" placeholder="Recognized Text" displayMode="region-list"/>
    </View>
    """
    label_config = label_config_start + label_config_choices + label_config_end

    return (
        label_config,
        label_config_type,
    )
generate_image_classification_label_config(labels: List[str]) -> Tuple[str, str]

Generates a Label Studio label config for image classification.

This is based on the basic config example shown at https://labelstud.io/templates/image_classification.html.

Parameters:

Name Type Description Default
labels List[str]

A list of labels to be used in the label config.

required

Returns:

Type Description
Tuple[str, str]

A tuple of the generated label config and the label config type.

Raises:

Type Description
ValueError

If no labels are provided.

Source code in src/zenml/integrations/label_studio/label_config_generators/label_config_generators.py
def generate_image_classification_label_config(
    labels: List[str],
) -> Tuple[str, str]:
    """Generates a Label Studio label config for image classification.

    This is based on the basic config example shown at
    https://labelstud.io/templates/image_classification.html.

    Args:
        labels: A list of labels to be used in the label config.

    Returns:
        A tuple of the generated label config and the label config type.

    Raises:
        ValueError: If no labels are provided.
    """
    if not labels:
        raise ValueError("No labels provided")

    label_config_type = AnnotationTasks.IMAGE_CLASSIFICATION

    label_config_start = """<View>
    <Image name="image" value="$image"/>
    <Choices name="choice" toName="image">
    """
    label_config_choices = "".join(
        f"<Choice value='{label}' />\n" for label in labels
    )
    label_config_end = "</Choices>\n</View>"

    label_config = label_config_start + label_config_choices + label_config_end
    return (
        label_config,
        label_config_type,
    )
generate_text_classification_label_config(labels: List[str]) -> Tuple[str, str]

Generates a Label Studio label config for text classification.

This is based on the basic config example shown at https://labelstud.io/templates/sentiment_analysis.html.

Parameters:

Name Type Description Default
labels List[str]

A list of labels to be used in the label config.

required

Returns:

Type Description
Tuple[str, str]

A tuple of the generated label config and the label config type.

Raises:

Type Description
ValueError

If no labels are provided.

Source code in src/zenml/integrations/label_studio/label_config_generators/label_config_generators.py
def generate_text_classification_label_config(
    labels: List[str],
) -> Tuple[str, str]:
    """Generates a Label Studio label config for text classification.

    This is based on the basic config example shown at
    https://labelstud.io/templates/sentiment_analysis.html.

    Args:
        labels: A list of labels to be used in the label config.

    Returns:
        A tuple of the generated label config and the label config type.

    Raises:
        ValueError: If no labels are provided.
    """
    if not labels:
        raise ValueError("No labels provided")

    label_config_type = AnnotationTasks.TEXT_CLASSIFICATION

    label_config_start = """<View>
    <Header value="Choose text class:"/>
    <Text name="text" value="$text"/>
    <Choices name="class" toName="text" choice="single" showInline="true">
    """
    label_config_choices = "".join(
        f"<Choice value='{label}' />\n" for label in labels
    )
    label_config_end = "</Choices>\n</View>"

    label_config = label_config_start + label_config_choices + label_config_end
    return (
        label_config,
        label_config_type,
    )

label_studio_utils

Utility functions for the Label Studio annotator integration.

Functions
clean_url(url: str) -> str

Remove extraneous parts of the URL prior to mapping.

Removes the query and netloc parts of the URL, and strips the leading slash from the path. For example, a string like 'gs%3A//label-studio/load_image_data/images/fdbcd451-0c80-495c-a9c5-6b51776f5019/1/0/image_file.JPEG' would become 'label-studio/load_image_data/images/fdbcd451-0c80-495c-a9c5-6b51776f5019/1/0/image_file.JPEG'.

Parameters:

Name Type Description Default
url str

A URL string.

required

Returns:

Type Description
str

A cleaned URL string.

Source code in src/zenml/integrations/label_studio/label_studio_utils.py
def clean_url(url: str) -> str:
    """Remove extraneous parts of the URL prior to mapping.

    Removes the query and netloc parts of the URL, and strips the leading slash
    from the path. For example, a string like
    `'gs%3A//label-studio/load_image_data/images/fdbcd451-0c80-495c-a9c5-6b51776f5019/1/0/image_file.JPEG'`
    would become
    `label-studio/load_image_data/images/fdbcd451-0c80-495c-a9c5-6b51776f5019/1/0/image_file.JPEG`.

    Args:
        url: A URL string.

    Returns:
        A cleaned URL string.
    """
    parsed = urlparse(url)
    parsed = parsed._replace(netloc="", query="")
    return parsed.path.lstrip("/")
convert_pred_filenames_to_task_ids(preds: List[Dict[str, Any]], tasks: List[Dict[str, Any]]) -> List[Dict[str, Any]]

Converts a list of predictions from local file references to task id.

Parameters:

Name Type Description Default
preds List[Dict[str, Any]]

List of predictions.

required
tasks List[Dict[str, Any]]

List of tasks.

required

Returns:

Type Description
List[Dict[str, Any]]

List of predictions using task ids as reference.

Source code in src/zenml/integrations/label_studio/label_studio_utils.py
def convert_pred_filenames_to_task_ids(
    preds: List[Dict[str, Any]],
    tasks: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Converts a list of predictions from local file references to task id.

    Args:
        preds: List of predictions.
        tasks: List of tasks.

    Returns:
        List of predictions using task ids as reference.
    """
    preds = [
        {
            "filename": quote(pred["filename"]).split("//")[1],
            "result": pred["result"],
        }
        for pred in preds
    ]
    filename_id_mapping = {
        clean_url(task["storage_filename"]): task["id"] for task in tasks
    }
    return [
        {
            "task": int(
                filename_id_mapping["/".join(pred["filename"].split("/")[1:])]
            ),
            "result": pred["result"],
        }
        for pred in preds
    ]
get_file_extension(path_str: str) -> str

Return the file extension of the given filename.

Parameters:

Name Type Description Default
path_str str

Path to the file.

required

Returns:

Type Description
str

File extension.

Source code in src/zenml/integrations/label_studio/label_studio_utils.py
def get_file_extension(path_str: str) -> str:
    """Return the file extension of the given filename.

    Args:
        path_str: Path to the file.

    Returns:
        File extension.
    """
    return os.path.splitext(urlparse(path_str).path)[1]
is_azure_url(url: str) -> bool

Return whether the given URL is an Azure URL.

Parameters:

Name Type Description Default
url str

URL to check.

required

Returns:

Type Description
bool

True if the URL is an Azure URL, False otherwise.

Source code in src/zenml/integrations/label_studio/label_studio_utils.py
def is_azure_url(url: str) -> bool:
    """Return whether the given URL is an Azure URL.

    Args:
        url: URL to check.

    Returns:
        True if the URL is an Azure URL, False otherwise.
    """
    return "blob.core.windows.net" in urlparse(url).netloc
is_gcs_url(url: str) -> bool

Return whether the given URL is a GCS URL.

Parameters:

Name Type Description Default
url str

URL to check.

required

Returns:

Type Description
bool

True if the URL is a GCS URL, False otherwise.

Source code in src/zenml/integrations/label_studio/label_studio_utils.py
def is_gcs_url(url: str) -> bool:
    """Return whether the given URL is an GCS URL.

    Args:
        url: URL to check.

    Returns:
        True if the URL is a GCS URL, False otherwise.
    """
    return "storage.googleapis.com" in urlparse(url).netloc
is_s3_url(url: str) -> bool

Return whether the given URL is an S3 URL.

Parameters:

Name Type Description Default
url str

URL to check.

required

Returns:

Type Description
bool

True if the URL is an S3 URL, False otherwise.

Source code in src/zenml/integrations/label_studio/label_studio_utils.py
def is_s3_url(url: str) -> bool:
    """Return whether the given URL is an S3 URL.

    Args:
        url: URL to check.

    Returns:
        True if the URL is an S3 URL, False otherwise.
    """
    return "s3.amazonaws" in urlparse(url).netloc

steps

Standard steps to be used with the Label Studio annotator integration.

Classes
Functions
Modules
label_studio_standard_steps

Implementation of standard steps for the Label Studio annotator integration.

Classes
LabelStudioDatasetSyncParameters

Bases: BaseModel

Step parameters when syncing data to Label Studio.

Attributes:

Name Type Description
storage_type str

The type of storage to sync to. Can be one of ["gcs", "s3", "azure", "local"]. Defaults to "local".

label_config_type str

The type of label config to use.

prefix Optional[str]

Specify the prefix within the cloud store to import your data from. For local storage, this is the full absolute path to the directory containing your data.

regex_filter Optional[str]

Specify a regex filter to filter the files to import.

use_blob_urls Optional[bool]

Specify whether your data is raw image or video data, or JSON tasks.

presign Optional[bool]

Specify whether or not to create presigned URLs.

presign_ttl Optional[int]

Specify how long to keep presigned URLs active.

description Optional[str]

Specify a description for the dataset.

azure_account_name Optional[str]

Specify the Azure account name to use for the storage.

azure_account_key Optional[str]

Specify the Azure account key to use for the storage.

google_application_credentials Optional[str]

Specify the file with Google application credentials to use for the storage.

aws_access_key_id Optional[str]

Specify the AWS access key ID to use for the storage.

aws_secret_access_key Optional[str]

Specify the AWS secret access key to use for the storage.

aws_session_token Optional[str]

Specify the AWS session token to use for the storage.

s3_region_name Optional[str]

Specify the S3 region name to use for the storage.

s3_endpoint Optional[str]

Specify the S3 endpoint to use for the storage.
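
As a minimal sketch, the parameters can be constructed directly for an S3-backed sync; the region, filter, and label config type values below are placeholders, and the storage credential fields can be left unset because the sync step shown further below fills them in from the active artifact store via populate_artifact_store_parameters:

from zenml.integrations.label_studio.steps.label_studio_standard_steps import (
    LabelStudioDatasetSyncParameters,
)

# Placeholder values for illustration only.
params = LabelStudioDatasetSyncParameters(
    storage_type="s3",                        # one of "gcs", "s3", "azure", "local"
    label_config_type="image_classification",  # placeholder label config type
    regex_filter=r".*\.png",
    use_blob_urls=True,
    presign=True,
    presign_ttl=30,
    s3_region_name="eu-west-1",
)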

Functions
get_labeled_data(dataset_name: str) -> List

Gets labeled data from the dataset.

Parameters:

Name Type Description Default
dataset_name str

Name of the dataset.

required

Returns:

Type Description
List

List of labeled data.

Raises:

Type Description
TypeError

If you are trying to use it with an annotator that is not Label Studio.

StackComponentInterfaceError

If no active annotator could be found.

Source code in src/zenml/integrations/label_studio/steps/label_studio_standard_steps.py
@step(enable_cache=False)
def get_labeled_data(dataset_name: str) -> List:  # type: ignore[type-arg]
    """Gets labeled data from the dataset.

    Args:
        dataset_name: Name of the dataset.

    Returns:
        List of labeled data.

    Raises:
        TypeError: If you are trying to use it with an annotator that is not
            Label Studio.
        StackComponentInterfaceError: If no active annotator could be found.
    """
    # TODO [MEDIUM]: have this check for new data *since the last time this step ran*
    annotator = Client().active_stack.annotator
    if not annotator:
        raise StackComponentInterfaceError("No active annotator.")
    from zenml.integrations.label_studio.annotators.label_studio_annotator import (
        LabelStudioAnnotator,
    )

    if not isinstance(annotator, LabelStudioAnnotator):
        raise TypeError(
            "This step can only be used with the Label Studio annotator."
        )
    if annotator._connection_available():
        dataset = annotator.get_dataset(dataset_name=dataset_name)
        return dataset.get_labeled_tasks()  # type: ignore[no-any-return]

    raise StackComponentInterfaceError(
        "Unable to connect to annotator stack component."
    )
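
A hedged sketch of using the step inside a pipeline once a Label Studio annotator is part of the active stack; the pipeline and dataset names are hypothetical:

from zenml import pipeline
from zenml.integrations.label_studio.steps.label_studio_standard_steps import (
    get_labeled_data,
)

@pipeline
def fetch_annotations_pipeline():
    # "my_image_dataset" is a placeholder dataset name.
    labeled_tasks = get_labeled_data(dataset_name="my_image_dataset")

fetch_annotations_pipeline()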
get_or_create_dataset(label_config: str, dataset_name: str) -> str

Gets preexisting dataset or creates a new one.

Parameters:

Name Type Description Default
label_config str

The label config to use for the annotation interface.

required
dataset_name str

Name of the dataset to register.

required

Returns:

Type Description
str

The dataset name.

Raises:

Type Description
TypeError

If you are trying to use it with an annotator that is not Label Studio.

StackComponentInterfaceError

If no active annotator could be found.

Source code in src/zenml/integrations/label_studio/steps/label_studio_standard_steps.py
@step(enable_cache=False)
def get_or_create_dataset(
    label_config: str,
    dataset_name: str,
) -> str:
    """Gets preexisting dataset or creates a new one.

    Args:
        label_config: The label config to use for the annotation interface.
        dataset_name: Name of the dataset to register.

    Returns:
        The dataset name.

    Raises:
        TypeError: If you are trying to use it with an annotator that is not
            Label Studio.
        StackComponentInterfaceError: If no active annotator could be found.
    """
    annotator = Client().active_stack.annotator
    from zenml.integrations.label_studio.annotators.label_studio_annotator import (
        LabelStudioAnnotator,
    )

    if not isinstance(annotator, LabelStudioAnnotator):
        raise TypeError(
            "This step can only be used with the Label Studio annotator."
        )

    if annotator and annotator._connection_available():
        for dataset in annotator.get_datasets():
            if dataset.get_params()["title"] == dataset_name:
                return cast(str, dataset.get_params()["title"])

        dataset = annotator.register_dataset_for_annotation(
            label_config=label_config,
            dataset_name=dataset_name,
        )
        return cast(str, dataset.get_params()["title"])

    raise StackComponentInterfaceError("No active annotator.")
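
A minimal, hypothetical sketch of registering a dataset with a simple image-classification labeling config; the XML config and the dataset name are illustrative only:

from zenml import pipeline
from zenml.integrations.label_studio.steps.label_studio_standard_steps import (
    get_or_create_dataset,
)

# Illustrative Label Studio labeling config for image classification.
LABEL_CONFIG = """
<View>
  <Image name="image" value="$image"/>
  <Choices name="choice" toName="image">
    <Choice value="cat"/>
    <Choice value="dog"/>
  </Choices>
</View>
"""

@pipeline
def register_dataset_pipeline():
    dataset_name = get_or_create_dataset(
        label_config=LABEL_CONFIG,
        dataset_name="my_image_dataset",  # placeholder name
    )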
sync_new_data_to_label_studio(uri: str, dataset_name: str, predictions: List[Dict[str, Any]], params: LabelStudioDatasetSyncParameters) -> None

Syncs new data to Label Studio.

Parameters:

Name Type Description Default
uri str

The URI of the data to sync.

required
dataset_name str

The name of the dataset to sync to.

required
predictions List[Dict[str, Any]]

The predictions to sync.

required
params LabelStudioDatasetSyncParameters

The parameters for the sync.

required

Raises:

Type Description
TypeError

If you are trying to use it with an annotator that is not Label Studio.

ValueError

If you are trying to sync from outside ZenML.

StackComponentInterfaceError

If no active annotator could be found.

Source code in src/zenml/integrations/label_studio/steps/label_studio_standard_steps.py
@step(enable_cache=False)
def sync_new_data_to_label_studio(
    uri: str,
    dataset_name: str,
    predictions: List[Dict[str, Any]],
    params: LabelStudioDatasetSyncParameters,
) -> None:
    """Syncs new data to Label Studio.

    Args:
        uri: The URI of the data to sync.
        dataset_name: The name of the dataset to sync to.
        predictions: The predictions to sync.
        params: The parameters for the sync.

    Raises:
        TypeError: If you are trying to use it with an annotator that is not
            Label Studio.
        ValueError: If you are trying to sync from outside ZenML.
        StackComponentInterfaceError: If no active annotator could be found.
    """
    stack = Client().active_stack
    annotator = stack.annotator
    artifact_store = stack.artifact_store
    if not annotator or not artifact_store:
        raise StackComponentInterfaceError(
            "An active annotator and artifact store are required to run this step."
        )

    from zenml.integrations.label_studio.annotators.label_studio_annotator import (
        LabelStudioAnnotator,
    )

    if not isinstance(annotator, LabelStudioAnnotator):
        raise TypeError(
            "This step can only be used with the Label Studio annotator."
        )

    # TODO: check that annotator is connected before querying it
    dataset = annotator.get_dataset(dataset_name=dataset_name)
    if not uri.startswith(artifact_store.path):
        raise ValueError(
            "ZenML only currently supports syncing data passed from other ZenML steps and via the Artifact Store."
        )
    # removes the initial forward slash from the prefix attribute by slicing
    params.prefix = urlparse(uri).path.lstrip("/")
    base_uri = urlparse(uri).netloc

    annotator.populate_artifact_store_parameters(
        params=params, artifact_store=artifact_store
    )

    if annotator and annotator._connection_available():
        # TODO: get existing (CHECK!) or create the sync connection
        annotator.connect_and_sync_external_storage(
            uri=base_uri,
            params=params,
            dataset=dataset,
        )
        if predictions:
            preds_with_task_ids = convert_pred_filenames_to_task_ids(
                predictions,
                dataset.tasks,
            )
            # TODO: filter out any predictions that exist + have already been
            # made (maybe?). Only pass in preds for tasks without pre-annotations.
            dataset.create_predictions(preds_with_task_ids)
    else:
        raise StackComponentInterfaceError("No active annotator.")
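
Putting the steps together, here is a hedged sketch of wiring the sync step behind a step that writes data to the artifact store; the upstream export_images step, its output URI, the placeholder label config, and the (empty) predictions payload are assumptions for illustration only:

from zenml import pipeline, step
from zenml.integrations.label_studio.steps.label_studio_standard_steps import (
    LabelStudioDatasetSyncParameters,
    get_or_create_dataset,
    sync_new_data_to_label_studio,
)

@step
def export_images() -> str:
    # Hypothetical step: writes images under the active artifact store
    # and returns the URI of the directory they were written to.
    ...

@pipeline
def annotation_sync_pipeline():
    dataset_name = get_or_create_dataset(
        label_config="<View>...</View>",  # placeholder label config
        dataset_name="my_image_dataset",
    )
    uri = export_images()
    sync_new_data_to_label_studio(
        uri=uri,
        dataset_name=dataset_name,
        predictions=[],  # or model pre-annotations in Label Studio task format
        params=LabelStudioDatasetSyncParameters(storage_type="s3"),
    )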