Spark
zenml.integrations.spark
The Spark integration module to enable distributed processing for steps.
Attributes
SPARK = 'spark'
module-attribute
SPARK_KUBERNETES_STEP_OPERATOR = 'spark-kubernetes'
module-attribute
Classes
Flavor
Class for ZenML Flavors.
Attributes
config_class: Type[StackComponentConfig]
abstractmethod
property
Returns the StackComponentConfig config class.
Returns:
Type | Description |
---|---|
Type[StackComponentConfig] | The config class. |
config_schema: Dict[str, Any]
property
The config schema for a flavor.
Returns:
Type | Description |
---|---|
Dict[str, Any] | The config schema. |
docs_url: Optional[str]
property
A url to point at docs explaining this flavor.
Returns:
Type | Description |
---|---|
Optional[str] | A flavor docs url. |
implementation_class: Type[StackComponent]
abstractmethod
property
Implementation class for this flavor.
Returns:
Type | Description |
---|---|
Type[StackComponent] | The implementation class for this flavor. |
logo_url: Optional[str]
property
A url to represent the flavor in the dashboard.
Returns:
Type | Description |
---|---|
Optional[str] | The flavor logo. |
name: str
abstractmethod
property
The flavor name.
Returns:
Type | Description |
---|---|
str | The flavor name. |
sdk_docs_url: Optional[str]
property
A url to point at SDK docs explaining this flavor.
Returns:
Type | Description |
---|---|
Optional[str] | A flavor SDK docs url. |
service_connector_requirements: Optional[ServiceConnectorRequirements]
property
Service connector resource requirements for service connectors.
Specifies resource requirements that are used to filter the available service connector types that are compatible with this flavor.
Returns:
Type | Description |
---|---|
Optional[ServiceConnectorRequirements] | Requirements for compatible service connectors, if a service connector is required for this flavor. |
type: StackComponentType
abstractmethod
property
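The abstract properties above form the contract that a concrete flavor has to fulfil. The following is a minimal sketch of that pattern using stand-in classes: `SketchFlavor`, `SketchConfig`, and `SketchStepOperator` are hypothetical names that are not part of ZenML, the `type` property is left out for brevity, and the real `Flavor` base class carries more behavior than shown here.

```python
from abc import ABC, abstractmethod
from typing import Optional, Type


class SketchConfig:
    """Hypothetical stand-in for a StackComponentConfig subclass."""


class SketchStepOperator:
    """Hypothetical stand-in for a StackComponent subclass."""


class SketchFlavor(ABC):
    """Hypothetical stand-in mirroring some abstract properties of Flavor."""

    @property
    @abstractmethod
    def name(self) -> str:
        """The flavor name."""

    @property
    @abstractmethod
    def config_class(self) -> Type[SketchConfig]:
        """The config class."""

    @property
    @abstractmethod
    def implementation_class(self) -> Type[SketchStepOperator]:
        """The implementation class."""

    @property
    def docs_url(self) -> Optional[str]:
        """Non-abstract properties can fall back to a default."""
        return None


class SketchSparkFlavor(SketchFlavor):
    """Concrete flavor: every abstract property is overridden."""

    @property
    def name(self) -> str:
        # Same value as the SPARK_KUBERNETES_STEP_OPERATOR attribute.
        return "spark-kubernetes"

    @property
    def config_class(self) -> Type[SketchConfig]:
        return SketchConfig

    @property
    def implementation_class(self) -> Type[SketchStepOperator]:
        return SketchStepOperator


flavor = SketchSparkFlavor()
```

Instantiating `SketchFlavor` directly raises a `TypeError`, which is how the abstract properties force subclasses to provide a name, a config class, and an implementation class.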
Functions
from_model(flavor_model: FlavorResponse) -> Flavor
classmethod
Loads a flavor from a model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
flavor_model | FlavorResponse | The model to load from. | required |
Raises:
Type | Description |
---|---|
CustomFlavorImportError | If the custom flavor can't be imported. |
ImportError | If the flavor can't be imported. |
Returns:
Type | Description |
---|---|
Flavor | The loaded flavor. |
Source code in src/zenml/stack/flavor.py
generate_default_docs_url() -> str
Generate the doc urls for all inbuilt and integration flavors.
Note that this method is not going to be useful for custom flavors, which do not have any docs in the main zenml docs.
Returns:
Type | Description |
---|---|
str | The complete url to the zenml documentation |
Source code in src/zenml/stack/flavor.py
generate_default_sdk_docs_url() -> str
Generate SDK docs url for a flavor.
Returns:
Type | Description |
---|---|
str | The complete url to the zenml SDK docs |
Source code in src/zenml/stack/flavor.py
to_model(integration: Optional[str] = None, is_custom: bool = True) -> FlavorRequest
Converts a flavor to a model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
integration | Optional[str] | The integration to use for the model. | None |
is_custom | bool | Whether the flavor is a custom flavor. | True |
Returns:
Type | Description |
---|---|
FlavorRequest | The model. |
Source code in src/zenml/stack/flavor.py
Integration
Base class for integrations in ZenML.
Functions
activate() -> None
classmethod
Abstract method to activate the integration.
Source code in src/zenml/integrations/integration.py
check_installation() -> bool
classmethod
Method to check whether the required packages are installed.
Returns:
Type | Description |
---|---|
bool | True if all required packages are installed, False otherwise. |
Source code in src/zenml/integrations/integration.py
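An installation check of this kind can be approximated with the standard library. The sketch below is an assumption about the general approach, not ZenML's implementation: `all_packages_installed` is a hypothetical helper that takes bare distribution names and, unlike a full requirements check, ignores version constraints.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import List


def all_packages_installed(package_names: List[str]) -> bool:
    """Return True only if every named distribution is installed."""
    for package_name in package_names:
        try:
            # version() raises PackageNotFoundError for missing distributions.
            version(package_name)
        except PackageNotFoundError:
            return False
    return True
```

An empty list of requirements counts as satisfied, so integrations without extra packages would pass this check.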
flavors() -> List[Type[Flavor]]
classmethod
Abstract method to declare new stack component flavors.
Returns:
Type | Description |
---|---|
List[Type[Flavor]] | A list of new stack component flavors. |
Source code in src/zenml/integrations/integration.py
get_requirements(target_os: Optional[str] = None, python_version: Optional[str] = None) -> List[str]
classmethod
Method to get the requirements for the integration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target_os | Optional[str] | The target operating system to get the requirements for. | None |
python_version | Optional[str] | The Python version to use for the requirements. | None |
Returns:
Type | Description |
---|---|
List[str] | A list of requirements. |
Source code in src/zenml/integrations/integration.py
get_uninstall_requirements(target_os: Optional[str] = None) -> List[str]
classmethod
Method to get the uninstall requirements for the integration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target_os | Optional[str] | The target operating system to get the requirements for. | None |
Returns:
Type | Description |
---|---|
List[str] | A list of requirements. |
Source code in src/zenml/integrations/integration.py
plugin_flavors() -> List[Type[BasePluginFlavor]]
classmethod
Abstract method to declare new plugin flavors.
Returns:
Type | Description |
---|---|
List[Type[BasePluginFlavor]] | A list of new plugin flavors. |
Source code in src/zenml/integrations/integration.py
SparkIntegration
Bases: Integration
Definition of Spark integration for ZenML.
Functions
activate() -> None
classmethod
Activates the corresponding Spark materializers.
Source code in src/zenml/integrations/spark/__init__.py
flavors() -> List[Type[Flavor]]
classmethod
Declare the stack component flavors for the Spark integration.
Returns:
Type | Description |
---|---|
List[Type[Flavor]] | The flavor wrapper for the step operator flavor |
Source code in src/zenml/integrations/spark/__init__.py
StackComponentType
Bases: StrEnum
All possible types a StackComponent can have.
Attributes
plural: str
property
Returns the plural of the enum value.
Returns:
Type | Description |
---|---|
str | The plural of the enum value. |
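A property such as `plural` is commonly implemented directly on the enum. The sketch below uses a hypothetical `SketchComponentType` with two illustrative members; the member names, their values, and the pluralization rule are assumptions made for illustration, not ZenML's actual definitions.

```python
from enum import Enum


class SketchComponentType(str, Enum):
    """Hypothetical string-valued enum standing in for StackComponentType."""

    STEP_OPERATOR = "step_operator"
    CONTAINER_REGISTRY = "container_registry"

    @property
    def plural(self) -> str:
        """Return a simple English plural of the enum value."""
        value = self.value
        if value.endswith("y"):
            return value[:-1] + "ies"
        return value + "s"
```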
Modules
flavors
Spark integration flavors.
Classes
KubernetesSparkStepOperatorConfig(warn_about_plain_text_secrets: bool = False, **kwargs: Any)
Bases: SparkStepOperatorConfig
Config for the Kubernetes Spark step operator.
Attributes:
Name | Type | Description |
---|---|---|
namespace | Optional[str] | The namespace under which the driver and executor pods will run. |
service_account | Optional[str] | The service account that will be used by various Spark components (to create and watch the pods). |
Source code in src/zenml/stack/stack_component.py
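On Kubernetes, values like `namespace` and `service_account` usually end up as Spark configuration properties. The sketch below shows one plausible mapping. `kubernetes_spark_conf` is a hypothetical helper, and whether ZenML sets exactly these properties is an assumption; the two property names themselves are standard Spark-on-Kubernetes settings.

```python
from typing import Dict, Optional


def kubernetes_spark_conf(
    namespace: Optional[str] = None,
    service_account: Optional[str] = None,
) -> Dict[str, str]:
    """Translate optional Kubernetes settings into Spark properties."""
    conf: Dict[str, str] = {}
    if namespace:
        # Namespace in which driver and executor pods are created.
        conf["spark.kubernetes.namespace"] = namespace
    if service_account:
        # Service account the driver uses to create and watch executor pods.
        conf[
            "spark.kubernetes.authenticate.driver.serviceAccountName"
        ] = service_account
    return conf
```

Leaving both values unset yields an empty dict, in which case Spark falls back to its own defaults.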
KubernetesSparkStepOperatorFlavor
Bases: SparkStepOperatorFlavor
Flavor for the Kubernetes Spark step operator.
config_class: Type[KubernetesSparkStepOperatorConfig]
property
Returns the KubernetesSparkStepOperatorConfig config class.
Returns:
Type | Description |
---|---|
Type[KubernetesSparkStepOperatorConfig] | The config class. |
docs_url: Optional[str]
property
A url to point at docs explaining this flavor.
Returns:
Type | Description |
---|---|
Optional[str] | A flavor docs url. |
implementation_class: Type[KubernetesSparkStepOperator]
property
Implementation class for this flavor.
Returns:
Type | Description |
---|---|
Type[KubernetesSparkStepOperator] | The implementation class. |
logo_url: str
property
A url to represent the flavor in the dashboard.
Returns:
Type | Description |
---|---|
str | The flavor logo. |
name: str
property
Name of the flavor.
Returns:
Type | Description |
---|---|
str | The name of the flavor. |
sdk_docs_url: Optional[str]
property
A url to point at SDK docs explaining this flavor.
Returns:
Type | Description |
---|---|
Optional[str] | A flavor SDK docs url. |
SparkStepOperatorConfig(warn_about_plain_text_secrets: bool = False, **kwargs: Any)
Bases: BaseStepOperatorConfig, SparkStepOperatorSettings
Spark step operator config.
Attributes:
Name | Type | Description |
---|---|---|
master | str | The master URL for the cluster. Spark supports different URL schemes for different cluster managers, such as Mesos, YARN, or Kubernetes. This implementation currently supports Kubernetes as the cluster manager. |
Source code in src/zenml/stack/stack_component.py
SparkStepOperatorFlavor
Bases: BaseStepOperatorFlavor
Spark step operator flavor.
config_class: Type[SparkStepOperatorConfig]
property
Returns the SparkStepOperatorConfig config class.
Returns:
Type | Description |
---|---|
Type[SparkStepOperatorConfig] | The config class. |
docs_url: Optional[str]
property
A url to point at docs explaining this flavor.
Returns:
Type | Description |
---|---|
Optional[str] | A flavor docs url. |
implementation_class: Type[SparkStepOperator]
property
Implementation class for this flavor.
Returns:
Type | Description |
---|---|
Type[SparkStepOperator] | The implementation class. |
name: str
property
Name of the flavor.
Returns:
Type | Description |
---|---|
str | The name of the flavor. |
sdk_docs_url: Optional[str]
property
A url to point at SDK docs explaining this flavor.
Returns:
Type | Description |
---|---|
Optional[str] | A flavor SDK docs url. |
Modules
spark_on_kubernetes_step_operator_flavor
Spark on Kubernetes step operator flavor.
KubernetesSparkStepOperatorConfig(warn_about_plain_text_secrets: bool = False, **kwargs: Any)
Bases: SparkStepOperatorConfig
Config for the Kubernetes Spark step operator.
Attributes:
Name | Type | Description |
---|---|---|
namespace | Optional[str] | The namespace under which the driver and executor pods will run. |
service_account | Optional[str] | The service account that will be used by various Spark components (to create and watch the pods). |
Source code in src/zenml/stack/stack_component.py
KubernetesSparkStepOperatorFlavor
Bases: SparkStepOperatorFlavor
Flavor for the Kubernetes Spark step operator.
config_class: Type[KubernetesSparkStepOperatorConfig]
property
Returns the KubernetesSparkStepOperatorConfig config class.
Returns:
Type | Description |
---|---|
Type[KubernetesSparkStepOperatorConfig] | The config class. |
docs_url: Optional[str]
property
A url to point at docs explaining this flavor.
Returns:
Type | Description |
---|---|
Optional[str] | A flavor docs url. |
implementation_class: Type[KubernetesSparkStepOperator]
property
Implementation class for this flavor.
Returns:
Type | Description |
---|---|
Type[KubernetesSparkStepOperator] | The implementation class. |
logo_url: str
property
A url to represent the flavor in the dashboard.
Returns:
Type | Description |
---|---|
str | The flavor logo. |
name: str
property
Name of the flavor.
Returns:
Type | Description |
---|---|
str | The name of the flavor. |
sdk_docs_url: Optional[str]
property
A url to point at SDK docs explaining this flavor.
Returns:
Type | Description |
---|---|
Optional[str] | A flavor SDK docs url. |
spark_step_operator_flavor
Spark step operator flavor.
SparkStepOperatorConfig(warn_about_plain_text_secrets: bool = False, **kwargs: Any)
Bases: BaseStepOperatorConfig, SparkStepOperatorSettings
Spark step operator config.
Attributes:
Name | Type | Description |
---|---|---|
master | str | The master URL for the cluster. Spark supports different URL schemes for different cluster managers, such as Mesos, YARN, or Kubernetes. This implementation currently supports Kubernetes as the cluster manager. |
Source code in src/zenml/stack/stack_component.py
SparkStepOperatorFlavor
Bases: BaseStepOperatorFlavor
Spark step operator flavor.
config_class: Type[SparkStepOperatorConfig]
property
Returns the SparkStepOperatorConfig config class.
Returns:
Type | Description |
---|---|
Type[SparkStepOperatorConfig] | The config class. |
docs_url: Optional[str]
property
A url to point at docs explaining this flavor.
Returns:
Type | Description |
---|---|
Optional[str] | A flavor docs url. |
implementation_class: Type[SparkStepOperator]
property
Implementation class for this flavor.
Returns:
Type | Description |
---|---|
Type[SparkStepOperator] | The implementation class. |
name: str
property
Name of the flavor.
Returns:
Type | Description |
---|---|
str | The name of the flavor. |
sdk_docs_url: Optional[str]
property
A url to point at SDK docs explaining this flavor.
Returns:
Type | Description |
---|---|
Optional[str] | A flavor SDK docs url. |
SparkStepOperatorSettings(warn_about_plain_text_secrets: bool = False, **kwargs: Any)
Bases: BaseSettings
Spark step operator settings.
Attributes:
Name | Type | Description |
---|---|---|
deploy_mode | str | Either 'cluster' (default) or 'client'; decides where the driver node of the application will run. |
submit_kwargs | Optional[Dict[str, Any]] | The JSON string of a dict used to define additional parameters if required (Spark has many parameters, so not all of them are exposed on the step operator). |
Source code in src/zenml/config/secret_reference_mixin.py
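The two settings can be illustrated with a small normalization step. This is a sketch under stated assumptions, not the validation ZenML performs: `normalize_spark_settings` is a hypothetical helper that accepts `submit_kwargs` either as a dict or as the JSON string of a dict and rejects any `deploy_mode` other than the two documented values.

```python
import json
from typing import Any, Dict, Optional, Union

VALID_DEPLOY_MODES = ("cluster", "client")


def normalize_spark_settings(
    deploy_mode: str = "cluster",
    submit_kwargs: Optional[Union[str, Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Validate deploy_mode and decode submit_kwargs into a dict."""
    if deploy_mode not in VALID_DEPLOY_MODES:
        raise ValueError(f"deploy_mode must be one of {VALID_DEPLOY_MODES}")
    if submit_kwargs is None:
        extra: Any = {}
    elif isinstance(submit_kwargs, str):
        # The documented form: a JSON string that encodes a dict.
        extra = json.loads(submit_kwargs)
    else:
        extra = dict(submit_kwargs)
    if not isinstance(extra, dict):
        raise ValueError("submit_kwargs must decode to a dict")
    return {"deploy_mode": deploy_mode, "submit_kwargs": extra}
```

For example, `normalize_spark_settings(submit_kwargs='{"spark.executor.memory": "2g"}')` keeps the default `cluster` deploy mode and exposes the extra parameter as a regular dict entry.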
materializers
Spark Materializers.
Classes
Modules
spark_dataframe_materializer
Implementation of the Spark Dataframe Materializer.
SparkDataFrameMaterializer(uri: str, artifact_store: Optional[BaseArtifactStore] = None)
Bases: BaseMaterializer
Materializer to read/write Spark dataframes.
Source code in src/zenml/materializers/base_materializer.py
extract_metadata(df: DataFrame) -> Dict[str, MetadataType]
Extract metadata from the given DataFrame object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | The DataFrame object to extract metadata from. | required |
Returns:
Type | Description |
---|---|
Dict[str, MetadataType] | The extracted metadata as a dictionary. |
Source code in src/zenml/integrations/spark/materializers/spark_dataframe_materializer.py
load(data_type: Type[Any]) -> DataFrame
Reads and returns a spark dataframe.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_type | Type[Any] | The type of the data to read. | required |
Returns:
Type | Description |
---|---|
DataFrame | A loaded spark dataframe. |
Source code in src/zenml/integrations/spark/materializers/spark_dataframe_materializer.py
save(df: DataFrame) -> None
Writes a spark dataframe.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | A spark dataframe object. | required |
Source code in src/zenml/integrations/spark/materializers/spark_dataframe_materializer.py
spark_model_materializer
Implementation of the Spark Model Materializer.
SparkModelMaterializer(uri: str, artifact_store: Optional[BaseArtifactStore] = None)
Bases: BaseMaterializer
Materializer to read/write Spark models.
Source code in src/zenml/materializers/base_materializer.py
load(model_type: Type[Any]) -> Union[Transformer, Estimator, Model]
Reads and returns a Spark ML model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_type | Type[Any] | The type of the model to read. | required |
Returns:
Type | Description |
---|---|
Union[Transformer, Estimator, Model] | A loaded spark model. |
Source code in src/zenml/integrations/spark/materializers/spark_model_materializer.py
save(model: Union[Transformer, Estimator, Model]) -> None
Writes a spark model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Union[Transformer, Estimator, Model] | A spark model. | required |
Source code in src/zenml/integrations/spark/materializers/spark_model_materializer.py
step_operators
Spark Step Operators.
Classes
KubernetesSparkStepOperator(name: str, id: UUID, config: StackComponentConfig, flavor: str, type: StackComponentType, user: Optional[UUID], created: datetime, updated: datetime, labels: Optional[Dict[str, Any]] = None, connector_requirements: Optional[ServiceConnectorRequirements] = None, connector: Optional[UUID] = None, connector_resource_id: Optional[str] = None, *args: Any, **kwargs: Any)
Bases: SparkStepOperator
Step operator which runs Steps with Spark on Kubernetes.
Source code in src/zenml/stack/stack_component.py
application_path: Any
property
Provides the application path in the corresponding docker image.
Returns:
Type | Description |
---|---|
Any | The path to the application entrypoint within the docker image |
config: KubernetesSparkStepOperatorConfig
property
Returns the KubernetesSparkStepOperatorConfig config.
Returns:
Type | Description |
---|---|
KubernetesSparkStepOperatorConfig | The configuration. |
validator: Optional[StackValidator]
property
Validates the stack.
Returns:
Type | Description |
---|---|
Optional[StackValidator] | A validator that checks that the stack contains a remote container registry and a remote artifact store. |
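The check described by this validator can be sketched as a plain function. Everything in the block is hypothetical: `SketchComponent`, its `is_remote` flag, and `validate_remote_stack` are stand-ins chosen for illustration, and the `(is_valid, message)` return shape is an assumption rather than a statement about the `StackValidator` API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SketchComponent:
    """Hypothetical stand-in for a stack component."""

    name: str
    is_remote: bool


def validate_remote_stack(
    container_registry: SketchComponent,
    artifact_store: SketchComponent,
) -> Tuple[bool, str]:
    """Require both the container registry and the artifact store to be remote."""
    if not container_registry.is_remote:
        return False, (
            f"Container registry '{container_registry.name}' is local, "
            "but Spark on Kubernetes needs a remote one."
        )
    if not artifact_store.is_remote:
        return False, (
            f"Artifact store '{artifact_store.name}' is local, "
            "but Spark on Kubernetes needs a remote one."
        )
    return True, ""
```

The reasoning behind such a check is that driver and executor pods run inside the cluster, so they can only pull images and read artifacts from locations reachable from there.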
get_docker_builds(deployment: PipelineDeploymentBase) -> List[BuildConfiguration]
Gets the Docker builds required for the component.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
deployment | PipelineDeploymentBase | The pipeline deployment for which to get the builds. | required |
Returns:
Type | Description |
---|---|
List[BuildConfiguration] | The required Docker builds. |
Source code in src/zenml/integrations/spark/step_operators/kubernetes_step_operator.py
Modules
kubernetes_step_operator
Implementation of the Kubernetes Spark Step Operator.
KubernetesSparkStepOperator(name: str, id: UUID, config: StackComponentConfig, flavor: str, type: StackComponentType, user: Optional[UUID], created: datetime, updated: datetime, labels: Optional[Dict[str, Any]] = None, connector_requirements: Optional[ServiceConnectorRequirements] = None, connector: Optional[UUID] = None, connector_resource_id: Optional[str] = None, *args: Any, **kwargs: Any)
Bases: SparkStepOperator
Step operator which runs Steps with Spark on Kubernetes.
Source code in src/zenml/stack/stack_component.py
application_path: Any
property
Provides the application path in the corresponding docker image.
Returns:
Type | Description |
---|---|
Any | The path to the application entrypoint within the docker image |
config: KubernetesSparkStepOperatorConfig
property
Returns the KubernetesSparkStepOperatorConfig config.
Returns:
Type | Description |
---|---|
KubernetesSparkStepOperatorConfig | The configuration. |
validator: Optional[StackValidator]
property
Validates the stack.
Returns:
Type | Description |
---|---|
Optional[StackValidator] | A validator that checks that the stack contains a remote container registry and a remote artifact store. |
get_docker_builds(deployment: PipelineDeploymentBase) -> List[BuildConfiguration]
Gets the Docker builds required for the component.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
deployment | PipelineDeploymentBase | The pipeline deployment for which to get the builds. | required |
Returns:
Type | Description |
---|---|
List[BuildConfiguration] | The required Docker builds. |
Source code in src/zenml/integrations/spark/step_operators/kubernetes_step_operator.py
spark_entrypoint_configuration
Spark step operator entrypoint configuration.
SparkEntrypointConfiguration(arguments: List[str])
Bases: StepOperatorEntrypointConfiguration
Entrypoint configuration for the Spark step operator.
Source code in src/zenml/entrypoints/base_entrypoint_configuration.py
run() -> None
Runs the entrypoint configuration.
This prepends the directory containing the source files to the python path so that spark can find them.
Source code in src/zenml/integrations/spark/step_operators/spark_entrypoint_configuration.py
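Prepending a directory to the Python path can be done with the standard library alone. The sketch below shows the general technique; `prepend_to_python_path` is a hypothetical helper, and how ZenML determines the source directory is not covered here.

```python
import os
import sys


def prepend_to_python_path(source_dir: str) -> str:
    """Put source_dir at the front of sys.path and return its absolute path."""
    path = os.path.abspath(source_dir)
    if path in sys.path:
        # Drop the existing entry so the directory is not listed twice.
        sys.path.remove(path)
    sys.path.insert(0, path)
    return path
```

Placing the directory first means modules found there take precedence over identically named modules elsewhere on the path.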
spark_step_operator
Implementation of the Spark Step Operator.
SparkStepOperator(name: str, id: UUID, config: StackComponentConfig, flavor: str, type: StackComponentType, user: Optional[UUID], created: datetime, updated: datetime, labels: Optional[Dict[str, Any]] = None, connector_requirements: Optional[ServiceConnectorRequirements] = None, connector: Optional[UUID] = None, connector_resource_id: Optional[str] = None, *args: Any, **kwargs: Any)
Bases: BaseStepOperator
Base class for all Spark-related step operators.
Source code in src/zenml/stack/stack_component.py
application_path: Optional[str]
property
Optional method for providing the application path.
This is especially critical when using 'spark-submit' as it defines the path (to the application in the environment where Spark is running) which is used within the command.
For more information on how to set this property please check:
https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management
Returns:
Type | Description |
---|---|
Optional[str] | The path to the application entrypoint |
config: SparkStepOperatorConfig
property
Returns the SparkStepOperatorConfig config.
Returns:
Type | Description |
---|---|
SparkStepOperatorConfig | The configuration. |
settings_class: Optional[Type[BaseSettings]]
property
Settings class for the Spark step operator.
Returns:
Type | Description |
---|---|
Optional[Type[BaseSettings]] | The settings class. |
launch(info: StepRunInfo, entrypoint_command: List[str], environment: Dict[str, str]) -> None
Launches a step on Spark.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
info | StepRunInfo | Information about the step run. | required |
entrypoint_command | List[str] | Command that executes the step. | required |
environment | Dict[str, str] | Environment variables to set in the step operator environment. | required |
Source code in src/zenml/integrations/spark/step_operators/spark_step_operator.py
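Since the application path is used within a `spark-submit` command, launching a step plausibly comes down to assembling such a command from the master URL, the deploy mode, extra configuration, and the entrypoint arguments. The sketch below is an assumption about that general shape and not the code in `launch`: `build_spark_submit_command` is a hypothetical helper, and the master URL and application path in the usage lines are placeholders. The `--master`, `--deploy-mode`, and `--conf` options are standard `spark-submit` flags.

```python
from typing import Dict, List


def build_spark_submit_command(
    master: str,
    deploy_mode: str,
    application_path: str,
    conf: Dict[str, str],
    entrypoint_args: List[str],
) -> List[str]:
    """Assemble a spark-submit argument list without running it."""
    command = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    # Each Spark property is passed as its own --conf key=value pair.
    for key in sorted(conf):
        command += ["--conf", f"{key}={conf[key]}"]
    # The application comes after all options; its arguments follow it.
    command.append(application_path)
    command += entrypoint_args
    return command


command = build_spark_submit_command(
    master="k8s://https://example.invalid:6443",  # placeholder master URL
    deploy_mode="cluster",
    application_path="local:///app/entrypoint.py",  # placeholder path
    conf={"spark.kubernetes.namespace": "spark-jobs"},
    entrypoint_args=["--step_name", "trainer"],
)
```

Building the command as a list rather than a single string avoids shell quoting problems when it is later handed to a process runner.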