Skypilot
zenml.integrations.skypilot
Modules
flavors
Modules
skypilot_orchestrator_base_vm_config
Skypilot orchestrator base config and settings.
SkypilotBaseOrchestratorConfig(warn_about_plain_text_secrets: bool = False, **kwargs: Any)
Bases: BaseOrchestratorConfig, SkypilotBaseOrchestratorSettings
Skypilot orchestrator base config.
Attributes:
Name | Type | Description |
---|---|---|
disable_step_based_settings | bool | Whether to disable step-based settings. If True, the orchestrator runs all steps with the pipeline settings in a single VM. If False, the orchestrator runs each step with its own settings in a separate VM, if step settings are provided. |
Source code in src/zenml/stack/stack_component.py
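The effect of disable_step_based_settings can be illustrated with a minimal sketch. The function plan_settings and its arguments are hypothetical and not part of ZenML. The sketch also assumes that a step without its own settings falls back to the pipeline settings, which the description above does not state.

```python
from typing import Dict, Optional

Settings = Dict[str, object]


def plan_settings(
    pipeline_settings: Settings,
    step_settings: Dict[str, Optional[Settings]],
    disable_step_based_settings: bool,
) -> Dict[str, Settings]:
    """Return the settings each step would run with (illustrative only)."""
    plan: Dict[str, Settings] = {}
    for step_name, own in step_settings.items():
        if disable_step_based_settings or own is None:
            # Assumption: fall back to the pipeline-level settings.
            plan[step_name] = pipeline_settings
        else:
            # Step-specific settings win when step-based settings are enabled.
            plan[step_name] = own
    return plan


pipeline = {"cpus": 2}
steps = {"load": None, "train": {"cpus": 8, "memory": 32}}
shared = plan_settings(pipeline, steps, disable_step_based_settings=True)
per_step = plan_settings(pipeline, steps, disable_step_based_settings=False)
```

With the flag set to True every step maps to the pipeline settings; with False only the train step keeps its own values.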
is_local: bool
property
Checks if this stack component is running locally.
Returns:
Type | Description |
---|---|
bool | True if this config is for a local component, False otherwise. |
supports_client_side_caching: bool
property
Whether the orchestrator supports client side caching.
Returns:
Type | Description |
---|---|
bool | Whether the orchestrator supports client side caching. |
SkypilotBaseOrchestratorSettings(warn_about_plain_text_secrets: bool = False, **kwargs: Any)
Bases: BaseSettings
Skypilot orchestrator base settings.
Attributes:
Name | Type | Description |
---|---|---|
instance_type | Optional[str] | the instance type to use. |
cpus | Union[None, int, float, str] | the number of CPUs required for the task. If a str, must be a string of the form |
memory | Union[None, int, float, str] | the amount of memory in GiB required. If a str, must be a string of the form |
accelerators | Union[None, str, Dict[str, int]] | the accelerators required. If a str, must be a string of the form |
accelerator_args | Optional[Dict[str, str]] | accelerator-specific arguments. For example, |
use_spot | Optional[bool] | whether to use spot instances. If None, defaults to False. |
job_recovery | Optional[str] | the spot recovery strategy to use for the managed spot to recover the cluster from preemption. Refer to |
region | Optional[str] | the region to use. |
zone | Optional[str] | the zone to use. |
image_id | Union[Dict[str, str], str, None] | the image ID to use. If a str, must be a string of the image id from the cloud, such as AWS: |
disk_size | Optional[int] | the size of the OS disk in GiB. |
disk_tier | Optional[Literal['high', 'medium', 'low']] | the disk performance tier to use. If None, defaults to |
cluster_name | Optional[str] | name of the cluster to create/reuse. If None, auto-generate a name. |
retry_until_up | bool | whether to retry launching the cluster until it is up. |
idle_minutes_to_autostop | Optional[int] | automatically stop the cluster after this many minutes of idleness, i.e., no running or pending jobs in the cluster's job queue. Idleness gets reset whenever setting-up/running/pending jobs are found in the job queue. Setting this flag is equivalent to running |
down | bool | Tear down the cluster after all jobs finish (successfully or abnormally). If --idle-minutes-to-autostop is also set, the cluster will be torn down after the specified idle time. Note that if errors occur during provisioning/data syncing/setting up, the cluster will not be torn down for debugging purposes. |
stream_logs | bool | if True, show the logs in the terminal. |
docker_run_args | List[str] | Optional arguments to pass to the |
Source code in src/zenml/config/secret_reference_mixin.py
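As a rough illustration of the attributes above, the following sketch checks a plain settings mapping against the field names and the disk_tier literals listed in the table. The helper check_settings is hypothetical and not part of ZenML; it only shows which keys and values the table allows, not how the settings class itself validates input.

```python
from typing import Any, Dict, List

# Field names copied from the attribute table above.
KNOWN_FIELDS = {
    "instance_type", "cpus", "memory", "accelerators", "accelerator_args",
    "use_spot", "job_recovery", "region", "zone", "image_id", "disk_size",
    "disk_tier", "cluster_name", "retry_until_up",
    "idle_minutes_to_autostop", "down", "stream_logs", "docker_run_args",
}
# Literal values from the disk_tier type annotation.
ALLOWED_DISK_TIERS = {"high", "medium", "low"}


def check_settings(values: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a settings mapping (hypothetical helper)."""
    problems = [f"unknown field: {k}" for k in sorted(set(values) - KNOWN_FIELDS)]
    tier = values.get("disk_tier")
    if tier is not None and tier not in ALLOWED_DISK_TIERS:
        problems.append(f"invalid disk_tier: {tier}")
    return problems


example = {
    "cpus": 4,
    "memory": 16,
    "use_spot": True,
    "disk_size": 100,
    "disk_tier": "medium",
    "idle_minutes_to_autostop": 30,
    "down": True,
}
problems = check_settings(example)
```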
orchestrators
Initialization of the Skypilot ZenML orchestrators.
Classes
SkypilotBaseOrchestrator(name: str, id: UUID, config: StackComponentConfig, flavor: str, type: StackComponentType, user: Optional[UUID], workspace: UUID, created: datetime, updated: datetime, labels: Optional[Dict[str, Any]] = None, connector_requirements: Optional[ServiceConnectorRequirements] = None, connector: Optional[UUID] = None, connector_resource_id: Optional[str] = None, *args: Any, **kwargs: Any)
Bases: ContainerizedOrchestrator
Base class for Orchestrator responsible for running pipelines remotely in a VM.
This orchestrator does not support running on a schedule.
Source code in src/zenml/stack/stack_component.py
cloud: sky.clouds.Cloud
abstractmethod
property
The type of sky cloud to use.
Returns:
Type | Description |
---|---|
Cloud | A |
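Because cloud is an abstract property, a concrete orchestrator has to supply it. The pattern can be sketched with the standard library alone; all class names below are hypothetical stand-ins, and ExampleCloud merely takes the place of a sky.clouds.Cloud object.

```python
from abc import ABC, abstractmethod


class ExampleCloud:
    """Stand-in for a sky.clouds.Cloud object (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name


class ExampleBaseOrchestrator(ABC):
    @property
    @abstractmethod
    def cloud(self) -> ExampleCloud:
        """The cloud to use; concrete subclasses must provide it."""


class ExampleConcreteOrchestrator(ExampleBaseOrchestrator):
    @property
    def cloud(self) -> ExampleCloud:
        # A real subclass would return the cloud object for its provider.
        return ExampleCloud("example-cloud")


orchestrator = ExampleConcreteOrchestrator()
```

Instantiating the base class directly fails with a TypeError, which is how the abstract property forces each flavor to pick a cloud.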
config: SkypilotBaseOrchestratorConfig
property
Returns the SkypilotBaseOrchestratorConfig config.
Returns:
Type | Description |
---|---|
SkypilotBaseOrchestratorConfig | The configuration. |
validator: Optional[StackValidator]
property
Validates the stack.
In the remote case, checks that the stack contains a container registry, image builder and only remote components.
Returns:
Type | Description |
---|---|
Optional[StackValidator] | A |
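The three checks described for the remote case can be sketched as follows. This is not the StackValidator API; ExampleComponent and validate_remote_stack are hypothetical names used only to show the logic.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class ExampleComponent:
    """Hypothetical stand-in for a stack component."""

    name: str
    is_local: bool


def validate_remote_stack(
    components: Dict[str, ExampleComponent],
) -> Tuple[bool, str]:
    """Check the conditions described above; returns (ok, message)."""
    # The stack must contain a container registry and an image builder.
    for required in ("container_registry", "image_builder"):
        if required not in components:
            return False, f"missing required component: {required}"
    # Every component in the stack must be remote.
    local = sorted(c.name for c in components.values() if c.is_local)
    if local:
        return False, f"local components not allowed: {', '.join(local)}"
    return True, ""


good_stack = {
    "container_registry": ExampleComponent("registry", False),
    "image_builder": ExampleComponent("builder", False),
    "artifact_store": ExampleComponent("remote-store", False),
}
bad_stack = dict(good_stack, artifact_store=ExampleComponent("local-store", True))
```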
get_orchestrator_run_id() -> str
Returns the active orchestrator run id.
Raises:
Type | Description |
---|---|
RuntimeError | If the environment variable specifying the run id is not set. |
Returns:
Type | Description |
---|---|
str | The orchestrator run id. |
Source code in src/zenml/integrations/skypilot/orchestrators/skypilot_base_vm_orchestrator.py
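The documented behaviour (return the run id, raise RuntimeError when the variable is missing) can be sketched as below. The environment variable name EXAMPLE_ORCHESTRATOR_RUN_ID is a placeholder chosen for this example, not the name ZenML uses.

```python
import os

# Hypothetical variable name, chosen for this example only.
EXAMPLE_RUN_ID_ENV = "EXAMPLE_ORCHESTRATOR_RUN_ID"


def get_run_id() -> str:
    """Read the run id from the environment or fail loudly."""
    try:
        return os.environ[EXAMPLE_RUN_ID_ENV]
    except KeyError:
        raise RuntimeError(
            f"Unable to read run id from environment variable {EXAMPLE_RUN_ID_ENV}."
        )
```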
prepare_environment_variable(set: bool = True) -> None
abstractmethod
Set up the environment variables that are required for the orchestrator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
set | bool | Whether to set the environment variables or not. | True |
Source code in src/zenml/integrations/skypilot/orchestrators/skypilot_base_vm_orchestrator.py
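Since the method is abstract, each concrete orchestrator decides which variables it needs. One possible shape is sketched below, assuming that set=False means the variables are removed again; the variable name and value are hypothetical.

```python
import os
from typing import Dict

# Hypothetical variables; a real subclass decides what it needs.
EXAMPLE_VARS: Dict[str, str] = {"EXAMPLE_CLOUD_PROFILE": "default"}


def prepare_environment_variable(set: bool = True) -> None:
    """Set the example variables, or remove them when set is False."""
    # The parameter name mirrors the documented signature and shadows the builtin.
    for key, value in EXAMPLE_VARS.items():
        if set:
            os.environ[key] = value
        else:
            # Assumption: set=False undoes the earlier assignment.
            os.environ.pop(key, None)
```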
prepare_or_run_pipeline(deployment: PipelineDeploymentResponse, stack: Stack, environment: Dict[str, str]) -> Any
Runs each pipeline step in a separate Skypilot container.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
deployment | PipelineDeploymentResponse | The pipeline deployment to prepare or run. | required |
stack | Stack | The stack the pipeline will run on. | required |
environment | Dict[str, str] | Environment variables to set in the orchestration environment. | required |
Raises:
Type | Description |
---|---|
Exception | If the pipeline run fails. |
RuntimeError | If the code is running in a notebook. |
Source code in src/zenml/integrations/skypilot/orchestrators/skypilot_base_vm_orchestrator.py
sanitize_cluster_name(name: str) -> str
Sanitize the value to be used in a cluster name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Arbitrary input cluster name. | required |
Returns:
Type | Description |
---|---|
str | Sanitized cluster name. |
Source code in src/zenml/integrations/skypilot/orchestrators/skypilot_base_vm_orchestrator.py
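The exact sanitization rules are not stated here. A minimal sketch, assuming the name is lowercased, every character outside a-z, 0-9 and "-" is replaced with "-", and leading or trailing hyphens are trimmed, could look like this:

```python
import re


def sanitize_cluster_name(name: str) -> str:
    """Illustrative sanitizer; the rules are assumptions, not the documented ones."""
    # Lowercase first, then map every other character to a hyphen.
    name = re.sub(r"[^a-z0-9-]", "-", name.lower())
    # Trim hyphens left over at either end.
    return name.strip("-")


sanitized = sanitize_cluster_name("My Pipeline_Run #1")
```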
setup_credentials() -> None
Set up credentials for the orchestrator.
Source code in src/zenml/integrations/skypilot/orchestrators/skypilot_base_vm_orchestrator.py
Modules
skypilot_base_vm_orchestrator
Implementation of the Skypilot base VM orchestrator.
SkypilotBaseOrchestrator(name: str, id: UUID, config: StackComponentConfig, flavor: str, type: StackComponentType, user: Optional[UUID], workspace: UUID, created: datetime, updated: datetime, labels: Optional[Dict[str, Any]] = None, connector_requirements: Optional[ServiceConnectorRequirements] = None, connector: Optional[UUID] = None, connector_resource_id: Optional[str] = None, *args: Any, **kwargs: Any)
Bases: ContainerizedOrchestrator
Base class for Orchestrator responsible for running pipelines remotely in a VM.
This orchestrator does not support running on a schedule.
The properties and methods of this class are identical to those listed for SkypilotBaseOrchestrator under orchestrators, Classes, above.
skypilot_orchestrator_entrypoint
Entrypoint of the Skypilot master/orchestrator VM.
main() -> None
Entrypoint of the Skypilot master/orchestrator VM.
This is the entrypoint of the Skypilot master/orchestrator VM. It is responsible for provisioning the VM and running the pipeline steps in separate VMs.
The VM is provisioned using the sky library. The pipeline steps are run using the sky library as well.
Raises:
Type | Description |
---|---|
TypeError | If the active stack's orchestrator is not an instance of SkypilotBaseOrchestrator. |
ValueError | If the active stack's container registry is None. |
Source code in src/zenml/integrations/skypilot/orchestrators/skypilot_orchestrator_entrypoint.py
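The two error conditions listed above amount to guard checks on the active stack before any work starts. A minimal sketch with hypothetical stand-in classes (not the ZenML types) might be:

```python
from typing import Optional


class ExampleBaseOrchestrator:
    """Stand-in for SkypilotBaseOrchestrator (hypothetical)."""


class ExampleStack:
    """Stand-in for the active stack (hypothetical)."""

    def __init__(
        self, orchestrator: object, container_registry: Optional[object]
    ) -> None:
        self.orchestrator = orchestrator
        self.container_registry = container_registry


def check_stack(stack: ExampleStack) -> None:
    """Raise the documented errors if the stack is not usable."""
    if not isinstance(stack.orchestrator, ExampleBaseOrchestrator):
        raise TypeError("The active stack's orchestrator has the wrong type.")
    if stack.container_registry is None:
        raise ValueError("The active stack has no container registry.")
```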
parse_args() -> argparse.Namespace
Parse entrypoint arguments.
Returns:
Type | Description |
---|---|
Namespace | Parsed args. |
Source code in src/zenml/integrations/skypilot/orchestrators/skypilot_orchestrator_entrypoint.py
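A parser of this kind can be sketched with argparse. The option names --run_name and --deployment_id are assumptions derived from the parameters of get_entrypoint_arguments documented in this section; the optional argv parameter is added here only to make the example easy to call.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the entrypoint arguments (illustrative option names)."""
    parser = argparse.ArgumentParser()
    # Assumed option names, derived from the parameter names in this section.
    parser.add_argument("--run_name", type=str, required=True)
    parser.add_argument("--deployment_id", type=str, required=True)
    # With argv=None, argparse falls back to sys.argv[1:].
    return parser.parse_args(argv)


args = parse_args(["--run_name", "demo-run", "--deployment_id", "1234"])
```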
skypilot_orchestrator_entrypoint_configuration
Entrypoint configuration for the Skypilot master/orchestrator VM.
SkypilotOrchestratorEntrypointConfiguration
Entrypoint configuration for the Skypilot master/orchestrator VM.
get_entrypoint_arguments(run_name: str, deployment_id: UUID) -> List[str]
classmethod
Gets all arguments that the entrypoint command should be called with.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
run_name | str | Name of the ZenML run. | required |
deployment_id | UUID | ID of the deployment. | required |
Returns:
Type | Description |
---|---|
List[str] | List of entrypoint arguments. |
Source code in src/zenml/integrations/skypilot/orchestrators/skypilot_orchestrator_entrypoint_configuration.py
get_entrypoint_command() -> List[str]
classmethod
Returns a command that runs the entrypoint module.
Returns:
Type | Description |
---|---|
List[str] | Entrypoint command. |
Source code in src/zenml/integrations/skypilot/orchestrators/skypilot_orchestrator_entrypoint_configuration.py
get_entrypoint_options() -> Set[str]
classmethod
Gets all the options required for running this entrypoint.
Returns:
Type | Description |
---|---|
Set[str] | Entrypoint options. |
Source code in src/zenml/integrations/skypilot/orchestrators/skypilot_orchestrator_entrypoint_configuration.py
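How the three class methods fit together can be sketched as follows. The class below is a hypothetical stand-in: the option names are assumed from the documented parameters, and the module path in the command is a placeholder rather than the real entrypoint module.

```python
from typing import List, Set
from uuid import UUID

# Assumed option names, matching the documented parameters.
RUN_NAME_OPTION = "run_name"
DEPLOYMENT_ID_OPTION = "deployment_id"


class ExampleEntrypointConfiguration:
    """Hypothetical mirror of the three class methods documented above."""

    @classmethod
    def get_entrypoint_options(cls) -> Set[str]:
        return {RUN_NAME_OPTION, DEPLOYMENT_ID_OPTION}

    @classmethod
    def get_entrypoint_command(cls) -> List[str]:
        # Placeholder module path; the real entrypoint module is not used here.
        return ["python", "-m", "example_package.entrypoint"]

    @classmethod
    def get_entrypoint_arguments(
        cls, run_name: str, deployment_id: UUID
    ) -> List[str]:
        return [
            f"--{RUN_NAME_OPTION}", run_name,
            f"--{DEPLOYMENT_ID_OPTION}", str(deployment_id),
        ]


deployment_id = UUID("12345678-1234-5678-1234-567812345678")
command = (
    ExampleEntrypointConfiguration.get_entrypoint_command()
    + ExampleEntrypointConfiguration.get_entrypoint_arguments(
        "demo-run", deployment_id
    )
)
```

The command and argument lists concatenate into the full invocation, and each option returned by get_entrypoint_options appears once as a flag in the arguments.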