vLLM
zenml.integrations.vllm
Initialization for the ZenML vLLM integration.
Attributes
VLLM = 'vllm' (module attribute)
VLLM_MODEL_DEPLOYER = 'vllm' (module attribute)
logger = get_logger(__name__) (module attribute)
Classes
Flavor
Class for ZenML Flavors.
Attributes
config_class: Type[StackComponentConfig] (abstract property)
Returns the StackComponentConfig config class.
Returns:
| Type | Description |
| --- | --- |
| Type[StackComponentConfig] | The config class. |
config_schema: Dict[str, Any] (property)
The config schema for a flavor.
Returns:
| Type | Description |
| --- | --- |
| Dict[str, Any] | The config schema. |
docs_url: Optional[str] (property)
A URL to point at docs explaining this flavor.
Returns:
| Type | Description |
| --- | --- |
| Optional[str] | A flavor docs URL. |
implementation_class: Type[StackComponent] (abstract property)
Implementation class for this flavor.
Returns:
| Type | Description |
| --- | --- |
| Type[StackComponent] | The implementation class for this flavor. |
logo_url: Optional[str] (property)
A URL to represent the flavor in the dashboard.
Returns:
| Type | Description |
| --- | --- |
| Optional[str] | The flavor logo. |
name: str (abstract property)
The flavor name.
Returns:
| Type | Description |
| --- | --- |
| str | The flavor name. |
sdk_docs_url: Optional[str] (property)
A URL to point at SDK docs explaining this flavor.
Returns:
| Type | Description |
| --- | --- |
| Optional[str] | A flavor SDK docs URL. |
service_connector_requirements: Optional[ServiceConnectorRequirements] (property)
Service connector resource requirements for service connectors.
Specifies resource requirements that are used to filter the available service connector types that are compatible with this flavor.
Returns:
| Type | Description |
| --- | --- |
| Optional[ServiceConnectorRequirements] | Requirements for compatible service connectors, if a service connector is required for this flavor. |
type: StackComponentType (abstract property)
Functions
from_model(flavor_model: FlavorResponse) -> Flavor (classmethod)
Loads a flavor from a model.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| flavor_model | FlavorResponse | The model to load from. | required |
Raises:
| Type | Description |
| --- | --- |
| CustomFlavorImportError | If the custom flavor can't be imported. |
| ImportError | If the flavor can't be imported. |
Returns:
| Type | Description |
| --- | --- |
| Flavor | The loaded flavor. |
Source code in src/zenml/stack/flavor.py, lines 122–157.
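For illustration, a minimal sketch of loading a flavor back from a response model; the `Client().get_flavor` lookup is an assumption here, and any `FlavorResponse` obtained from the server would do:

```python
from zenml.client import Client
from zenml.stack.flavor import Flavor

# Fetch a flavor response model from the server (the exact lookup call
# is an assumption for this sketch).
flavor_model = Client().get_flavor("vllm")

# Re-instantiate the concrete Flavor subclass described by the model.
flavor = Flavor.from_model(flavor_model)
print(flavor.name, flavor.type)
```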
generate_default_docs_url() -> str
Generate the docs URL for all built-in and integration flavors.
Note that this method is not useful for custom flavors, which do not have any docs in the main ZenML docs.
Returns:
| Type | Description |
| --- | --- |
| str | The complete URL to the ZenML documentation. |
Source code in src/zenml/stack/flavor.py, lines 206–230.
generate_default_sdk_docs_url() -> str
Generate the SDK docs URL for a flavor.
Returns:
| Type | Description |
| --- | --- |
| str | The complete URL to the ZenML SDK docs. |
Source code in src/zenml/stack/flavor.py, lines 232–260.
to_model(integration: Optional[str] = None, is_custom: bool = True) -> FlavorRequest
Converts a flavor to a model.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| integration | Optional[str] | The integration to use for the model. | None |
| is_custom | bool | Whether the flavor is a custom flavor. | True |
Returns:
| Type | Description |
| --- | --- |
| FlavorRequest | The model. |
Source code in src/zenml/stack/flavor.py, lines 159–204.
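A short, hedged example of converting a flavor into a request model, using the vLLM flavor documented further below:

```python
from zenml.integrations.vllm.flavors import VLLMModelDeployerFlavor

flavor = VLLMModelDeployerFlavor()

# Build the request model, marking it as an integration flavor rather
# than a custom one.
request = flavor.to_model(integration="vllm", is_custom=False)
print(request.name)
```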
Integration
Base class for integrations in ZenML.
Functions
activate() -> None (classmethod)
Abstract method to activate the integration.
Source code in src/zenml/integrations/integration.py, lines 170–172.
check_installation() -> bool (classmethod)
Method to check whether the required packages are installed.
Returns:
| Type | Description |
| --- | --- |
| bool | True if all required packages are installed, False otherwise. |
Source code in src/zenml/integrations/integration.py, lines 65–133.
flavors() -> List[Type[Flavor]] (classmethod)
Abstract method to declare new stack component flavors.
Returns:
| Type | Description |
| --- | --- |
| List[Type[Flavor]] | A list of new stack component flavors. |
Source code in src/zenml/integrations/integration.py, lines 174–181.
get_requirements(target_os: Optional[str] = None) -> List[str] (classmethod)
Method to get the requirements for the integration.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| target_os | Optional[str] | The target operating system to get the requirements for. | None |
Returns:
| Type | Description |
| --- | --- |
| List[str] | A list of requirements. |
Source code in src/zenml/integrations/integration.py, lines 135–145.
get_uninstall_requirements(target_os: Optional[str] = None) -> List[str] (classmethod)
Method to get the uninstall requirements for the integration.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| target_os | Optional[str] | The target operating system to get the requirements for. | None |
Returns:
| Type | Description |
| --- | --- |
| List[str] | A list of requirements. |
Source code in src/zenml/integrations/integration.py, lines 147–168.
plugin_flavors() -> List[Type[BasePluginFlavor]] (classmethod)
Abstract method to declare new plugin flavors.
Returns:
| Type | Description |
| --- | --- |
| List[Type[BasePluginFlavor]] | A list of new plugin flavors. |
Source code in src/zenml/integrations/integration.py, lines 183–190.
VLLMIntegration
Bases: Integration
Definition of vLLM integration for ZenML.
Functions
activate() -> None (classmethod)
Activates the integration.
Source code in src/zenml/integrations/vllm/__init__.py, lines 33–36.
flavors() -> List[Type[Flavor]] (classmethod)
Declare the stack component flavors for the vLLM integration.
Returns:
| Type | Description |
| --- | --- |
| List[Type[Flavor]] | List of stack component flavors for this integration. |
Source code in src/zenml/integrations/vllm/__init__.py, lines 38–47.
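After installing the integration's packages (typically via the `zenml integration install vllm` CLI command), the declared flavors can be inspected directly; a minimal sketch:

```python
from zenml.integrations.vllm import VLLMIntegration

# The vLLM integration contributes a single model deployer flavor,
# registered under the name "vllm".
for flavor_cls in VLLMIntegration.flavors():
    flavor = flavor_cls()
    print(flavor.name, flavor.type)
```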
Functions
get_logger(logger_name: str) -> logging.Logger
Get a logger with the given name.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| logger_name | str | Name of logger to initialize. | required |
Returns:
| Type | Description |
| --- | --- |
| Logger | A logger object. |
Source code in src/zenml/logger.py, lines 171–185.
Modules
flavors
vLLM integration flavors.
Classes
VLLMModelDeployerConfig(warn_about_plain_text_secrets: bool = False, **kwargs: Any)
Bases: BaseModelDeployerConfig
Configuration for the vLLM Inference model deployer.
Source code in src/zenml/stack/stack_component.py, lines 61–122.
VLLMModelDeployerFlavor
Bases: BaseModelDeployerFlavor
vLLM model deployer flavor.
config_class: Type[VLLMModelDeployerConfig] (property)
Returns the VLLMModelDeployerConfig config class.
Returns:
| Type | Description |
| --- | --- |
| Type[VLLMModelDeployerConfig] | The config class. |
docs_url: Optional[str] (property)
A URL to point at docs explaining this flavor.
Returns:
| Type | Description |
| --- | --- |
| Optional[str] | A flavor docs URL. |
implementation_class: Type[VLLMModelDeployer] (property)
Implementation class for this flavor.
Returns:
| Type | Description |
| --- | --- |
| Type[VLLMModelDeployer] | The implementation class. |
logo_url: str (property)
A URL to represent the flavor in the dashboard.
Returns:
| Type | Description |
| --- | --- |
| str | The flavor logo. |
name: str (property)
Name of the flavor.
Returns:
| Type | Description |
| --- | --- |
| str | The name of the flavor. |
sdk_docs_url: Optional[str] (property)
A URL to point at SDK docs explaining this flavor.
Returns:
| Type | Description |
| --- | --- |
| Optional[str] | A flavor SDK docs URL. |
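The flavor name is what identifies the component type to the ZenML CLI, e.g. when registering a deployer with `zenml model-deployer register <name> --flavor=vllm`; a small sketch of inspecting the flavor's properties:

```python
from zenml.integrations.vllm.flavors import VLLMModelDeployerFlavor

flavor = VLLMModelDeployerFlavor()
print(flavor.name)          # "vllm"
print(flavor.config_class)  # VLLMModelDeployerConfig
print(flavor.docs_url)      # link into the ZenML docs
```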
Modules
vllm_model_deployer_flavor
vLLM model deployer flavor.
Contains VLLMModelDeployerConfig and VLLMModelDeployerFlavor, documented under flavors above.
model_deployers
Initialization of the vLLM model deployers.
Classes
VLLMModelDeployer(name: str, id: UUID, config: StackComponentConfig, flavor: str, type: StackComponentType, user: Optional[UUID], created: datetime, updated: datetime, labels: Optional[Dict[str, Any]] = None, connector_requirements: Optional[ServiceConnectorRequirements] = None, connector: Optional[UUID] = None, connector_resource_id: Optional[str] = None, *args: Any, **kwargs: Any)
Bases: BaseModelDeployer
vLLM Inference Server.
Source code in src/zenml/stack/stack_component.py, lines 328–385.
config: VLLMModelDeployerConfig (property)
Returns the VLLMModelDeployerConfig config.
Returns:
| Type | Description |
| --- | --- |
| VLLMModelDeployerConfig | The configuration. |
local_path: str (property)
Returns the path to the root directory.
This is where all configurations for vLLM deployment daemon processes are stored.
If the service path is not set in the config by the user, the path is set to a local default path according to the component ID.
Returns:
| Type | Description |
| --- | --- |
| str | The path to the local service root directory. |
get_model_server_info(service_instance: VLLMDeploymentService) -> Dict[str, Optional[str]] (staticmethod)
Return implementation-specific information on the model server.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| service_instance | VLLMDeploymentService | vLLM deployment service object | required |
Returns:
| Type | Description |
| --- | --- |
| Dict[str, Optional[str]] | A dictionary containing the model server information. |
Source code in src/zenml/integrations/vllm/model_deployers/vllm_model_deployer.py, lines 100–117.
get_service_path(id_: UUID) -> str (staticmethod)
Get the path where local vLLM service information is stored.
This is where the deployment service configuration, PID and log files are stored.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| id_ | UUID | The ID of the vLLM model deployer. | required |
Returns:
| Type | Description |
| --- | --- |
| str | The service path. |
Source code in src/zenml/integrations/vllm/model_deployers/vllm_model_deployer.py, lines 56–74.
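A minimal sketch of resolving the service path; the UUID below is a hypothetical placeholder for a real model deployer component ID:

```python
from uuid import UUID
from zenml.integrations.vllm.model_deployers import VLLMModelDeployer

# Hypothetical component ID - substitute your deployer's actual UUID.
deployer_id = UUID("12345678-1234-5678-1234-567812345678")

# Where this deployer keeps service configs, PID and log files.
print(VLLMModelDeployer.get_service_path(deployer_id))
```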
perform_delete_model(service: BaseService, timeout: int = DEFAULT_SERVICE_START_STOP_TIMEOUT, force: bool = False) -> None
Method to delete all configuration of a model server.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| service | BaseService | The service to delete. | required |
| timeout | int | Timeout in seconds to wait for the service to stop. | DEFAULT_SERVICE_START_STOP_TIMEOUT |
| force | bool | If True, force the service to stop. | False |
Source code in src/zenml/integrations/vllm/model_deployers/vllm_model_deployer.py, lines 247–263.
perform_deploy_model(id: UUID, config: ServiceConfig, timeout: int = DEFAULT_SERVICE_START_STOP_TIMEOUT) -> BaseService
Create a new vLLM deployment service or update an existing one.
This should serve the supplied model and deployment configuration.
This method has two modes of operation, depending on the value of the replace argument:
- If replace is False, calling this method will create a new vLLM deployment server to reflect the model and other configuration parameters specified in the supplied vLLM service config.
- If replace is True, this method will first attempt to find an existing vLLM deployment service that is equivalent to the supplied configuration parameters. Two or more vLLM deployment services are considered equivalent if they have the same pipeline_name, pipeline_step_name and model_name configuration parameters. To put it differently, two vLLM deployment services are equivalent if they serve versions of the same model deployed by the same pipeline step. If an equivalent vLLM deployment is found, it will be updated in place to reflect the new configuration parameters.
Callers should set replace to True if they want a continuous model deployment workflow that doesn't spin up a new vLLM deployment server for each new model version. If multiple equivalent vLLM deployment servers are found, one is selected at random to be updated and the others are deleted.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| id | UUID | The UUID of the vLLM model deployer. | required |
| config | ServiceConfig | The configuration of the model to be deployed with vLLM. | required |
| timeout | int | The timeout in seconds to wait for the vLLM server to be provisioned and successfully started or updated. If set to 0, the method will return immediately after the vLLM server is provisioned, without waiting for it to fully start. | DEFAULT_SERVICE_START_STOP_TIMEOUT |
Returns:
| Type | Description |
| --- | --- |
| BaseService | The ZenML vLLM deployment service object that can be used to interact with the vLLM model HTTP server. |
Source code in src/zenml/integrations/vllm/model_deployers/vllm_model_deployer.py, lines 119–170.
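A hedged sketch of calling this hook directly; the `model` and `port` fields assumed on `VLLMServiceConfig` are illustrative only, and in a real pipeline the base deployer's wrapper method would normally generate the service UUID:

```python
from uuid import uuid4
from zenml.client import Client
from zenml.integrations.vllm.services.vllm_deployment import VLLMServiceConfig

# The model/port field names are assumptions about the config schema;
# check VLLMServiceConfig for the exact fields.
config = VLLMServiceConfig(model="facebook/opt-125m", port=8000)

# Fetch the registered vLLM model deployer from the active stack.
deployer = Client().active_stack.model_deployer

# Deploy (or update) a server and wait up to 5 minutes for it to start.
service = deployer.perform_deploy_model(id=uuid4(), config=config, timeout=300)
print(service.prediction_url)
```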
perform_start_model(service: BaseService, timeout: int = DEFAULT_SERVICE_START_STOP_TIMEOUT) -> BaseService
Method to start a model server.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| service | BaseService | The service to start. | required |
| timeout | int | Timeout in seconds to wait for the service to start. | DEFAULT_SERVICE_START_STOP_TIMEOUT |
Returns:
| Type | Description |
| --- | --- |
| BaseService | The started service. |
Source code in src/zenml/integrations/vllm/model_deployers/vllm_model_deployer.py, lines 230–245.
perform_stop_model(service: BaseService, timeout: int = DEFAULT_SERVICE_START_STOP_TIMEOUT, force: bool = False) -> BaseService
Method to stop a model server.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| service | BaseService | The service to stop. | required |
| timeout | int | Timeout in seconds to wait for the service to stop. | DEFAULT_SERVICE_START_STOP_TIMEOUT |
| force | bool | If True, force the service to stop. | False |
Returns:
| Type | Description |
| --- | --- |
| BaseService | The stopped service. |
Source code in src/zenml/integrations/vllm/model_deployers/vllm_model_deployer.py, lines 211–228.
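Continuing the deployment sketch above, the lifecycle hooks pair up as follows (`deployer` and `service` are the objects from that sketch):

```python
# Stop the daemon but keep its configuration on disk ...
service = deployer.perform_stop_model(service, timeout=60)

# ... bring it back up ...
service = deployer.perform_start_model(service, timeout=60)

# ... or tear it down entirely, forcing the stop if it hangs.
deployer.perform_delete_model(service, timeout=60, force=True)
```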
Modules
vllm_model_deployer
Implementation of the vLLM Model Deployer.
Contains VLLMModelDeployer, documented under model_deployers above.
services
Initialization of the vLLM Inference Server.
Modules
vllm_deployment
Implementation of the vLLM Inference Server Service.
VLLMDeploymentEndpoint(*args: Any, **kwargs: Any)
Bases: LocalDaemonServiceEndpoint
A service endpoint exposed by the vLLM deployment daemon.
Attributes:
| Name | Type | Description |
| --- | --- | --- |
| config | VLLMDeploymentEndpointConfig | service endpoint configuration |
Source code in src/zenml/services/service_endpoint.py, lines 111–123.
prediction_url: Optional[str] (property)
Gets the prediction URL for the endpoint.
Returns:
| Type | Description |
| --- | --- |
| Optional[str] | The prediction URL for the endpoint. |
VLLMDeploymentEndpointConfig
Bases: LocalDaemonServiceEndpointConfig
vLLM deployment service endpoint configuration.
Attributes:
| Name | Type | Description |
| --- | --- | --- |
| prediction_url_path | str | URI subpath for prediction requests |
VLLMDeploymentService(config: VLLMServiceConfig, **attrs: Any)
Bases: LocalDaemonService, BaseDeploymentService
vLLM Inference Server Deployment Service.
Initialize the vLLM deployment service.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config | VLLMServiceConfig | service configuration | required |
| attrs | Any | additional attributes to set on the service | {} |
Source code in src/zenml/integrations/vllm/services/vllm_deployment.py, lines 107–129.
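A minimal sketch of constructing and starting the service directly; as before, the `model` and `port` fields on the config are assumptions for illustration:

```python
from zenml.integrations.vllm.services.vllm_deployment import (
    VLLMDeploymentService,
    VLLMServiceConfig,
)

# Field names here are illustrative assumptions about the config schema.
config = VLLMServiceConfig(model="facebook/opt-125m", port=8000)
service = VLLMDeploymentService(config=config)

# Provision the local daemon and wait for it to become healthy.
service.start(timeout=300)
print(service.prediction_url)
```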
prediction_url: Optional[str] (property)
Gets the prediction URL for the endpoint.
Returns:
| Type | Description |
| --- | --- |
| Optional[str] | The prediction URL for the endpoint. |
predict(data: Any) -> Any
Make a prediction using the service.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | Any | data to make a prediction on | required |
Returns:
| Type | Description |
| --- | --- |
| Any | The prediction result. |
Raises:
| Type | Description |
| --- | --- |
| Exception | If the service is not running. |
| ValueError | If the prediction endpoint is unknown. |
Source code in src/zenml/integrations/vllm/services/vllm_deployment.py, lines 180–211.
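A hedged usage sketch: the payload shape is an assumption; since vLLM serves an OpenAI-compatible completions API, a prompt string (or a list of prompts) is the likely input:

```python
# `service` is a running VLLMDeploymentService from the sketches above.
result = service.predict("What is the capital of France?")
print(result)
```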
run() -> None
Start the service.
Source code in src/zenml/integrations/vllm/services/vllm_deployment.py, lines 131–167.
VLLMServiceConfig(**data: Any)
Bases: LocalDaemonServiceConfig
vLLM service configurations.
Source code in src/zenml/services/service.py, lines 131–147.