[Doc] Convert list tables to MyST (#11594)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
parent 4fb8e329fd
commit 32b4c63f02
@@ -197,4 +197,4 @@ if __name__ == '__main__':
|
|||||||
## Known Issues
|
## Known Issues
|
||||||
|
|
||||||
- In `v0.5.2`, `v0.5.3`, and `v0.5.3.post1`, there is a bug caused by [zmq](https://github.com/zeromq/pyzmq/issues/2000) , which can occasionally cause vLLM to hang depending on the machine configuration. The solution is to upgrade to the latest version of `vllm` to include the [fix](gh-pr:6759).
|
- In `v0.5.2`, `v0.5.3`, and `v0.5.3.post1`, there is a bug caused by [zmq](https://github.com/zeromq/pyzmq/issues/2000) , which can occasionally cause vLLM to hang depending on the machine configuration. The solution is to upgrade to the latest version of `vllm` to include the [fix](gh-pr:6759).
|
||||||
- To circumvent a NCCL [bug](https://github.com/NVIDIA/nccl/issues/1234) , all vLLM processes will set an environment variable ``NCCL_CUMEM_ENABLE=0`` to disable NCCL's ``cuMem`` allocator. It does not affect performance but only gives memory benefits. When external processes want to set up a NCCL connection with vLLM's processes, they should also set this environment variable, otherwise, inconsistent environment setup will cause NCCL to hang or crash, as observed in the [RLHF integration](https://github.com/OpenRLHF/OpenRLHF/pull/604) and the [discussion](gh-issue:5723#issuecomment-2554389656) .
|
- To circumvent a NCCL [bug](https://github.com/NVIDIA/nccl/issues/1234) , all vLLM processes will set an environment variable `NCCL_CUMEM_ENABLE=0` to disable NCCL's `cuMem` allocator. It does not affect performance but only gives memory benefits. When external processes want to set up a NCCL connection with vLLM's processes, they should also set this environment variable, otherwise, inconsistent environment setup will cause NCCL to hang or crash, as observed in the [RLHF integration](https://github.com/OpenRLHF/OpenRLHF/pull/604) and the [discussion](gh-issue:5723#issuecomment-2554389656) .
|
||||||
|
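As a minimal sketch of the NCCL note in this hunk: an external process that joins a NCCL group with vLLM's processes can be started with the same `NCCL_CUMEM_ENABLE=0` setting. The helper names `env_with_cumem_disabled` and `launch_peer` are hypothetical and are not part of vLLM or NCCL; only the environment variable and its value come from the text.

```python
import os
import subprocess


def env_with_cumem_disabled(base_env=None):
    """Return a copy of the environment with NCCL_CUMEM_ENABLE=0 set.

    Hypothetical helper: it only mirrors the setting that, per the note
    above, vLLM processes apply to themselves.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_CUMEM_ENABLE"] = "0"
    return env


def launch_peer(argv):
    """Start an external process (for example a trainer) with the same NCCL setting."""
    return subprocess.Popen(argv, env=env_with_cumem_disabled())
```

The variable has to be in place before the external process initializes NCCL, which is why the sketch sets it at launch time rather than after startup.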
@@ -141,26 +141,25 @@ Gaudi2 devices. Configurations that are not listed may or may not work.
|
|||||||
|
|
||||||
Currently in vLLM for HPU we support four execution modes, depending on selected HPU PyTorch Bridge backend (via `PT_HPU_LAZY_MODE` environment variable), and `--enforce-eager` flag.
|
Currently in vLLM for HPU we support four execution modes, depending on selected HPU PyTorch Bridge backend (via `PT_HPU_LAZY_MODE` environment variable), and `--enforce-eager` flag.
|
||||||
|
|
||||||
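The four combinations listed in the `vLLM execution modes` table of this hunk can be sketched as a small lookup. The function name `hpu_execution_mode` is hypothetical and is not a vLLM API; the mode names and the 0/1 values are taken from the table.

```python
# (PT_HPU_LAZY_MODE, enforce_eager) -> execution mode, as listed in the table.
_HPU_MODES = {
    (0, 0): "torch.compile",
    (0, 1): "PyTorch eager mode",
    (1, 0): "HPU Graphs",
    (1, 1): "PyTorch lazy mode",
}


def hpu_execution_mode(pt_hpu_lazy_mode, enforce_eager):
    """Look up the execution mode for the two settings (each 0 or 1)."""
    key = (int(pt_hpu_lazy_mode), int(enforce_eager))
    if key not in _HPU_MODES:
        raise ValueError(f"unsupported combination: {key}")
    return _HPU_MODES[key]
```

For example, `hpu_execution_mode(1, 0)` corresponds to the row where lazy mode is on and `--enforce-eager` is not passed.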
```{eval-rst}
|
```{list-table} vLLM execution modes
|
||||||
.. list-table:: vLLM execution modes
|
:widths: 25 25 50
|
||||||
:widths: 25 25 50
|
:header-rows: 1
|
||||||
:header-rows: 1
|
|
||||||
|
|
||||||
* - ``PT_HPU_LAZY_MODE``
|
* - `PT_HPU_LAZY_MODE`
|
||||||
- ``enforce_eager``
|
- `enforce_eager`
|
||||||
- execution mode
|
- execution mode
|
||||||
* - 0
|
* - 0
|
||||||
- 0
|
- 0
|
||||||
- torch.compile
|
- torch.compile
|
||||||
* - 0
|
* - 0
|
||||||
- 1
|
- 1
|
||||||
- PyTorch eager mode
|
- PyTorch eager mode
|
||||||
* - 1
|
* - 1
|
||||||
- 0
|
- 0
|
||||||
- HPU Graphs
|
- HPU Graphs
|
||||||
* - 1
|
* - 1
|
||||||
- 1
|
- 1
|
||||||
- PyTorch lazy mode
|
- PyTorch lazy mode
|
||||||
```
|
```
|
||||||
|
|
||||||
```{warning}
|
```{warning}
|
||||||
|
@@ -68,33 +68,32 @@ gcloud alpha compute tpus queued-resources create QUEUED_RESOURCE_ID \
|
|||||||
--service-account SERVICE_ACCOUNT
|
--service-account SERVICE_ACCOUNT
|
||||||
```
|
```
|
||||||
|
|
||||||
```{eval-rst}
|
```{list-table} Parameter descriptions
|
||||||
.. list-table:: Parameter descriptions
|
:header-rows: 1
|
||||||
:header-rows: 1
|
|
||||||
|
|
||||||
* - Parameter name
|
* - Parameter name
|
||||||
- Description
|
- Description
|
||||||
* - QUEUED_RESOURCE_ID
|
* - QUEUED_RESOURCE_ID
|
||||||
- The user-assigned ID of the queued resource request.
|
- The user-assigned ID of the queued resource request.
|
||||||
* - TPU_NAME
|
* - TPU_NAME
|
||||||
- The user-assigned name of the TPU which is created when the queued
|
- The user-assigned name of the TPU which is created when the queued
|
||||||
resource request is allocated.
|
resource request is allocated.
|
||||||
* - PROJECT_ID
|
* - PROJECT_ID
|
||||||
- Your Google Cloud project
|
- Your Google Cloud project
|
||||||
* - ZONE
|
* - ZONE
|
||||||
- The GCP zone where you want to create your Cloud TPU. The value you use
|
- The GCP zone where you want to create your Cloud TPU. The value you use
|
||||||
depends on the version of TPUs you are using. For more information, see
|
depends on the version of TPUs you are using. For more information, see
|
||||||
`TPU regions and zones <https://cloud.google.com/tpu/docs/regions-zones>`_
|
`TPU regions and zones <https://cloud.google.com/tpu/docs/regions-zones>`_
|
||||||
* - ACCELERATOR_TYPE
|
* - ACCELERATOR_TYPE
|
||||||
- The TPU version you want to use. Specify the TPU version, for example
|
- The TPU version you want to use. Specify the TPU version, for example
|
||||||
`v5litepod-4` specifies a v5e TPU with 4 cores. For more information,
|
`v5litepod-4` specifies a v5e TPU with 4 cores. For more information,
|
||||||
see `TPU versions <https://cloud.devsite.corp.google.com/tpu/docs/system-architecture-tpu-vm#versions>`_.
|
see `TPU versions <https://cloud.devsite.corp.google.com/tpu/docs/system-architecture-tpu-vm#versions>`_.
|
||||||
* - RUNTIME_VERSION
|
* - RUNTIME_VERSION
|
||||||
- The TPU VM runtime version to use. For more information see `TPU VM images <https://cloud.google.com/tpu/docs/runtimes>`_.
|
- The TPU VM runtime version to use. For more information see `TPU VM images <https://cloud.google.com/tpu/docs/runtimes>`_.
|
||||||
* - SERVICE_ACCOUNT
|
* - SERVICE_ACCOUNT
|
||||||
- The email address for your service account. You can find it in the IAM
|
- The email address for your service account. You can find it in the IAM
|
||||||
Cloud Console under *Service Accounts*. For example:
|
Cloud Console under *Service Accounts*. For example:
|
||||||
`tpu-service-account@<your_project_ID>.iam.gserviceaccount.com`
|
`tpu-service-account@<your_project_ID>.iam.gserviceaccount.com`
|
||||||
```
|
```
|
||||||
|
|
||||||
Connect to your TPU using SSH:
|
Connect to your TPU using SSH:
|
||||||
|
File diff suppressed because it is too large
@@ -4,121 +4,120 @@
|
|||||||
|
|
||||||
The table below shows the compatibility of various quantization implementations with different hardware platforms in vLLM:
|
The table below shows the compatibility of various quantization implementations with different hardware platforms in vLLM:
|
||||||
|
|
||||||
```{eval-rst}
|
```{list-table}
|
||||||
.. list-table::
|
:header-rows: 1
|
||||||
:header-rows: 1
|
:widths: 20 8 8 8 8 8 8 8 8 8 8
|
||||||
:widths: 20 8 8 8 8 8 8 8 8 8 8
|
|
||||||
|
|
||||||
* - Implementation
|
* - Implementation
|
||||||
- Volta
|
- Volta
|
||||||
- Turing
|
- Turing
|
||||||
- Ampere
|
- Ampere
|
||||||
- Ada
|
- Ada
|
||||||
- Hopper
|
- Hopper
|
||||||
- AMD GPU
|
- AMD GPU
|
||||||
- Intel GPU
|
- Intel GPU
|
||||||
- x86 CPU
|
- x86 CPU
|
||||||
- AWS Inferentia
|
- AWS Inferentia
|
||||||
- Google TPU
|
- Google TPU
|
||||||
* - AWQ
|
* - AWQ
|
||||||
- ✗
|
- ✗
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
* - GPTQ
|
* - GPTQ
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
* - Marlin (GPTQ/AWQ/FP8)
|
* - Marlin (GPTQ/AWQ/FP8)
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
* - INT8 (W8A8)
|
* - INT8 (W8A8)
|
||||||
- ✗
|
- ✗
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
* - FP8 (W8A8)
|
* - FP8 (W8A8)
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
* - AQLM
|
* - AQLM
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
* - bitsandbytes
|
* - bitsandbytes
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
* - DeepSpeedFP
|
* - DeepSpeedFP
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
* - GGUF
|
* - GGUF
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
- ✗
|
- ✗
|
||||||
```
|
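For readers who want to query the compatibility table programmatically, here is a minimal sketch that transcribes its rows by hand into sets of supported platforms. The names `SUPPORTED` and `is_supported` are hypothetical and are not part of vLLM; the data is a snapshot of the table as shown in this diff and may be out of date for other versions.

```python
NVIDIA_ALL = {"Volta", "Turing", "Ampere", "Ada", "Hopper"}

# Platforms marked as supported for each implementation in the table above.
SUPPORTED = {
    "AWQ": {"Turing", "Ampere", "Ada", "Hopper", "Intel GPU", "x86 CPU"},
    "GPTQ": NVIDIA_ALL | {"Intel GPU", "x86 CPU"},
    "Marlin (GPTQ/AWQ/FP8)": {"Ampere", "Ada", "Hopper"},
    "INT8 (W8A8)": {"Turing", "Ampere", "Ada", "Hopper", "x86 CPU"},
    "FP8 (W8A8)": {"Ada", "Hopper", "AMD GPU"},
    "AQLM": set(NVIDIA_ALL),
    "bitsandbytes": set(NVIDIA_ALL),
    "DeepSpeedFP": set(NVIDIA_ALL),
    "GGUF": set(NVIDIA_ALL),
}


def is_supported(implementation, platform):
    """Return True if the table marks the implementation as supported on the platform."""
    if implementation not in SUPPORTED:
        raise KeyError(f"unknown implementation: {implementation}")
    return platform in SUPPORTED[implementation]
```

Storing only the supported platforms keeps the transcription short; any platform not in a set corresponds to a ✗ cell in the table.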
```
|
||||||
|
|
||||||
## Notes:
|
## Notes:
|
||||||
|
@@ -43,209 +43,208 @@ chart **including persistent volumes** and deletes the release.
|
|||||||
|
|
||||||
## Values
|
## Values
|
||||||
|
|
||||||
```{eval-rst}
|
```{list-table}
|
||||||
.. list-table:: Values
|
:widths: 25 25 25 25
|
||||||
:widths: 25 25 25 25
|
:header-rows: 1
|
||||||
:header-rows: 1
|
|
||||||
|
|
||||||
* - Key
|
* - Key
|
||||||
- Type
|
- Type
|
||||||
- Default
|
- Default
|
||||||
- Description
|
- Description
|
||||||
* - autoscaling
|
* - autoscaling
|
||||||
- object
|
- object
|
||||||
- {"enabled":false,"maxReplicas":100,"minReplicas":1,"targetCPUUtilizationPercentage":80}
|
- {"enabled":false,"maxReplicas":100,"minReplicas":1,"targetCPUUtilizationPercentage":80}
|
||||||
- Autoscaling configuration
|
- Autoscaling configuration
|
||||||
* - autoscaling.enabled
|
* - autoscaling.enabled
|
||||||
- bool
|
- bool
|
||||||
- false
|
- false
|
||||||
- Enable autoscaling
|
- Enable autoscaling
|
||||||
* - autoscaling.maxReplicas
|
* - autoscaling.maxReplicas
|
||||||
- int
|
- int
|
||||||
- 100
|
- 100
|
||||||
- Maximum replicas
|
- Maximum replicas
|
||||||
* - autoscaling.minReplicas
|
* - autoscaling.minReplicas
|
||||||
- int
|
- int
|
||||||
- 1
|
- 1
|
||||||
- Minimum replicas
|
- Minimum replicas
|
||||||
* - autoscaling.targetCPUUtilizationPercentage
|
* - autoscaling.targetCPUUtilizationPercentage
|
||||||
- int
|
- int
|
||||||
- 80
|
- 80
|
||||||
- Target CPU utilization for autoscaling
|
- Target CPU utilization for autoscaling
|
||||||
* - configs
|
* - configs
|
||||||
- object
|
- object
|
||||||
- {}
|
- {}
|
||||||
- Configmap
|
- Configmap
|
||||||
* - containerPort
|
* - containerPort
|
||||||
- int
|
- int
|
||||||
- 8000
|
- 8000
|
||||||
- Container port
|
- Container port
|
||||||
* - customObjects
|
* - customObjects
|
||||||
- list
|
- list
|
||||||
- []
|
- []
|
||||||
- Custom Objects configuration
|
- Custom Objects configuration
|
||||||
* - deploymentStrategy
|
* - deploymentStrategy
|
||||||
- object
|
- object
|
||||||
- {}
|
- {}
|
||||||
- Deployment strategy configuration
|
- Deployment strategy configuration
|
||||||
* - externalConfigs
|
* - externalConfigs
|
||||||
- list
|
- list
|
||||||
- []
|
- []
|
||||||
- External configuration
|
- External configuration
|
||||||
* - extraContainers
|
* - extraContainers
|
||||||
- list
|
- list
|
||||||
- []
|
- []
|
||||||
- Additional containers configuration
|
- Additional containers configuration
|
||||||
* - extraInit
|
* - extraInit
|
||||||
- object
|
- object
|
||||||
- {"pvcStorage":"1Gi","s3modelpath":"relative_s3_model_path/opt-125m", "awsEc2MetadataDisabled": true}
|
- {"pvcStorage":"1Gi","s3modelpath":"relative_s3_model_path/opt-125m", "awsEc2MetadataDisabled": true}
|
||||||
- Additional configuration for the init container
|
- Additional configuration for the init container
|
||||||
* - extraInit.pvcStorage
|
* - extraInit.pvcStorage
|
||||||
- string
|
- string
|
||||||
- "50Gi"
|
- "50Gi"
|
||||||
- Storage size of the s3
|
- Storage size of the s3
|
||||||
* - extraInit.s3modelpath
|
* - extraInit.s3modelpath
|
||||||
- string
|
- string
|
||||||
- "relative_s3_model_path/opt-125m"
|
- "relative_s3_model_path/opt-125m"
|
||||||
- Path of the model on the s3 which hosts model weights and config files
|
- Path of the model on the s3 which hosts model weights and config files
|
||||||
* - extraInit.awsEc2MetadataDisabled
|
* - extraInit.awsEc2MetadataDisabled
|
||||||
- boolean
|
- boolean
|
||||||
- true
|
- true
|
||||||
- Disables the use of the Amazon EC2 instance metadata service
|
- Disables the use of the Amazon EC2 instance metadata service
|
||||||
* - extraPorts
|
* - extraPorts
|
||||||
- list
|
- list
|
||||||
- []
|
- []
|
||||||
- Additional ports configuration
|
- Additional ports configuration
|
||||||
* - gpuModels
|
* - gpuModels
|
||||||
- list
|
- list
|
||||||
- ["TYPE_GPU_USED"]
|
- ["TYPE_GPU_USED"]
|
||||||
- Type of gpu used
|
- Type of gpu used
|
||||||
* - image
|
* - image
|
||||||
- object
|
- object
|
||||||
- {"command":["vllm","serve","/data/","--served-model-name","opt-125m","--host","0.0.0.0","--port","8000"],"repository":"vllm/vllm-openai","tag":"latest"}
|
- {"command":["vllm","serve","/data/","--served-model-name","opt-125m","--host","0.0.0.0","--port","8000"],"repository":"vllm/vllm-openai","tag":"latest"}
|
||||||
- Image configuration
|
- Image configuration
|
||||||
* - image.command
|
* - image.command
|
||||||
- list
|
- list
|
||||||
- ["vllm","serve","/data/","--served-model-name","opt-125m","--host","0.0.0.0","--port","8000"]
|
- ["vllm","serve","/data/","--served-model-name","opt-125m","--host","0.0.0.0","--port","8000"]
|
||||||
- Container launch command
|
- Container launch command
|
||||||
* - image.repository
|
* - image.repository
|
||||||
- string
|
- string
|
||||||
- "vllm/vllm-openai"
|
- "vllm/vllm-openai"
|
||||||
- Image repository
|
- Image repository
|
||||||
* - image.tag
|
* - image.tag
|
||||||
- string
|
- string
|
||||||
- "latest"
|
- "latest"
|
||||||
- Image tag
|
- Image tag
|
||||||
* - livenessProbe
|
* - livenessProbe
|
||||||
- object
|
- object
|
||||||
- {"failureThreshold":3,"httpGet":{"path":"/health","port":8000},"initialDelaySeconds":15,"periodSeconds":10}
|
- {"failureThreshold":3,"httpGet":{"path":"/health","port":8000},"initialDelaySeconds":15,"periodSeconds":10}
|
||||||
- Liveness probe configuration
|
- Liveness probe configuration
|
||||||
* - livenessProbe.failureThreshold
|
* - livenessProbe.failureThreshold
|
||||||
- int
|
- int
|
||||||
- 3
|
- 3
|
||||||
- Number of times after which if a probe fails in a row, Kubernetes considers that the overall check has failed: the container is not alive
|
- Number of times after which if a probe fails in a row, Kubernetes considers that the overall check has failed: the container is not alive
|
||||||
* - livenessProbe.httpGet
|
* - livenessProbe.httpGet
|
||||||
- object
|
- object
|
||||||
- {"path":"/health","port":8000}
|
- {"path":"/health","port":8000}
|
||||||
- Configuration of the Kubelet http request on the server
|
- Configuration of the Kubelet http request on the server
|
||||||
* - livenessProbe.httpGet.path
|
* - livenessProbe.httpGet.path
|
||||||
- string
|
- string
|
||||||
- "/health"
|
- "/health"
|
||||||
- Path to access on the HTTP server
|
- Path to access on the HTTP server
|
||||||
* - livenessProbe.httpGet.port
|
* - livenessProbe.httpGet.port
|
||||||
- int
|
- int
|
||||||
- 8000
|
- 8000
|
||||||
- Name or number of the port to access on the container, on which the server is listening
|
- Name or number of the port to access on the container, on which the server is listening
|
||||||
* - livenessProbe.initialDelaySeconds
|
* - livenessProbe.initialDelaySeconds
|
||||||
- int
|
- int
|
||||||
- 15
|
- 15
|
||||||
- Number of seconds after the container has started before liveness probe is initiated
|
- Number of seconds after the container has started before liveness probe is initiated
|
||||||
* - livenessProbe.periodSeconds
|
* - livenessProbe.periodSeconds
|
||||||
- int
|
- int
|
||||||
- 10
|
- 10
|
||||||
- How often (in seconds) to perform the liveness probe
|
- How often (in seconds) to perform the liveness probe
|
||||||
* - maxUnavailablePodDisruptionBudget
|
* - maxUnavailablePodDisruptionBudget
|
||||||
- string
|
- string
|
||||||
- ""
|
- ""
|
||||||
- Disruption Budget Configuration
|
- Disruption Budget Configuration
|
||||||
* - readinessProbe
|
* - readinessProbe
|
||||||
- object
|
- object
|
||||||
- {"failureThreshold":3,"httpGet":{"path":"/health","port":8000},"initialDelaySeconds":5,"periodSeconds":5}
|
- {"failureThreshold":3,"httpGet":{"path":"/health","port":8000},"initialDelaySeconds":5,"periodSeconds":5}
|
||||||
- Readiness probe configuration
|
- Readiness probe configuration
|
||||||
* - readinessProbe.failureThreshold
|
* - readinessProbe.failureThreshold
|
||||||
- int
|
- int
|
||||||
- 3
|
- 3
|
||||||
- Number of times after which if a probe fails in a row, Kubernetes considers that the overall check has failed: the container is not ready
|
- Number of times after which if a probe fails in a row, Kubernetes considers that the overall check has failed: the container is not ready
|
||||||
* - readinessProbe.httpGet
|
* - readinessProbe.httpGet
|
||||||
- object
|
- object
|
||||||
- {"path":"/health","port":8000}
|
- {"path":"/health","port":8000}
|
||||||
- Configuration of the Kubelet http request on the server
|
- Configuration of the Kubelet http request on the server
|
||||||
* - readinessProbe.httpGet.path
|
* - readinessProbe.httpGet.path
|
||||||
- string
|
- string
|
||||||
- "/health"
|
- "/health"
|
||||||
- Path to access on the HTTP server
|
- Path to access on the HTTP server
|
||||||
* - readinessProbe.httpGet.port
|
* - readinessProbe.httpGet.port
|
||||||
- int
|
- int
|
||||||
- 8000
|
- 8000
|
||||||
- Name or number of the port to access on the container, on which the server is listening
|
- Name or number of the port to access on the container, on which the server is listening
|
||||||
* - readinessProbe.initialDelaySeconds
|
* - readinessProbe.initialDelaySeconds
|
||||||
- int
|
- int
|
||||||
- 5
|
- 5
|
||||||
- Number of seconds after the container has started before readiness probe is initiated
|
- Number of seconds after the container has started before readiness probe is initiated
|
||||||
* - readinessProbe.periodSeconds
|
* - readinessProbe.periodSeconds
|
||||||
- int
|
- int
|
||||||
- 5
|
- 5
|
||||||
- How often (in seconds) to perform the readiness probe
|
- How often (in seconds) to perform the readiness probe
|
||||||
* - replicaCount
|
* - replicaCount
|
||||||
- int
|
- int
|
||||||
- 1
|
- 1
|
||||||
- Number of replicas
|
- Number of replicas
|
||||||
* - resources
|
* - resources
|
||||||
- object
|
- object
|
||||||
- {"limits":{"cpu":4,"memory":"16Gi","nvidia.com/gpu":1},"requests":{"cpu":4,"memory":"16Gi","nvidia.com/gpu":1}}
|
- {"limits":{"cpu":4,"memory":"16Gi","nvidia.com/gpu":1},"requests":{"cpu":4,"memory":"16Gi","nvidia.com/gpu":1}}
|
||||||
- Resource configuration
|
- Resource configuration
|
||||||
* - resources.limits."nvidia.com/gpu"
|
* - resources.limits."nvidia.com/gpu"
|
||||||
- int
|
- int
|
||||||
- 1
|
- 1
|
||||||
- Number of gpus used
|
- Number of gpus used
|
||||||
* - resources.limits.cpu
|
* - resources.limits.cpu
|
||||||
- int
|
- int
|
||||||
- 4
|
- 4
|
||||||
- Number of CPUs
|
- Number of CPUs
|
||||||
* - resources.limits.memory
|
* - resources.limits.memory
|
||||||
- string
|
- string
|
||||||
- "16Gi"
|
- "16Gi"
|
||||||
- CPU memory configuration
|
- CPU memory configuration
|
||||||
* - resources.requests."nvidia.com/gpu"
|
* - resources.requests."nvidia.com/gpu"
|
||||||
- int
|
- int
|
||||||
- 1
|
- 1
|
||||||
- Number of gpus used
|
- Number of gpus used
|
||||||
* - resources.requests.cpu
|
* - resources.requests.cpu
|
||||||
- int
|
- int
|
||||||
- 4
|
- 4
|
||||||
- Number of CPUs
|
- Number of CPUs
|
||||||
* - resources.requests.memory
|
* - resources.requests.memory
|
||||||
- string
|
- string
|
||||||
- "16Gi"
|
- "16Gi"
|
||||||
- CPU memory configuration
|
- CPU memory configuration
|
||||||
* - secrets
|
* - secrets
|
||||||
- object
|
- object
|
||||||
- {}
|
- {}
|
||||||
- Secrets configuration
|
- Secrets configuration
|
||||||
* - serviceName
|
* - serviceName
|
||||||
- string
|
- string
|
||||||
-
|
-
|
||||||
- Service name
|
- Service name
|
||||||
* - servicePort
|
* - servicePort
|
||||||
- int
|
- int
|
||||||
- 80
|
- 80
|
||||||
- Service port
|
- Service port
|
||||||
* - labels.environment
|
* - labels.environment
|
||||||
- string
|
- string
|
||||||
- test
|
- test
|
||||||
- Environment name
|
- Environment name
|
||||||
* - labels.release
|
* - labels.release
|
||||||
- string
|
- string
|
||||||
- test
|
- test
|
||||||
- Release name
|
- Release name
|
||||||
```
|
```
|
||||||
|