Using the RunPod UI to deploy a serverless vLLM instance:
The default value is 0,95.
Typing e.g. 0,92 automatically sets the value to 1.
I think a "step" should be set, or the issue is in how RunPod itself interprets the JSON.
...
{ "key": "GPU_MEMORY_UTILIZATION", "input": { "name": "GPU Memory Utilization", "type": "number", "description": "Sets GPU VRAM utilization", "default": 0.9, "step": 0.01 } }
https://github.com/runpod-workers/worker-vllm/blob/main/.runpod/hub.json
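To illustrate the suspicion about the missing "step", here is a minimal sketch of how a numeric field could end up at 1 if the form snaps input to the nearest multiple of its step and falls back to a step of 1 when none is given. This is only an assumption about the behaviour, not RunPod's actual code; `snap_to_step` is a hypothetical helper, and the comma-to-dot conversion is included only because the values above were typed with a decimal comma.

```python
from decimal import Decimal, ROUND_HALF_UP


def snap_to_step(value: str, step: str = "1") -> Decimal:
    """Round `value` to the nearest multiple of `step`.

    Hypothetical model of a number input; the default step of 1 is an
    assumption about what happens when no "step" is configured.
    """
    # Accept a decimal comma ("0,92") as well as a decimal point ("0.92").
    v = Decimal(value.replace(",", "."))
    s = Decimal(step)
    # Count how many whole steps fit, rounding half up, then scale back.
    return (v / s).quantize(Decimal("1"), rounding=ROUND_HALF_UP) * s


print(snap_to_step("0,92"))          # 1
print(snap_to_step("0,92", "0.01"))  # 0.92
```

Under this assumption, 0,92 collapses to 1 with the implicit step, while a "step" of 0.01 keeps it at 0.92, which would match what I see in the UI.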