Both the vLLM and OpenAI documentation describe vLLM's support for the Responses API. However, I already hit an error when connecting a client to my RunPod serverless endpoint, because the worker does not support /v1/responses. See the documentation excerpts below:
Source: https://docs.vllm.ai/projects/recipes/en/latest/OpenAI/GPT-OSS.html
Fragment:
Usage
Once the vllm serve runs and INFO: Application startup complete has been displayed, you can send requests using HTTP request or OpenAI SDK to the following endpoints:
/v1/responses endpoint can perform tool use (browsing, python, mcp) in between chain-of-thought and deliver a final response. This endpoint leverages the openai-harmony library for input rendering and output parsing. Stateful operation and full streaming API are work in progress. Responses API is recommended by OpenAI as the way to interact with this model.
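For context, this is the shape of request my client was sending to the endpoint described above. A minimal sketch, assuming a `vllm serve` instance at the hypothetical base URL `http://localhost:8000` and the placeholder model name `openai/gpt-oss-20b` (the request is built but not sent here):

```python
import json
import urllib.request


def build_responses_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-compatible /v1/responses endpoint."""
    url = base_url.rstrip("/") + "/v1/responses"
    # The Responses API takes the prompt under an `input` field.
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_responses_request("http://localhost:8000", "openai/gpt-oss-20b", "Hello")
print(req.full_url)  # → http://localhost:8000/v1/responses
```

Against a local `vllm serve` this path works as the docs describe; against the RunPod serverless worker the same request fails, which is the issue reported here.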
Source: https://cookbook.openai.com/articles/gpt-oss/run-vllm
Fragment:
Create a model response
post https://api.openai.com/v1/responses
Creates a model response. Provide text or image inputs to generate text or JSON outputs. Have the model call your own custom code or use built-in tools like web search or file search to use your own data as input for the model's response.
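Until the worker supports /v1/responses, one workaround is to translate Responses-style calls into the /v1/chat/completions shape that OpenAI-compatible workers typically do expose. A minimal sketch of that mapping; the field correspondence here (plain string `input` to a user message, `instructions` to a system message) is my assumption and covers only the simplest case, not the full Responses API:

```python
from typing import Optional


def responses_to_chat_payload(model: str, input_text: str,
                              instructions: Optional[str] = None) -> dict:
    """Map a minimal Responses-style call onto a Chat Completions payload."""
    messages = []
    if instructions:
        # Responses-style `instructions` roughly corresponds to a system message.
        messages.append({"role": "system", "content": instructions})
    messages.append({"role": "user", "content": input_text})
    return {"model": model, "messages": messages}


payload = responses_to_chat_payload("openai/gpt-oss-20b", "Hello", "Be brief.")
# POST this payload to <base_url>/v1/chat/completions instead of /v1/responses.
```

Note this loses the Responses-specific features the vLLM docs mention (built-in tool use in between chain-of-thought), so it is only a stopgap.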