Parameters

The following model parameters are available for completion.

Completion using API

Required Parameters

model (str, list[str])
* Required
Specifies the model to use for text generation. This can be a HuggingFace or OpenRouter model.


messages (list[dict])
* Required
The messages to send to the model, in the OpenAI chat format: each message is a dict with "role" and "content" keys.
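For illustration, a minimal request with just the required parameters might look like the following. The `completion` function name and the model identifier are placeholders; substitute this library's actual entry point and a model you have access to.

```python
# Minimal required arguments: a model identifier and OpenAI-format messages.
# `completion` is a hypothetical entry point, not this library's confirmed API.
model = "meta-llama/Llama-3.1-8B-Instruct"  # example HuggingFace model id

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the OpenAI chat message format."},
]

# response = completion(model=model, messages=messages)
```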

Optional Parameters

frequency_penalty (float)
* Optional, -2.0 to 2.0
* Default = 0.0
Reduces repetitive text by applying a penalty to tokens based on how frequently they've appeared in the generated text.


logit_bias (list[float])
* Optional, -100 to 100
* Default = None
Modifies the probability of specific tokens by directly adjusting their logit scores. Useful for encouraging or suppressing particular tokens.
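To see why a bias in the -100 to 100 range is so effective, consider how a logit adjustment propagates through the softmax that turns logits into token probabilities. This is a self-contained sketch of the mechanism, not this library's internals:

```python
import math

def softmax(logits):
    # Convert raw logit scores into a probability distribution.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
# A bias of -100 on the first token effectively bans it from being sampled.
biased = [logits[0] - 100, logits[1], logits[2]]

p_before = softmax(logits)
p_after = softmax(biased)
```

After the bias, the first token's probability is vanishingly small; a +100 bias would have the opposite effect, forcing the token to dominate.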


logprobs (bool)
* Optional
* Default = False
Enables the return of log probabilities for generated tokens, useful for analyzing model confidence.


max_tokens (int)
* Optional
* Default = 20
Limits the length of the model's response by setting a maximum token count.


presence_penalty (float)
* Optional, -2.0 to 2.0
* Default = None
Influences topic diversity by penalizing tokens based on their presence in the text so far.


response_format (PydanticModel)
* Optional
Defines structural constraints for the output. A Pydantic model is required.
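A Pydantic model declares the fields and types the output must conform to. The schema below is a made-up example, and `completion` is a placeholder for this library's actual entry point:

```python
from pydantic import BaseModel

class MovieReview(BaseModel):
    # Hypothetical schema; any Pydantic model can constrain the output shape.
    title: str
    rating: float

# response = completion(model=..., messages=..., response_format=MovieReview)

# A parsed response conforming to the schema would look like this:
review = MovieReview(title="Alien", rating=4.5)
```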


seed (int)
* Optional
* Default = None
Enables reproducible outputs by setting a fixed random seed for generation.


stop (str)
* Optional
* Default = None
Defines up to 4 sequences that will cause the model to stop generating further tokens when encountered.
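The effect of a stop sequence can be sketched in plain Python: generation halts at the first occurrence of any stop string, and the stop string itself is excluded from the result. This mimics the behavior; it is not the library's implementation:

```python
def apply_stop(text, stop_sequences):
    # Truncate at the earliest occurrence of any stop sequence,
    # excluding the stop sequence itself from the returned text.
    cut = len(text)
    for s in stop_sequences:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

apply_stop("Answer: 42\nQuestion: next", ["\nQuestion:"])  # -> "Answer: 42"
```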


temperature (float)
* Optional, 0.0 to 2.0
* Default = 1.0
Controls response randomness - lower values produce more focused and deterministic outputs, while higher values increase creativity.


top_logprobs (int)
* Optional, 0 to 5
Returns the most probable token alternatives at each position, with their log probabilities, when logprobs is enabled.
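Returned log probabilities can be converted back to plain probabilities with `exp`. The token values below are invented for illustration; the actual response shape depends on the provider:

```python
import math

# Hypothetical top_logprobs=3 result for a single token position.
top = {"Paris": -0.1, "London": -2.5, "Berlin": -3.2}

# exp(logprob) recovers the probability; values closer to 0 mean higher confidence.
probs = {token: math.exp(lp) for token, lp in top.items()}
best = max(probs, key=probs.get)
```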


top_p (float)
* Optional, 0.0 to 1.0
* Default = 1.0
Controls response diversity via nucleus sampling: only the smallest set of most likely tokens whose cumulative probability reaches the specified value is considered.
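Nucleus (top_p) filtering can be sketched in a few lines. This is a conceptual illustration of the sampling rule, not the library's code:

```python
def nucleus_filter(probs, top_p):
    # Keep the smallest set of highest-probability tokens whose cumulative
    # probability reaches top_p; sampling then happens only within this set.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

nucleus_filter({"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}, 0.8)  # -> ["a", "b"]
```

With top_p=1.0 every token remains eligible, which is why 1.0 is the neutral default.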


tool_choice (str)
* Optional
* Default = "auto"
Determines which tool, if any, the model should use when generating a completion.


tool_prompt (str)
* Optional
* Only available for HuggingFace models
Provides additional context or instructions to be prepended before tool-related prompts.


tools (list)
* Optional
A list of functions that the model may call to generate structured outputs.
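A tool definition typically follows the OpenAI function-calling schema shown below. The `get_weather` function is a made-up example, and `completion` is a placeholder for this library's actual entry point:

```python
# One tool in the common OpenAI function-calling schema
# (assumed here; check this library's expected format).
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# response = completion(model=..., messages=..., tools=tools, tool_choice="auto")
```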


pref_params (list[dict[str, float]])
* Optional
Extra sampling parameters applied to each model in the model list except the first one. Currently supports temperature and frequency_penalty.
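When model is a list, pref_params might be used as below. The model names are placeholders, `completion` is a hypothetical entry point, and the assumed mapping (entry i applies to model i+1, with the first model using the top-level parameters) should be verified against the library's behavior:

```python
# Placeholder model identifiers for a multi-model call.
models = ["model-a", "model-b", "model-c"]

# Assumed mapping: one dict per model after the first.
pref_params = [
    {"temperature": 0.2, "frequency_penalty": 0.5},  # for models[1]
    {"temperature": 1.0},                            # for models[2]
]

# response = completion(model=models, messages=..., pref_params=pref_params)
```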


Completion using vLLM

Below are the parameter references for vLLM engine configuration and sampling.

LLM Engine

See the vLLM LLM configuration documentation.

Chat Parameter

See the vLLM Chat documentation.

Sampling Parameters

See the vLLM Sampling Parameters documentation.