Parameters
The following are the available model parameters for completion.
Completion using API
Required Parameters
model (str, list[str])* Required
Specifies the model to use for text generation. Can be a HuggingFace or OpenRouter model.
messages (list[dict])* Required
Messages to send to the model. The OpenAI format is required: each message is a dict with "role" and "content" keys.
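A minimal request using only the two required parameters might look like the following. The client call itself is omitted because the library's entry point isn't shown here; only the payload shape follows the OpenAI message format described above, and the model id is illustrative.

```python
# Minimal completion request payload with only the required parameters.
# The model id is an illustrative HuggingFace-style name, not a recommendation.
payload = {
    "model": "mistralai/Mistral-7B-Instruct-v0.2",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize Hamlet in one sentence."},
    ],
}

# Every message must carry exactly the OpenAI-format keys.
assert all(set(m) == {"role", "content"} for m in payload["messages"])
```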
Optional Parameters
frequency_penalty (float)* Optional, -2.0 to 2.0
* Default = 0.0
Reduces repetitive text by applying a penalty to tokens based on how frequently they've appeared in the generated text.
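As a sketch of the arithmetic (assuming the usual OpenAI-style formulation, which this parameter mirrors): each token's logit is reduced in proportion to how many times the token has already been generated.

```python
def apply_frequency_penalty(logits, counts, frequency_penalty=0.0):
    """Subtract frequency_penalty * count from each token's raw logit.

    logits: dict mapping token -> raw logit score
    counts: dict mapping token -> how often it has appeared so far
    """
    return {
        tok: logit - frequency_penalty * counts.get(tok, 0)
        for tok, logit in logits.items()
    }

adjusted = apply_frequency_penalty(
    {"the": 2.0, "cat": 1.0}, {"the": 3}, frequency_penalty=0.5
)
# "the" has appeared 3 times, so its logit drops by 3 * 0.5 = 1.5.
```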
logit_bias (list[float])* Optional, -100 to 100
* Default = None
Modifies token generation probabilities by directly adjusting the tokens' logit scores. Useful for controlling the appearance of specific tokens.
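Conceptually, the bias is added to the raw logits before sampling, clamped to the documented [-100, 100] range; a large negative bias effectively bans a token, a large positive one effectively forces it. A minimal sketch, assuming biases are aligned with logits by token index:

```python
def apply_logit_bias(logits, bias):
    """Add a per-token bias (clamped to [-100, 100]) to the raw logits.

    logits: list of raw logit scores, indexed by token id
    bias:   list of the same length; 0.0 means "leave unchanged"
    """
    return [
        logit + max(-100.0, min(100.0, b))
        for logit, b in zip(logits, bias)
    ]

# +100 makes the second token near-certain; -100 effectively bans the third.
biased = apply_logit_bias([1.2, -0.3, 0.8], [0.0, 100.0, -100.0])
```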
logprobs (bool)* Optional
* Default = False
Enables the return of log probabilities for generated tokens, useful for analyzing model confidence.
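The returned values are natural-log probabilities, so exponentiating recovers the model's probability for each token; values near 0 indicate high confidence. The logprob values below are illustrative, not real output:

```python
import math

# Per-token log probabilities (natural log); illustrative values.
token_logprobs = [-0.105, -2.302, -0.223]

# exp() converts each back to a plain probability in (0, 1].
token_probs = [math.exp(lp) for lp in token_logprobs]

# A logprob near 0 (prob near 1) means the model was nearly certain.
```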
max_tokens (int)* Optional
* Default = 20
Limits the length of the model's response by setting a maximum token count.
presence_penalty (float)* Optional, -2.0 to 2.0
* Default = None
Influences topic diversity by penalizing tokens based on their presence in the text so far.
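The key difference from frequency_penalty is that presence_penalty is applied once per token that has appeared at all, regardless of how many times. A minimal sketch:

```python
def apply_presence_penalty(logits, counts, presence_penalty=0.0):
    """Subtract presence_penalty once from any token that has appeared
    at least once. Unlike frequency_penalty, the deduction does not
    scale with the number of occurrences."""
    return {
        tok: logit - (presence_penalty if counts.get(tok, 0) > 0 else 0.0)
        for tok, logit in logits.items()
    }

adjusted = apply_presence_penalty(
    {"the": 2.0, "cat": 1.0}, {"the": 3}, presence_penalty=0.5
)
# "the" is penalized once (2.0 - 0.5 = 1.5) even though it appeared 3 times.
```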
response_format (PydanticModel)* Optional
Defines structural constraints for the output. A Pydantic model is required.
seed (int)* Optional
* Default = None
Enables reproducible outputs by setting a fixed random seed for generation.
stop (str)* Optional
* Default = None
Defines up to 4 sequences that will cause the model to stop generating further tokens when encountered.
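The behavior can be sketched as truncating the output at the earliest occurrence of any stop sequence; the stop sequence itself is not included in the returned text. This is a client-side illustration of the semantics, not the server's implementation:

```python
def truncate_at_stop(text, stop_sequences):
    """Cut generated text at the earliest occurrence of any of up to 4
    stop sequences; the matched sequence is excluded from the output."""
    cut = len(text)
    for seq in stop_sequences[:4]:
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

out = truncate_at_stop("Answer: 42\n###\nextra text", ["###"])
# Everything from "###" onward is dropped: out == "Answer: 42\n"
```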
temperature (float)* Optional, 0.0 to 2.0
* Default = 1.0
Controls response randomness - lower values produce more focused and deterministic outputs, while higher values increase creativity.
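Mechanically, temperature divides the logits before the softmax: values below 1.0 sharpen the distribution toward the top token, values above 1.0 flatten it. A standalone sketch of that standard formulation:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature, then softmax. Lower temperature
    sharpens the distribution; higher temperature flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

sharp = softmax_with_temperature([2.0, 1.0, 0.0], temperature=0.2)
flat = softmax_with_temperature([2.0, 1.0, 0.0], temperature=2.0)
# sharp concentrates almost all probability on the top token;
# flat spreads it much more evenly.
```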
top_logprobs (int)* Optional, 0 to 5
Returns the most probable token alternatives at each position with their probabilities when logprobs is enabled.
top_p (float)* Optional, 0.0 to 1.0
* Default = 1.0
Controls response diversity by sampling only from the most likely tokens whose cumulative probability reaches the specified value.
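This is nucleus sampling: tokens are ranked by probability and the smallest set whose cumulative probability reaches top_p is kept; sampling then happens only within that set. A minimal sketch of the filtering step:

```python
def top_p_filter(probs, top_p=1.0):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches top_p (nucleus sampling)."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = {}, 0.0
    for tok, p in ranked:
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

kept = top_p_filter({"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}, top_p=0.75)
# "a" and "b" together exceed 0.75, so "c" and "d" are filtered out.
```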
tool_choice (str)* Optional
* Default = "auto"
Determines which tool the model should use for completion generation.
tool_prompt (str)* Optional
* Only available for HuggingFace models
Provides additional context or instructions to be prepended before tool-related prompts.
tools (list[dict])* Optional
A list of functions that the model may use to generate structured outputs.
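A tool definition typically looks like the following, assuming the OpenAI function-calling schema (the function name and parameters here are purely illustrative):

```python
# An OpenAI-style function tool definition. The schema shape is an
# assumption based on the OpenAI format; "get_weather" is a made-up
# example function, not part of any real API.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                },
                "required": ["city"],
            },
        },
    }
]

# With tool_choice="auto" (the default), the model decides whether to call it.
```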
pref_params (list[dict[str, float]])* Optional
Extra parameters that control the outputs of every model in the model list except the first one. Currently supports temperature
and frequency_penalty.
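Sketching how this fits into a multi-model request (the model names are illustrative, and the exact pairing of pref_params entries to models is an assumption based on the description above):

```python
# Hypothetical multi-model request: the first model uses the top-level
# sampling parameters, and each pref_params entry is assumed to apply,
# in order, to the remaining models in the list.
request = {
    "model": ["model-a", "model-b", "model-c"],  # illustrative names
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,  # applies to model-a
    "pref_params": [
        {"temperature": 0.2, "frequency_penalty": 0.5},  # model-b
        {"temperature": 1.2},                            # model-c
    ],
}

# One pref_params entry per model after the first.
assert len(request["pref_params"]) == len(request["model"]) - 1
```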
Completion using vLLM
Below are the parameter groups for configuring vLLM and its models.
LLM Engine (LLM configuration)
Chat Parameters (Chat)
Sampling Parameters