Skip to content

requests

NexosAPIRequest

Bases: NullableBaseModel

Base class for all API requests to the NEXOS API. This class serves as a foundation for defining specific API request models. It can be extended to create more specific request models for different API endpoints.

ChatCompletionsRequest

Bases: NexosAPIRequest

Request model for the Nexos.ai Chat Completions API.

Use this model to serialize the HTTP request body for the chat completions endpoint. Fields mirror the public API and include validation where applicable.

Attributes:

Name Type Description
model str

The model ID to use for this completion (e.g., "6948fe4d-98ce-4f36-bc49-5f652cc07b65").

messages list[ChatMessage]

The ordered conversation so far (min length: 1). Depending on the model, different message modalities are supported (e.g., text, images, audio). Common roles include: - Developer/System: instructions the model should follow. With o1 models and newer, prefer a developer message instead of system. - User: end-user prompts or context. - Assistant: prior model responses. - Tool / Function: tool results routed back to the model.

store bool | None

Whether to store the output of this request.

metadata dict[str, str] | None

Developer-defined tags and values used for filtering completions in the dashboard.

frequency_penalty float

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the likelihood of verbatim repetition. Default: 0.

logit_bias dict[str, float] | None

Per-token bias added to the model's logits. Keys are tokenizer token IDs (as strings) and values are in [-100, 100]. Small magnitudes tweak likelihood; large magnitudes can effectively ban or force tokens.

logprobs bool | None

Whether to return log probabilities of the output tokens. If true, returns log probabilities for each output token in the message content.

top_logprobs int | None

The number of most likely tokens to return at each token position, each with an associated log probability. Range: [0, 20]. Requires logprobs == True. Default: None.

max_completion_tokens int | None

Upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

n int

How many chat completion choices to generate for each input message. Range: [1, 128]. Costs scale with the number of generated tokens across all choices. Default: 1.

modalities list[Literal['text', 'audio']]

Output types to generate. Most models support "text". To request both text and audio: ["text", "audio"].

prediction PredictionType | None

Configuration for a Predicted Output, which can improve response times when large parts of the response are known ahead of time (e.g., static content).

presence_penalty float

Number between -2.0 and 2.0. Positive values penalize tokens that already appeared, nudging the model toward new topics. Default: 0.

audio AudioConfiguration | None

Parameters for audio output. Required when modalities includes "audio".

response_format dict[str, Any] | None

Constrains the output format. Supported values include: - {"type": "text"} - {"type": "json_object"} - {"type": "json_schema", "json_schema": {...}} (Structured Outputs) When using {"type": "json_object"}, also instruct the model (via messages) to produce JSON; otherwise it may stream whitespace until the token limit. Note: message content may be truncated if finish_reason="length".

seed int | None

Best-effort deterministic sampling seed. Range: [-9223372036854776000, 9223372036854776000]. Determinism is not guaranteed; monitor changes via system_fingerprint.

service_tier Literal['auto', 'default']

Latency tier selection. - "auto": Uses scale tier if enabled; otherwise default tier. - "default": Uses the default tier (lower uptime SLA, no latency guarantee). The response may include the service tier utilized. Default: "auto".

stop str | list[str] | None

Up to 4 stop sequences at which token generation will halt.

stream bool | None

If true, sends partial message deltas as server-sent events (SSE). The stream terminates with: data: [DONE].

stream_options dict[str, Any] | None

Options for streaming responses. Only set when stream == True.

temperature float

Sampling temperature in [0, 2]. Higher values increase randomness; lower values increase determinism. Default: 1. Generally tune either temperature or top_p, not both.

top_p float

Nucleus sampling probability mass in (0, 1]. For example, 0.1 considers only the tokens comprising the top 10% probability mass. Default: 1. Generally tune either top_p or temperature, not both.

tools list[dict[str, Any]] | None

A list of tools the model may call (max 128). Supported tool types: "function", "web_search", "rag", "tika_ocr". Provide tool-specific payloads under their respective keys.

tool_choice str | dict[str, Any] | None

Controls tool invocation. - "none": do not call tools (generate a message instead) - "auto": model may choose to generate a message or call tools - "required": model must call one or more tools - Object to force a specific tool, e.g. {"type": "function", "function": {"name": "my_function"}} Defaults: "none" if no tools; "auto" if tools are present.

parallel_tool_calls bool | None

Whether to enable parallel function calling during tool use. API default: true.

thinking ChatThinkingModeConfiguration | None

Reasoning/thinking configuration. Common fields: - type (e.g., "enabled") - budget_tokens (e.g., 1024) Notes ----- - max_tokens is deprecated in favor of max_completion_tokens and is not compatible with o1-series models. - When requesting audio in modalities, also provide a valid audio configuration. - top_logprobs requires logprobs == True. - Costs scale with the number of generated tokens across all choices (n).

StorageDownloadRequest

Bases: NexosAPIRequest

Request to download a file from storage.

StorageGetRequest

Bases: NexosAPIRequest

Request to get metadata of a file from storage.

StorageDeleteRequest

Bases: NexosAPIRequest

Request to delete a file from storage.

TeamApiKeyCreateRequest

Bases: NexosAPIRequest

Request to create a new API key for a team.

TeamApiKeyUpdateRequest

Bases: NexosAPIRequest

Request to update an existing API key for a team.

TeamApiKeyDeleteRequest

Bases: NexosAPIRequest

Request to delete an API key for a team.

TeamApiKeyRegenerateRequest

Bases: NexosAPIRequest

Request to regenerate an API key for a team. This request does not require any additional parameters. It simply triggers the regeneration of the API key.

ModelsListRequest

Bases: NexosAPIRequest

Request to list available models.