requests

NexosAPIRequest

Bases: NullableBaseModel

Base class for all API requests to the NEXOS API. This class serves as a foundation for defining specific API request models. It can be extended to create more specific request models for different API endpoints.

ChatCompletionsRequest

Bases: NexosAPIRequest

Request model for the Nexos.ai Chat Completions API.

Use this model to serialize the HTTP request body for the chat completions endpoint. Fields mirror the public API and include validation where applicable.

Attributes:

Name	Type	Description
`model`	`str`	The model ID to use for this completion (e.g., "6948fe4d-98ce-4f36-bc49-5f652cc07b65").
`messages`	`list[ChatMessage]`	The ordered conversation so far (min length: 1). Depending on the model, different message modalities are supported (e.g., text, images, audio). Common roles include: - Developer/System: instructions the model should follow. With o1 models and newer, prefer a developer message instead of system. - User: end-user prompts or context. - Assistant: prior model responses. - Tool / Function: tool results routed back to the model.
`store`	`bool \| None`	Whether to store the output of this request.
`metadata`	`dict[str, str] \| None`	Developer-defined tags and values used for filtering completions in the dashboard.
`frequency_penalty`	`float`	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the likelihood of verbatim repetition. Default: 0.
`logit_bias`	`dict[str, float] \| None`	Per-token bias added to the model's logits. Keys are tokenizer token IDs (as strings) and values are in [-100, 100]. Small magnitudes tweak likelihood; large magnitudes can effectively ban or force tokens.
`logprobs`	`bool \| None`	Whether to return log probabilities of the output tokens. If true, returns log probabilities for each output token in the message content.
`top_logprobs`	`int \| None`	The number of most likely tokens to return at each token position, each with an associated log probability. Range: [0, 20]. Requires `logprobs == True`. Default: None.
`max_completion_tokens`	`int \| None`	Upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
`n`	`int`	How many chat completion choices to generate for each input message. Range: [1, 128]. Costs scale with the number of generated tokens across all choices. Default: 1.
`modalities`	`list[Literal['text', 'audio']]`	Output types to generate. Most models support "text". To request both text and audio: ["text", "audio"].
`prediction`	`PredictionType \| None`	Configuration for a Predicted Output, which can improve response times when large parts of the response are known ahead of time (e.g., static content).
`presence_penalty`	`float`	Number between -2.0 and 2.0. Positive values penalize tokens that already appeared, nudging the model toward new topics. Default: 0.
`audio`	`AudioConfiguration \| None`	Parameters for audio output. Required when `modalities` includes "audio".
`response_format`	`dict[str, Any] \| None`	Constrains the output format. Supported values include: - {"type": "text"} - {"type": "json_object"} - {"type": "json_schema", "json_schema": {...}} (Structured Outputs) When using {"type": "json_object"}, also instruct the model (via messages) to produce JSON; otherwise it may stream whitespace until the token limit. Note: message content may be truncated if `finish_reason="length"`.
`seed`	`int \| None`	Best-effort deterministic sampling seed. Range: [-9223372036854776000, 9223372036854776000]. Determinism is not guaranteed; monitor changes via `system_fingerprint`.
`service_tier`	`Literal['auto', 'default']`	Latency tier selection. - "auto": Uses scale tier if enabled; otherwise default tier. - "default": Uses the default tier (lower uptime SLA, no latency guarantee). The response may include the service tier utilized. Default: "auto".
`stop`	`str \| list[str] \| None`	Up to 4 stop sequences at which token generation will halt.
`stream`	`bool \| None`	If true, sends partial message deltas as server-sent events (SSE). The stream terminates with: `data: [DONE]`.
`stream_options`	`dict[str, Any] \| None`	Options for streaming responses. Only set when `stream == True`.
`temperature`	`float`	Sampling temperature in [0, 2]. Higher values increase randomness; lower values increase determinism. Default: 1. Generally tune either `temperature` or `top_p`, not both.
`top_p`	`float`	Nucleus sampling probability mass in (0, 1]. For example, 0.1 considers only the tokens comprising the top 10% probability mass. Default: 1. Generally tune either `top_p` or `temperature`, not both.
`tools`	`list[dict[str, Any]] \| None`	A list of tools the model may call (max 128). Supported tool types: "function", "web_search", "rag", "tika_ocr". Provide tool-specific payloads under their respective keys.
`tool_choice`	`str \| dict[str, Any] \| None`	Controls tool invocation. - "none": do not call tools (generate a message instead) - "auto": model may choose to generate a message or call tools - "required": model must call one or more tools - Object to force a specific tool, e.g. {"type": "function", "function": {"name": "my_function"}} Defaults: "none" if no tools; "auto" if tools are present.
`parallel_tool_calls`	`bool \| None`	Whether to enable parallel function calling during tool use. API default: true.
`thinking`	`ChatThinkingModeConfiguration \| None`	Reasoning/thinking configuration. Common fields: - type (e.g., "enabled") - budget_tokens (e.g., 1024) Notes ----- - `max_tokens` is deprecated in favor of `max_completion_tokens` and is not compatible with o1-series models. - When requesting audio in `modalities`, also provide a valid `audio` configuration. - `top_logprobs` requires `logprobs == True`. - Costs scale with the number of generated tokens across all choices (`n`).

StorageDownloadRequest

Bases: NexosAPIRequest

Request to download a file from storage.

StorageGetRequest

Bases: NexosAPIRequest

Request to get metadata of a file from storage.

StorageDeleteRequest

Bases: NexosAPIRequest

Request to delete a file from storage.

TeamApiKeyCreateRequest

Bases: NexosAPIRequest

Request to create a new API key for a team.

TeamApiKeyUpdateRequest

Bases: NexosAPIRequest

Request to update an existing API key for a team.

TeamApiKeyDeleteRequest

Bases: NexosAPIRequest

Request to delete an API key for a team.

TeamApiKeyRegenerateRequest

Bases: NexosAPIRequest

Request to regenerate an API key for a team. This request does not require any additional parameters. It simply triggers the regeneration of the API key.

ModelsListRequest

Bases: NexosAPIRequest

Request to list available models.