Image

API Reference

Last updated: 11-10-2025

Nutanix Enterprise AI (2.5.0)

Download OpenAPI specification:Download

License: EULA

Introduction

Inference APIs

The Nutanix Enterprise AI (NAI) Inference API reference describes the RESTful and streaming APIs you can use to interact with the Inference Endpoints deployed on the NAI platform. The REST API is currently in version 1 (v1).

The best practice for ensuring consistent behaviour and output from a model is to use a pinned model version. Model behaviours tend to change between versions.

All API users should have valid API Key credentials to send API calls to the Inference Endpoint. The Inference APIs are compliant with OpenAI API specifications, where applicable. Nutanix has added new tags to support models from other model catalogue providers. You can invoke these APIs using the OpenAI API clients.

For more information on OpenAI Spec, see https://github.com/openai/openai-openapi/

Management APIs

The Nutanix Enterprise AI (NAI) Management API reference describes the RESTful APIs available to manage entities on the NAI platform. These APIs are currently in the experimental stage.

The API documentation is publicly accessible to all valid users without requiring special permissions, and is intended for viewing purposes only. This document covers the Inference and Management APIs available in the Nutanix Enterprise AI 2.5 release.

apikeys

create API key

create a new API key

Authorizations:
BasicAuth
Request Body schema: application/json
required

new API key create request object

endpoints
Array of strings
name
required
string
unifiedEndpoints
Array of strings

Used only when gateway mode is enabled

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/apikeys \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "endpoints": [
        "string"
    ],
    "name": "string",
    "unifiedEndpoints": [
        "string"
    ]
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

delete API key

delete API key with ID

Authorizations:
BasicAuth
path Parameters
apikey_id
required
string

API key ID

Responses

Request samples

curl --request DELETE \
    --url https://www.nutanix.dev//api/enterpriseai/v1/apikeys/{apikey_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "msg": "string"
}

list API keys

list apikeys

Authorizations:
BasicAuth
query Parameters
limit
integer

limit

offset
integer

offset

owner_id
string

owner ID for which the API keys have to be returned

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/apikeys \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

search API keys

search API keys

Authorizations:
BasicAuth
Request Body schema: application/json
required

list options object

required
Array of objects (dto.FilterOptions)
limit
required
integer
offset
required
integer >= 0
required
Array of objects (dto.SortOptions)

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/apikeys/search \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "filters": [
        {
            "field": "string",
            "operation": "startsWith",
            "values": [
                "string"
            ]
        }
    ],
    "limit": 0,
    "offset": 0,
    "sort": [
        {
            "field": "string",
            "order": "ASCENDING"
        }
    ]
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

update API key

update API key with status and endpoint ID list

Authorizations:
BasicAuth
path Parameters
apikey_id
required
string

API key ID

Request Body schema: application/json
required

API key update request object

endpoints
Array of strings
status
string
unifiedEndpoints
Array of strings

Used only when gateway mode is enabled

Responses

Request samples

curl --request PATCH \
    --url https://www.nutanix.dev//api/enterpriseai/v1/apikeys/{apikey_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "endpoints": [
        "string"
    ],
    "status": "string",
    "unifiedEndpoints": [
        "string"
    ]
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

auditlogs

search audit logs

search audit logs

Authorizations:
BasicAuth
Request Body schema: application/json
required

list options object

required
Array of objects (dto.FilterOptions)
limit
required
integer
offset
required
integer >= 0
required
Array of objects (dto.SortOptions)

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/auditlogs \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "filters": [
        {
            "field": "string",
            "operation": "startsWith",
            "values": [
                "string"
            ]
        }
    ],
    "limit": 0,
    "offset": 0,
    "sort": [
        {
            "field": "string",
            "order": "ASCENDING"
        }
    ]
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

catalogs

catalog by ID

get a catalog by ID

Authorizations:
BasicAuth
path Parameters
catalog_id
required
string

catalog ID

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/catalogs/{catalog_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

create catalog

create a new catalog

Authorizations:
BasicAuth
Request Body schema: application/json
required

new create catalog request object

additionalProperties
object (dto.CatalogAdditionalProperties)
adminEnabled
required
boolean
contextLength
required
integer
description
required
string
developer
required
string
license
string
modelName
required
string

ModelName represents the HuggingFace RepoID

modelRevision
required
string
modelSizeInGB
required
integer
modelType
required
string (enum.ModelType)
Enum: "Text Generation" "Embedding" "Reranker" "Safety" "Vision" "Image Generation" "Image Classification" "Object Detection" "Custom"
modelUrl
required
string
quantization
required
string (enum.Quantization)
Enum: "float8" "float16" "bfloat16"
required
Array of objects (dto.CreateRuntimeRequest)
sourceHub
required
string (enum.CatalogSourceHub)
Enum: "HuggingFace" "NvidiaNIM"
tokenRequired
required
boolean
validated
required
boolean

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/catalogs \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "additionalProperties": {},
    "adminEnabled": true,
    "contextLength": 0,
    "description": "string",
    "developer": "string",
    "license": "string",
    "modelName": "string",
    "modelRevision": "string",
    "modelSizeInGB": 0,
    "modelType": "Text Generation",
    "modelUrl": "string",
    "quantization": "float8",
    "runtimes": [
        {
            "additionalProperties": {
                "additionalArgs": {
                    "hfTransformersArgs": {
                        "maxNumTokens": 0
                    },
                    "tgiArgs": {
                        "customArgs": {
                            "property1": "string",
                            "property2": "string"
                        },
                        "maxNumTokens": 0
                    },
                    "vllmArgs": {
                        "customArgs": {
                            "property1": "string",
                            "property2": "string"
                        },
                        "maxNumTokens": 0
                    }
                },
                "additionalEnvs": [
                    {
                        "name": "string",
                        "value": "string"
                    }
                ],
                "fetchNimFromHf": true,
                "nimGpuConfig": {
                    "property1": [
                        {
                            "contextLength": -1,
                            "gpuCount": 0
                        }
                    ],
                    "property2": [
                        {
                            "contextLength": -1,
                            "gpuCount": 0
                        }
                    ]
                },
                "tgiGpuConfig": {
                    "property1": [
                        {
                            "contextLength": -1,
                            "gpuCount": 0
                        }
                    ],
                    "property2": [
                        {
                            "contextLength": -1,
                            "gpuCount": 0
                        }
                    ]
                }
            },
            "image": "string",
            "modelCapabilities": [
                "string"
            ],
            "name": "vllm-nvidia-gpu-passthrough",
            "resources": {
                "acceleratorMemory": {
                    "kvCachePerToken": null,
                    "modelWeights": null,
                    "textActivationsPerToken": null,
                    "visionActivationsPerImagePerSeq": null
                },
                "cpu": 0,
                "ram": null
            }
        }
    ],
    "sourceHub": "HuggingFace",
    "tokenRequired": true,
    "validated": true
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

delete catalog

delete catalog by ID

Authorizations:
BasicAuth
path Parameters
catalog_id
required
string

catalog ID

Responses

Request samples

curl --request DELETE \
    --url https://www.nutanix.dev//api/enterpriseai/v1/catalogs/{catalog_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "msg": "string"
}

list of catalogs

get catalogs by limit and offset

Authorizations:
BasicAuth
query Parameters
model_name
string

model name

deprecated
boolean

filter catalog items on deprecated field

source_hub
string
Enum: "HuggingFace" "NvidiaNIM"

filter catalog items on source hub

limit
integer

limit

offset
integer

offset

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/catalogs \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

requirements of a catalog entry

get requirements of a catalog entry

Authorizations:
BasicAuth
Request Body schema: application/json
required

new catalog requirements object

acceleratorMemory
number >= 0
acceleratorProduct
string
engine
string (enum.BaseEngine)
Enum: "vllm" "tgi" "nim" "hf-transformers" "custom-model-server"
modelName
required
string
platform
required
string (enum.Platform)
Enum: "nvidia-gpu-passthrough" "nvidia-vgpu" "nvidia-mig" "amd-gpu-passthrough" "intel-amx-cpu" "cpu"

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/catalogs/requirements \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "acceleratorMemory": null,
    "acceleratorProduct": "string",
    "engine": "vllm",
    "modelName": "string",
    "platform": "nvidia-gpu-passthrough"
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

search catalogs

search catalogs

Authorizations:
BasicAuth
Request Body schema: application/json
required

list options object

required
Array of objects (dto.FilterOptions)
limit
required
integer
offset
required
integer >= 0
required
Array of objects (dto.SortOptions)

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/catalogs/search \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "filters": [
        {
            "field": "string",
            "operation": "startsWith",
            "values": [
                "string"
            ]
        }
    ],
    "limit": 0,
    "offset": 0,
    "sort": [
        {
            "field": "string",
            "order": "ASCENDING"
        }
    ]
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

Update catalog entries

Update one or more catalog entries.

Authorizations:
BasicAuth
Request Body schema: application/json
required

List of catalog entries to update

Array
adminEnabled
boolean
catalogId
required
string

Responses

Request samples

curl --request PATCH \
    --url https://www.nutanix.dev//api/enterpriseai/v1/catalogs \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": null,
  • "msg": "string",
  • "status": "Success"
}

cluster

get cluster health

get cluster health

Authorizations:
BasicAuth

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/cluster/health \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

get cluster info

retrieves information about the kubernetes cluster

Authorizations:
BasicAuth

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/cluster/info \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

getClusterConfig

retrieves information about the Cluster Configurations

Authorizations:
BasicAuth

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/cluster/config \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

restore the cluster from database

restore the cluster from database

Authorizations:
BasicAuth

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/cluster/restore \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": null,
  • "msg": "string"
}

update cluster configuration status

update cluster configuration status

Authorizations:
BasicAuth
Request Body schema: application/json
required

new cluster config update request object

object (dto.EULAAcceptRequest)
object (dto.HfURLImportEnableUpdateRequest)
object (dto.ManualUploadEnableUpdateRequest)
object (dto.ProxySettings)
object (dto.PulseUpdateRequest)
object (dto.RsyslogServerInfo)

Responses

Request samples

curl --request PATCH \
    --url https://www.nutanix.dev//api/enterpriseai/v1/cluster/config \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "eula": {
        "accepted": true,
        "company": "string",
        "name": "string"
    },
    "hfUrlImport": {
        "enabled": true
    },
    "manualUpload": {
        "enabled": true
    },
    "proxySettings": {
        "name": "string",
        "password": "string",
        "port": 1,
        "protocols": [
            "string"
        ],
        "proxyAddress": "string",
        "username": "string"
    },
    "pulse": {
        "accepted": true
    },
    "rsyslogServer": {
        "logConfigs": [
            {
                "logSeverity": 0,
                "logType": "audit"
            }
        ],
        "name": "string",
        "port": 1,
        "protocol": "string",
        "serverAddress": "string"
    }
}'

Response samples

Content type
application/json
{
  • "data": null,
  • "msg": "string"
}

credentials

create credential

create a new credential

Authorizations:
BasicAuth
Request Body schema: application/json
required

new create credential request object

required
object
name
required
string
type
required
string (enum.CredentialType)
Enum: "hf" "ngc" "s3" "provider"

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/credentials \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "data": {
        "property1": "string",
        "property2": "string"
    },
    "name": "string",
    "type": "hf"
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

delete credential

delete an existing credential

Authorizations:
BasicAuth
path Parameters
credential_id
required
string

credential ID

query Parameters
force
boolean

force delete

Responses

Request samples

curl --request DELETE \
    --url https://www.nutanix.dev//api/enterpriseai/v1/credentials/{credential_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "msg": "string"
}

get credential by ID

get an existing credential by ID

Authorizations:
BasicAuth
path Parameters
credential_id
required
string

credential ID

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/credentials/{credential_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

list credentials

list credentials

Authorizations:
BasicAuth
query Parameters
type
Array of strings

type(s) of the credential

limit
integer

limit

offset
integer

offset

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/credentials \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

update credential

update an existing credential

Authorizations:
BasicAuth
path Parameters
credential_id
required
string

credential ID

Request Body schema: application/json
required

updated credential object

object
name
string

Responses

Request samples

curl --request PATCH \
    --url https://www.nutanix.dev//api/enterpriseai/v1/credentials/{credential_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "data": {
        "property1": "string",
        "property2": "string"
    },
    "name": "string"
}'

Response samples

Content type
application/json
{
  • "msg": "string"
}

Data Plane

Create chat completion

The chat completions API generates a model response from a list of messages comprising a conversation. This request creates a model response for the given chat conversation.

header Parameters
Authorization
required
string
Default: Bearer <Add access token here>

API key for authentication

Request Body schema: application/json
required

Parameters required to create the chat completion request.

object

ChatTemplateKwargs provides a way to add non-standard parameters to the request body. Additional kwargs to pass to the template renderer. Will be accessible by the chat template. Such as think mode for qwen3. "chat_template_kwargs": {"enable_thinking": false} https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes

frequency_penalty
number
function_call
any

Deprecated: use ToolChoice instead.

Array of objects (openai.FunctionDefinition)

Deprecated: use Tools instead.

object

LogitBias is must be a token id string (specified by their token ID in the tokenizer), not a word string. incorrect: "logit_bias":{"You": 6}, correct: "logit_bias":{"1639": 6} refs: https://platform.openai.com/docs/api-reference/chat/create#chat/create-logit_bias

logprobs
boolean

LogProbs indicates whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message. This option is currently not available on the gpt-4-vision-preview model.

max_completion_tokens
integer

MaxCompletionTokens An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens https://platform.openai.com/docs/guides/reasoning

max_tokens
integer

MaxTokens The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API. This value is now deprecated in favor of max_completion_tokens, and is not compatible with o1 series models. refs: https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_tokens

Array of objects (openai.ChatCompletionMessage)
object

Metadata to store with the completion.

model
string
n
integer
parallel_tool_calls
any

Disable the default behavior of parallel tool calls by setting it: false.

object

Configuration for a predicted output.

presence_penalty
number
reasoning_effort
string

Controls effort on reasoning for reasoning models. It can be set to "low", "medium", or "high".

object (openai.ChatCompletionResponseFormat)
seed
integer
service_tier
string
Enum: "auto" "default" "flex" "priority"

Specifies the latency tier to use for processing the request.

stop
Array of strings
store
boolean

Store can be set to true to store the output of this completion request for use in distillations and evals. https://platform.openai.com/docs/api-reference/chat/create#chat-create-store

stream
boolean
object

Options for streaming response. Only set this when you set stream: true.

temperature
number
tool_choice
any

This can be either a string or an ToolChoice object.

Array of objects (openai.Tool)
top_logprobs
integer

TopLogProbs is an integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.

top_p
number
user
string

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//enterpriseai/v1/chat/completions \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <Add access token here>' \
    --header 'Content-Type: application/json' \
    --data '{
    "chat_template_kwargs": {
        "property1": {},
        "property2": {}
    },
    "frequency_penalty": null,
    "function_call": null,
    "functions": [
        {
            "description": "string",
            "name": "string",
            "parameters": null,
            "strict": true
        }
    ],
    "logit_bias": {
        "property1": 0,
        "property2": 0
    },
    "logprobs": true,
    "max_completion_tokens": 0,
    "max_tokens": 0,
    "messages": [
        {
            "content": "string",
            "function_call": {
                "arguments": "string",
                "name": "string"
            },
            "multiContent": [
                {
                    "image_url": {
                        "detail": "high",
                        "url": "string"
                    },
                    "text": "string",
                    "type": "text"
                }
            ],
            "name": "string",
            "reasoning_content": "string",
            "refusal": "string",
            "role": "string",
            "tool_call_id": "string",
            "tool_calls": [
                {
                    "function": {
                        "arguments": "string",
                        "name": "string"
                    },
                    "id": "string",
                    "index": 0,
                    "type": "function"
                }
            ]
        }
    ],
    "metadata": {
        "property1": "string",
        "property2": "string"
    },
    "model": "string",
    "n": 0,
    "parallel_tool_calls": null,
    "prediction": null,
    "presence_penalty": null,
    "reasoning_effort": "string",
    "response_format": {
        "json_schema": {
            "description": "string",
            "name": "string",
            "schema": null,
            "strict": true
        },
        "type": "json_object"
    },
    "seed": 0,
    "service_tier": null,
    "stop": [
        "string"
    ],
    "store": true,
    "stream": true,
    "stream_options": null,
    "temperature": null,
    "tool_choice": null,
    "tools": [
        {
            "function": {
                "description": "string",
                "name": "string",
                "parameters": null,
                "strict": true
            },
            "type": "function"
        }
    ],
    "top_logprobs": 0,
    "top_p": null,
    "user": "string"
}'

Response samples

Content type
application/json
{
  • "choices": [
    ],
  • "created": 0,
  • "id": "string",
  • "model": "string",
  • "object": "string",
  • "prompt_filter_results": [
    ],
  • "service_tier": "auto",
  • "system_fingerprint": "string",
  • "usage": {
    }
}

Create embeddings

The embedding API generates a vector representation of a given input that can be easily consumed by machine learning models and algorithms. This request creates an embedding vector representing the input text.

header Parameters
Authorization
required
string
Default: Bearer <Add access token here>

API key for authentication

Request Body schema: application/json
required

Parameters required to generate a request.

dimensions
integer

Dimensions The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models.

encoding_format
string (openai.EmbeddingEncodingFormat)
Enum: "float" "base64"
input
any
input_type
string
model
string (openai.EmbeddingModel)
Enum: "text-similarity-ada-001" "text-similarity-babbage-001" "text-similarity-curie-001" "text-similarity-davinci-001" "text-search-ada-doc-001" "text-search-ada-query-001" "text-search-babbage-doc-001" "text-search-babbage-query-001" "text-search-curie-doc-001" "text-search-curie-query-001" "text-search-davinci-doc-001" "text-search-davinci-query-001" "code-search-ada-code-001" "code-search-ada-text-001" "code-search-babbage-code-001" "code-search-babbage-text-001" "text-embedding-ada-002" "text-embedding-3-small" "text-embedding-3-large"
truncate
string
user
string

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//enterpriseai/v1/embeddings \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <Add access token here>' \
    --header 'Content-Type: application/json' \
    --data '{
    "dimensions": 0,
    "encoding_format": "float",
    "input": null,
    "input_type": "string",
    "model": "text-similarity-ada-001",
    "truncate": "string",
    "user": "string"
}'

Response samples

Content type
application/json
{
  • "data": [
    ],
  • "model": "text-similarity-ada-001",
  • "object": "string",
  • "usage": {
    }
}

Create image generation

The image generation API generates images from a text prompt.

header Parameters
Authorization
required
string
Default: Bearer <Add access token here>

API key for authentication

Request Body schema: application/json
required

Parameters required to create the image generation request

background
string
cfg_scale
number
inference_steps
integer
model
string
moderation
string
n
integer
output_compression
integer
output_format
string
prompt
string
quality
string
response_format
string
seed
integer
size
string
style
string
user
string

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//enterpriseai/v1/images/generations \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <Add access token here>' \
    --header 'Content-Type: application/json' \
    --data '{
    "background": "string",
    "cfg_scale": null,
    "inference_steps": 0,
    "model": "string",
    "moderation": "string",
    "n": 0,
    "output_compression": 0,
    "output_format": "string",
    "prompt": "string",
    "quality": "string",
    "response_format": "string",
    "seed": 0,
    "size": "string",
    "style": "string",
    "user": "string"
}'

Response samples

Content type
application/json
{
  • "artifacts": [
    ],
  • "created": 0,
  • "data": [
    ],
  • "usage": {
    }
}

Create rerank

Rerank API turns user input into text chunks. Every chunk will include the query and a portion of the document text. Chunk size depends on the model.

header Parameters
Authorization
required
string
Default: Bearer <Add access token here>

API key for authentication

Request Body schema: application/json
required

Parameters required to create the rerank request

required
Array of objects (dto.Document)
model
required
string
query
required
string
return_documents
boolean
top_n
integer >= 0
truncate
boolean

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//enterpriseai/v1/rerank \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <Add access token here>' \
    --header 'Content-Type: application/json' \
    --data '{
    "documents": [
        {
            "text": "string"
        }
    ],
    "model": "string",
    "query": "string",
    "return_documents": true,
    "top_n": 0,
    "truncate": true
}'

Response samples

Content type
application/json
{
  • "meta": {
    },
  • "results": [
    ]
}

List models

Lists the currently available models and provides basic information, such as the owner and availability

header Parameters
Authorization
required
string
Default: Bearer <Add access token here>

API key for authentication

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//enterpriseai/v1/models \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <Add access token here>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": [
    ],
  • "object": "string"
}

endpoints

create endpoint

create a new endpoint

Authorizations:
BasicAuth
Request Body schema: application/json
required

new create endpoint request object

acceleratorCount
required
integer
acceleratorProduct
string
object (dto.InferenceEngineArgs)
apiKeys
Array of strings
cpu
required
integer
description
string
engine
required
string (enum.BaseEngine)
Enum: "vllm" "tgi" "nim" "hf-transformers" "custom-model-server"
maxInstances
integer >= 1
memoryInGi
required
integer
minInstances
integer >= 1
modelId
required
string
name
required
string <= 20 characters
platform
required
string (enum.Platform)
Enum: "nvidia-gpu-passthrough" "nvidia-vgpu" "nvidia-mig" "amd-gpu-passthrough" "intel-amx-cpu" "cpu"
quantization
string (enum.Quantization)
Enum: "float8" "float16" "bfloat16"

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/endpoints \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "acceleratorCount": 0,
    "acceleratorProduct": "string",
    "advancedConfig": {
        "hfTransformersArgs": {
            "maxNumTokens": 0
        },
        "tgiArgs": {
            "customArgs": {
                "property1": "string",
                "property2": "string"
            },
            "maxNumTokens": 0
        },
        "vllmArgs": {
            "customArgs": {
                "property1": "string",
                "property2": "string"
            },
            "maxNumTokens": 0
        }
    },
    "apiKeys": [
        "string"
    ],
    "cpu": 0,
    "description": "string",
    "engine": "vllm",
    "maxInstances": 1,
    "memoryInGi": 0,
    "minInstances": 1,
    "modelId": "string",
    "name": "string",
    "platform": "nvidia-gpu-passthrough",
    "quantization": "float8"
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

delete endpoint

delete endpoint by ID

Authorizations:
BasicAuth
path Parameters
endpoint_id
required
string

endpoint ID

query Parameters
force
boolean

force delete

Responses

Request samples

curl --request DELETE \
    --url https://www.nutanix.dev//api/enterpriseai/v1/endpoints/{endpoint_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "msg": "string"
}

getByID

get an endpoint by ID

Authorizations:
BasicAuth
path Parameters
endpoint_id
required
string

endpoint ID

query Parameters
expand
Array of strings

query param to denote what all extra fields to fetch

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/endpoints/{endpoint_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

list API keys

list API keys by endpoint ID

Authorizations:
BasicAuth
path Parameters
endpoint_id
required
string

endpoint ID

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/endpoints/apikeys/{endpoint_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

list endpoints

list endpoints

Authorizations:
BasicAuth
query Parameters
expand
Array of strings

query param to denote what all extra fields to fetch

limit
integer

limit

offset
integer

offset

owner_id
string

owner ID for which the endpoints have to be returned

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/endpoints \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

list instances

get instances for a given endpoint ID

Authorizations:
BasicAuth
path Parameters
endpoint_id
required
string

endpoint ID

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/endpoints/{endpoint_id}/instances \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

search endpoints

search endpoints

Authorizations:
BasicAuth
query Parameters
expand
Array of strings

query param to denote what all extra fields to fetch

Request Body schema: application/json
required

list options object

required
Array of objects (dto.FilterOptions)
limit
required
integer
offset
required
integer >= 0
required
Array of objects (dto.SortOptions)

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/endpoints/search \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "filters": [
        {
            "field": "string",
            "operation": "startsWith",
            "values": [
                "string"
            ]
        }
    ],
    "limit": 0,
    "offset": 0,
    "sort": [
        {
            "field": "string",
            "order": "ASCENDING"
        }
    ]
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

update endpoint

update an existing endpoint by ID

Authorizations:
BasicAuth
path Parameters
endpoint_id
required
string

endpoint ID

Request Body schema: application/json
required

update existing endpoint object

description
string
maxInstances
integer >= 1
minInstances
integer >= 1

Responses

Request samples

curl --request PATCH \
    --url https://www.nutanix.dev//api/enterpriseai/v1/endpoints/{endpoint_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "description": "string",
    "maxInstances": 1,
    "minInstances": 1
}'

Response samples

Content type
application/json
{
  • "msg": "string"
}

update endpoint state

update state of an existing endpoint

Authorizations:
BasicAuth
path Parameters
endpoint_id
required
string

endpoint ID

Request Body schema: application/json
required

update state of existing endpoint object

state
required
string (enum.EndpointStateAction)
Enum: "hibernate" "activate"

Responses

Request samples

curl --request PATCH \
    --url https://www.nutanix.dev//api/enterpriseai/v1/endpoints/state/{endpoint_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "state": "hibernate"
}'

Response samples

Content type
application/json
{
  • "msg": "string"
}

validate endpoint

validate endpoint object

Authorizations:
BasicAuth
query Parameters
endpoint_name
required
string

endpoint name

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/endpoints/validate \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "msg": "string"
}

licenses

add new license

add new license

Authorizations:
BasicAuth
Request Body schema: application/json
required

add license object

Array of objects (dto.License)

Responses

Request samples

curl --request PUT \
    --url https://www.nutanix.dev//api/enterpriseai/v1/licenses \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "licenses": [
        {
            "licenseClusterUuid": "string",
            "licenseKey": "string",
            "meterType": "VCPU"
        }
    ]
}'

Response samples

Content type
application/json
{
  • "msg": "string"
}

list licenses

list licenses

Authorizations:
BasicAuth
query Parameters
expand
Array of strings

query param to denote what all extra fields to fetch

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/licenses \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

validate license

validate license

Authorizations:
BasicAuth
Request Body schema: application/json
required

validate license object

Array of objects (dto.License)

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/licenses/validate \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "licenses": [
        {
            "licenseClusterUuid": "string",
            "licenseKey": "string",
            "meterType": "VCPU"
        }
    ]
}'

Response samples

Content type
application/json
{
  • "msg": "string"
}

logs

download logs

download logs file from pod instances

Authorizations:
BasicAuth
query Parameters
entityType
required
string

Entity type. Examples: endpoint, model

entityId
required
string

Entity ID

instance
string

Instance name. It is required for the endpoint, and it's optional for the model.

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/logs/download \
    --header 'Accept: application/octet-stream' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

get logs

get logs from pod instances

Authorizations:
BasicAuth
query Parameters
entityType
required
string

Entity type (endpoint only). Note: In response filenames, this is in lowercase.

entityId
required
string

Entity ID

instance
required
string

Instance name

tailLines
integer

Number of lines to tail from the end of the logs

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/logs \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

metrics

metrics deployment health

checks health of metrics deployment

Authorizations:
BasicAuth

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/metrics/health \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

query

sends metric request to metrics endpoint

Authorizations:
BasicAuth
Request Body schema: application/json
required

new metric query object

object (dto.ExtraMetricParams)
metricName
required
string (constants.MetricName)
Enum: "requestLatencyAverage" "requestLatencyPercentile99" "requestLatencyPercentile95" "requestLatencyPercentile50" "requestCount" "llmMetricsApiUsageCount" "cpuUsage" "memoryUsage" "diskUsage" "gpuUtil" "gpuCount" "usedGpuCount" "gpuMemoryUsage" "serviceHealth" "endpointStatusCount" "infrastructureSummary" "infrastructureSummaryNode" "timeToFirstTokenAverage" "timeToFirstTokenPercentile99" "timeToFirstTokenPercentile95" "timeToFirstTokenPercentile50" "timePerOutputTokenAverage" "timePerOutputTokenPercentile99" "timePerOutputTokenPercentile95" "timePerOutputTokenPercentile50" "requestThroughput" "inputTokenCount" "outputTokenCount" "outputTokenThroughput" "numRequestsRunning" "numRequestsWaiting" "modelApiUsageCount" "endpointApiUsageCount" "v4ApiUsageCount" "accessControlApiUsageCount" "logsApiUsageCount" "apikeyLastUsedAt"
outputType
required
string (constants.MetricOutputType)
Enum: "timeseries" "table"

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/metrics/query \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "extraMetricParams": {
        "downsamplingInterval": 0,
        "end": 0,
        "filters": {
            "property1": [
                "string"
            ],
            "property2": [
                "string"
            ]
        },
        "groupBy": [
            "endpointName"
        ],
        "limit": 0,
        "sortOrder": "ASC",
        "start": 0
    },
    "metricName": "requestLatencyAverage",
    "outputType": "timeseries"
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

models

create model

create a new model

Authorizations:
BasicAuth
Request Body schema: application/json
required

new create model request object

object (dto.ModelProvider)
name
required
string <= 32 characters
sourceFormat
required
string (enum.SourceModelFormat)
Enum: "hf" "nim"
object (dto.StorageProvider)

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/models \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "modelProvider": {
        "catalogId": "string",
        "customModelDetails": {
            "developer": "string",
            "modelCapabilities": [
                "string"
            ],
            "modelSizeInGB": 0
        }
    },
    "name": "string",
    "sourceFormat": "hf",
    "storageProvider": {
        "huggingFaceHub": {
            "repoId": "string",
            "repoVersion": "string"
        },
        "nfs": {
            "modelPath": "string",
            "serverIp": "string",
            "sharePath": "string"
        },
        "s3": {
            "accessKey": "string",
            "bucket": "string",
            "endpoint": "string",
            "modelPath": "string",
            "secretKey": "string"
        }
    }
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

delete model

delete a model by ID

Authorizations:
BasicAuth
path Parameters
model_id
required
string

model ID

query Parameters
force
boolean

force delete

Responses

Request samples

curl --request DELETE \
    --url https://www.nutanix.dev//api/enterpriseai/v1/models/{model_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "msg": "string"
}

get model capabilities

get model capabilities

Authorizations:
BasicAuth

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/models/capabilities \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

list models

list models

Authorizations:
BasicAuth
query Parameters
name
string

model name

owner_id
string

owner ID for which the models have to be returned

limit
integer

limit

offset
integer

offset

expand
Array of strings

query param to denote what all extra fields to fetch

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/models/list \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

model by ID

get a model by ID

Authorizations:
BasicAuth
path Parameters
model_id
required
string

model ID

query Parameters
expand
Array of strings

query param to denote what all extra fields to fetch

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/models/{model_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

search

search models

Authorizations:
BasicAuth
query Parameters
expand
Array of strings

query param to denote what all extra fields to fetch

Request Body schema: application/json
required

list options object

required
Array of objects (dto.FilterOptions)
limit
required
integer
offset
required
integer >= 0
required
Array of objects (dto.SortOptions)

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/models/search \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "filters": [
        {
            "field": "string",
            "operation": "startsWith",
            "values": [
                "string"
            ]
        }
    ],
    "limit": 0,
    "offset": 0,
    "sort": [
        {
            "field": "string",
            "order": "ASCENDING"
        }
    ]
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

providers

create provider

create a new provider

Authorizations:
BasicAuth
Request Body schema: application/json
required

provider

credential
required
string
hostname
required
string
name
required
string <= 20 characters
port
required
integer <= 65535
prefix
string
schema
required
string (enum.Schema)
Enum: "openai" "gcp" "anthropic"
type
required
string (enum.LLMProviderType)
Enum: "nai" "openai" "anthropic"

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/providers \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "credential": "string",
    "hostname": "string",
    "name": "string",
    "port": 65535,
    "prefix": "string",
    "schema": "openai",
    "type": "nai"
}'

Response samples

Content type
application/json
{
  • "data": "string",
  • "msg": "string"
}

delete provider

delete a provider by ID

Authorizations:
BasicAuth
path Parameters
provider_id
required
string

provider ID

Responses

Request samples

curl --request DELETE \
    --url https://www.nutanix.dev//api/enterpriseai/v1/providers/{provider_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "msg": "string"
}

list providers

list all providers

Authorizations:
BasicAuth

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/providers \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

provider by ID

get a provider by ID

Authorizations:
BasicAuth
path Parameters
provider_id
required
string

provider ID

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/providers/{provider_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

update provider

update a provider by ID

Authorizations:
BasicAuth
path Parameters
provider_id
required
string

provider ID

Request Body schema: application/json
required

provider

credential
string

Responses

Request samples

curl --request PATCH \
    --url https://www.nutanix.dev//api/enterpriseai/v1/providers/{provider_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "credential": "string"
}'

Response samples

Content type
application/json
{
  • "msg": "string"
}

unified-endpoints

create unified endpoint

create a new unified endpoint

Authorizations:
BasicAuth
Request Body schema: application/json
required

new create unified endpoint request object

apiKeys
Array of strings
mode
required
string (enum.UnifiedEndpointMode)
Enum: "loadbalancer" "fallback"
name
required
string <= 20 characters
rateLimitCelExpression
string
object (dto.RateLimitConfig)
required
Array of objects (dto.TargetEndpointConfig) non-empty

Responses

Request samples

curl --request POST \
    --url https://www.nutanix.dev//api/enterpriseai/v1/unified-endpoints \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "apiKeys": [
        "string"
    ],
    "mode": "loadbalancer",
    "name": "string",
    "rateLimitCelExpression": "string",
    "rateLimitConfig": {
        "cost": null,
        "durationUnit": null,
        "value": 0
    },
    "targetEndpointConfigs": [
        {
            "endpointType": "internal",
            "internalEndpointId": "string",
            "priority": 0,
            "providerId": "string",
            "providerModelName": "string",
            "weight": 0
        }
    ]
}'

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

delete

delete a unified endpoint

Authorizations:
BasicAuth
path Parameters
unified_endpoint_id
required
string

unified endpoint ID

Responses

Request samples

curl --request DELETE \
    --url https://www.nutanix.dev//api/enterpriseai/v1/unified-endpoints/{unified_endpoint_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "msg": "string"
}

getByID

get a unified endpoint by ID

Authorizations:
BasicAuth
path Parameters
unified_endpoint_id
required
string

unified endpoint ID

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/unified-endpoints/{unified_endpoint_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

list API rate limits

list all API key rate limit configurations for a unified endpoint

Authorizations:
BasicAuth
path Parameters
unified_endpoint_id
required
string

unified endpoint ID

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/unified-endpoints/{unified_endpoint_id}/ratelimit \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": [
    ],
  • "msg": "string"
}

list unified endpoints

list all unified endpoints

Authorizations:
BasicAuth

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/unified-endpoints \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

update

update a unified endpoint

Authorizations:
BasicAuth
path Parameters
unified_endpoint_id
required
string

unified endpoint ID

Request Body schema: application/json
required

update unified endpoint request object

mode
string (enum.UnifiedEndpointMode)
Enum: "loadbalancer" "fallback"
rateLimitCelExpression
string
object (dto.RateLimitConfig)
Array of objects (dto.TargetEndpointConfig) non-empty

Responses

Request samples

curl --request PUT \
    --url https://www.nutanix.dev//api/enterpriseai/v1/unified-endpoints/{unified_endpoint_id} \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
    --data '{
    "mode": "loadbalancer",
    "rateLimitCelExpression": "string",
    "rateLimitConfig": {
        "cost": null,
        "durationUnit": null,
        "value": 0
    },
    "targetEndpointConfigs": [
        {
            "endpointType": "internal",
            "internalEndpointId": "string",
            "priority": 0,
            "providerId": "string",
            "providerModelName": "string",
            "weight": 0
        }
    ]
}'

Response samples

Content type
application/json
{
  • "msg": "string"
}

update rate limit

update rate limit configuration for API keys on a unified endpoint

Authorizations:
BasicAuth
path Parameters
unified_endpoint_id
required
string

unified endpoint ID

Request Body schema: application/json
required

API key specific rate limit configurations

Array
apiKeyId
required
string
object (dto.RateLimitConfig)

Responses

Request samples

curl --request PUT \
    --url https://www.nutanix.dev//api/enterpriseai/v1/unified-endpoints/{unified_endpoint_id}/ratelimit \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "msg": "string"
}

users

Logged in user details

retrieves user details of the logged in user

Authorizations:
BasicAuth

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/users/me \
    --header 'Accept: application/json' \
    --header 'Authorization: Basic <basic_auth_token>' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}

version

Get version

retrieve the version information of the NAI Inference Server

Responses

Request samples

curl --request GET \
    --url https://www.nutanix.dev//api/enterpriseai/v1/version \
    --header 'Accept: application/json' \
    --header 'Content-Type: application/json' \
     

Response samples

Content type
application/json
{
  • "data": {
    },
  • "msg": "string"
}