Generate Embeddings
Generate embeddings from a model.
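As a minimal sketch of what an embedding request involves, the snippet below builds the JSON body for Ollama's `/api/embed` REST endpoint directly; the server URL and the model name `nomic-embed-text` are illustrative assumptions, not part of this document.

```java
// Sketch: request embeddings from Ollama's /api/embed REST endpoint.
// Assumes an Ollama server at http://localhost:11434 with the
// "nomic-embed-text" model pulled; both names are illustrative.
public class EmbedExample {
    static String buildBody(String model, String input) {
        return String.format("{\"model\":\"%s\",\"input\":\"%s\"}", model, input);
    }

    public static void main(String[] args) {
        // POST this body to http://localhost:11434/api/embed;
        // the response carries an "embeddings" array of float vectors.
        System.out.println(buildBody("nomic-embed-text", "Why is the sky blue?"));
    }
}
```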
Generate
This API lets you ask questions to an LLM and receive the complete answer in a synchronous way.
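A synchronous call maps onto a single blocking request to Ollama's `/api/generate` endpoint. The sketch below shows this with plain `java.net.http`; the server URL and model name are assumptions for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: a synchronous call to Ollama's /api/generate endpoint.
// Server URL and model name are illustrative assumptions.
public class GenerateExample {
    static String buildBody(String model, String prompt) {
        // "stream": false -> the server replies with one JSON object
        // whose "response" field holds the complete answer.
        return String.format(
                "{\"model\":\"%s\",\"prompt\":\"%s\",\"stream\":false}", model, prompt);
    }

    public static void main(String[] args) throws Exception {
        String body = buildBody("llama3", "Why is the sky blue?");
        HttpRequest req = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:11434/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        // send(...) blocks until the full response has arrived.
        HttpResponse<String> res = HttpClient.newHttpClient()
                .send(req, HttpResponse.BodyHandlers.ofString());
        System.out.println(res.body());
    }
}
```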
Generate with Thinking
This API lets you generate responses from an LLM while retrieving the model's "thinking" process separately from the final answer. The "thinking" tokens capture the model's internal reasoning or planning before it produces the actual response, which is useful for debugging, transparency, or simply understanding how the model arrives at its answers.
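In the raw Ollama REST API this corresponds to setting `"think": true` on the request (supported for reasoning-capable models), after which the reply carries a separate `thinking` field alongside `response`. A sketch, with an assumed model name:

```java
// Sketch: enabling "thinking" on /api/generate. With "think": true
// (for reasoning-capable models), the reply separates the internal
// reasoning ("thinking") from the final answer ("response").
public class ThinkingExample {
    static String buildBody(String model, String prompt, boolean think) {
        return String.format(
                "{\"model\":\"%s\",\"prompt\":\"%s\",\"think\":%b,\"stream\":false}",
                model, prompt, think);
    }

    public static void main(String[] args) {
        // POST to http://localhost:11434/api/generate (URL and model
        // name are illustrative assumptions).
        System.out.println(buildBody("deepseek-r1", "Solve 17 * 23.", true));
    }
}
```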
Generate with Images
This API lets you ask questions along with image files to vision-capable LLMs.
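Under the hood, images travel as base64-encoded strings in an `images` array next to the prompt. The sketch below shows this against the raw `/api/generate` endpoint; the model name and file path are illustrative assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Sketch: /api/generate with an image. Vision models accept an
// "images" array of base64-encoded files alongside the prompt.
// Model name and file path are illustrative assumptions.
public class ImageExample {
    static String buildBody(String model, String prompt, String imageB64) {
        return String.format(
                "{\"model\":\"%s\",\"prompt\":\"%s\",\"images\":[\"%s\"],\"stream\":false}",
                model, prompt, imageB64);
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = Files.readAllBytes(Path.of("photo.jpg"));
        String b64 = Base64.getEncoder().encodeToString(bytes);
        // POST this body to http://localhost:11434/api/generate.
        System.out.println(buildBody("llava", "What is in this picture?", b64));
    }
}
```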
Generate with Tools
This API lets you perform tool/function calling using LLMs in a synchronous way.
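Tool calling starts with declaring the tools as JSON schemas the model may choose to invoke. The sketch below builds one such declaration for the raw Ollama REST API, where the `tools` array is passed to the chat endpoint; the function name, parameters, and model are illustrative assumptions.

```java
// Sketch: declaring a tool the model may call. The function name,
// its parameters, and the model are illustrative assumptions; in
// Ollama's REST API the "tools" array is sent to /api/chat.
public class ToolDefExample {
    static String weatherTool() {
        return "{\"type\":\"function\",\"function\":{"
                + "\"name\":\"get_weather\","
                + "\"description\":\"Get the current weather for a city\","
                + "\"parameters\":{\"type\":\"object\",\"properties\":{"
                + "\"city\":{\"type\":\"string\"}},\"required\":[\"city\"]}}}";
    }

    public static void main(String[] args) {
        String body = "{\"model\":\"llama3.1\",\"stream\":false,"
                + "\"messages\":[{\"role\":\"user\",\"content\":\"Weather in Paris?\"}],"
                + "\"tools\":[" + weatherTool() + "]}";
        // POST to http://localhost:11434/api/chat; if the model opts to
        // call the tool, the reply's message holds a "tool_calls" array.
        System.out.println(body);
    }
}
```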
Generate (Async)
Generate a response from a model asynchronously, without blocking the calling thread.
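The asynchronous variant can be sketched with `HttpClient.sendAsync`, which returns a `CompletableFuture` so the caller's thread stays free while the model works; the server URL and model name are, again, assumptions.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

// Sketch: the same /api/generate call issued asynchronously.
// sendAsync returns a CompletableFuture, so the calling thread is
// free while the model produces its answer. Names are assumptions.
public class AsyncGenerateExample {
    static HttpRequest buildRequest(String body) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:11434/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        String body = "{\"model\":\"llama3\",\"prompt\":\"Tell me a joke.\",\"stream\":false}";
        CompletableFuture<String> answer = HttpClient.newHttpClient()
                .sendAsync(buildRequest(body), HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);
        // ... do other work here while the model runs ...
        answer.thenAccept(System.out::println).join(); // block only when needed
    }
}
```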
Chat
This API lets you create a conversation with LLMs. Using this API enables you to ask questions to the model including the history of previous messages in the conversation.
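Unlike single-shot generation, a chat request carries the whole conversation so far as a `messages` array, so earlier turns give the model context for the next answer. A sketch against the raw `/api/chat` endpoint, with illustrative model and message content:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a /api/chat request. The request carries the conversation
// history as a "messages" array of {role, content} objects, so the
// model can use earlier turns as context. Names are assumptions.
public class ChatExample {
    static String message(String role, String content) {
        return String.format("{\"role\":\"%s\",\"content\":\"%s\"}", role, content);
    }

    static String buildBody(String model, List<String> messages) {
        return String.format("{\"model\":\"%s\",\"stream\":false,\"messages\":[%s]}",
                model, String.join(",", messages));
    }

    public static void main(String[] args) {
        List<String> history = new ArrayList<>();
        history.add(message("system", "You are a concise assistant."));
        history.add(message("user", "What is the capital of France?"));
        history.add(message("assistant", "Paris."));
        history.add(message("user", "And its population?")); // relies on history
        // POST this body to http://localhost:11434/api/chat.
        System.out.println(buildBody("llama3", history));
    }
}
```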
Chat with Thinking
This API lets you generate responses from an LLM while retrieving the model's "thinking" process separately from the final answer.
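On the raw chat endpoint this again maps onto `"think": true`, after which the reply's message separates `thinking` (internal reasoning) from `content` (the final answer). A sketch, with an assumed model name:

```java
// Sketch: enabling thinking on /api/chat. With "think": true the
// reply's message separates "thinking" (internal reasoning) from
// "content" (the final answer). Model name is an assumption.
public class ChatThinkingExample {
    static String buildBody(String model, String userMessage) {
        return String.format(
                "{\"model\":\"%s\",\"think\":true,\"stream\":false,"
                        + "\"messages\":[{\"role\":\"user\",\"content\":\"%s\"}]}",
                model, userMessage);
    }

    public static void main(String[] args) {
        // POST to http://localhost:11434/api/chat, then read
        // message.thinking and message.content from the reply.
        System.out.println(buildBody("deepseek-r1", "Is 1001 prime?"));
    }
}
```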
Chat with Tools
Using Tools in Chat
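Tool use in chat is a round trip: send the user message plus tool definitions; if the reply contains `tool_calls`, run the named function locally; then append its output as a message with role `"tool"` and ask the model again for the final answer. The sketch below shows the piece this adds over plain chat; the tool name and result are illustrative assumptions.

```java
// Sketch of the tool-calling round trip on /api/chat:
//  1. send the user message plus tool definitions;
//  2. if the reply's message has "tool_calls", run the named
//     function locally with the given arguments;
//  3. append the result as a role "tool" message and send the
//     conversation again to get the final answer.
// The tool name and result below are illustrative assumptions.
public class ChatToolsFlow {
    // Step 3: wrap a local function's result as a "tool" message.
    static String toolResultMessage(String content) {
        return String.format("{\"role\":\"tool\",\"content\":\"%s\"}", content);
    }

    public static void main(String[] args) {
        // A first reply (abridged) might contain:
        //   "message": {"tool_calls":[{"function":{"name":"get_weather",
        //                "arguments":{"city":"Paris"}}}]}
        String localResult = "18 degrees, clear"; // from our own get_weather()
        // Appended to the message history before the second request:
        System.out.println(toolResultMessage(localResult));
    }
}
```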
Chat with MCP Tools
This is an experimental feature and is subject to change in order to improve the usage experience. Contributions are welcome.
Custom Roles
Lets you manage custom roles (in addition to the base roles) for chat interactions with the models.