Structured Output
You can enforce a particular response format from an LLM by providing a JSON schema to the /v1/chat/completions endpoint, via LM Studio's REST API (or via any OpenAI client).
To use LM Studio programmatically from your own code, run LM Studio as a local server.
You can turn on the server from the "Developer" tab in LM Studio, or via the lms CLI.
Install lms by running npx lmstudio install-cli
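For example, a minimal terminal session might look like the sketch below (it assumes Node.js is available for npx and that the server listens on the default port, 1234):

```bash
# Install the lms CLI (ships with LM Studio)
npx lmstudio install-cli

# Start the local API server (http://localhost:1234 by default)
lms server start
```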
This will allow you to interact with LM Studio via an OpenAI-like REST API. For an intro to LM Studio's OpenAI-like API, see Running LM Studio as a server.
The API supports structured JSON outputs through the /v1/chat/completions endpoint when given a JSON schema. Providing a schema will cause the LLM to respond with valid JSON that conforms to it.
It follows the same format as OpenAI's recently announced Structured Output API and is expected to work via the OpenAI client SDKs.
Example using curl
This example demonstrates a structured output request using the curl utility.
To run this example on Mac or Linux, use any terminal. On Windows, use Git Bash.
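A request might look like the sketch below. The model identifier, the joke_response schema, and the default port 1234 are illustrative assumptions; substitute the identifier of a model you actually have loaded:

```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-identifier",
    "messages": [
      { "role": "user", "content": "Tell me a joke." }
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "joke_response",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "joke": { "type": "string" }
          },
          "required": ["joke"]
        }
      }
    },
    "temperature": 0.7,
    "max_tokens": 100,
    "stream": false
  }'
```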
All parameters recognized by /v1/chat/completions will be honored, and the JSON schema should be provided in the json_schema field of response_format.
The result will be provided as a string in the typical response field, choices[0].message.content, and will need to be parsed into a JSON object.
Example using python
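A rough equivalent with the OpenAI Python client is sketched below. The model identifier, the joke_response schema, and the default port 1234 are illustrative assumptions, and the api_key value is only a placeholder for the client library:

```python
import json
from openai import OpenAI

# Point the OpenAI client at LM Studio's local server
# (assumes the default port, 1234; the api_key is a placeholder).
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# JSON schema describing the shape of the desired response
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "joke_response",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "joke": {"type": "string"},
            },
            "required": ["joke"],
        },
    },
}

response = client.chat.completions.create(
    model="your-model-identifier",  # replace with a model you have loaded
    messages=[{"role": "user", "content": "Tell me a joke."}],
    response_format=response_format,
)

# The structured output arrives as a JSON string in
# choices[0].message.content and must be parsed before use.
result = json.loads(response.choices[0].message.content)
print(result["joke"])
```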
Important: Not all models are capable of structured output, particularly LLMs below 7B parameters.
Check the model card README if you are unsure whether the model supports structured output.
GGUF models: use llama.cpp's grammar-based sampling APIs.
MLX models: use Outlines.
The MLX implementation is available on GitHub: lmstudio-ai/mlx-engine.