OpenAI Compatibility API
Send requests to Chat Completions (text and images), Completions, and Embeddings endpoints.
LM Studio accepts requests on several OpenAI endpoints and returns OpenAI-like response objects.
```
GET  /v1/models
POST /v1/chat/completions
POST /v1/embeddings
POST /v1/completions
```
You can reuse existing OpenAI clients (in Python, JS, C#, etc.) by changing the base URL property to point to your local LM Studio server instead of OpenAI's servers.
Switch the base URL to point to LM Studio. The examples below assume the local server is running on the default port, 1234.
```diff
  from openai import OpenAI

  client = OpenAI(
+   base_url="http://localhost:1234/v1"
  )

  # ... the rest of your code ...
```
```diff
  import OpenAI from 'openai';

  const client = new OpenAI({
+   baseURL: "http://localhost:1234/v1"
  });

  // ... the rest of your code ...
```
```diff
- curl https://api.openai.com/v1/chat/completions \
+ curl http://localhost:1234/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
-     "model": "gpt-4o-mini",
+     "model": "use the model identifier from LM Studio here",
      "messages": [{"role": "user", "content": "Say this is a test!"}],
      "temperature": 0.7
    }'
```
GET /v1/models

List the models currently loaded in LM Studio.

Example request:

```bash
curl http://localhost:1234/v1/models
```
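If you are already using the openai Python client, the same endpoint is available through it. A minimal sketch (the api_key is a dummy value, as in the examples below; LM Studio's local server does not validate it):

```python
from openai import OpenAI

# Point the client at the local LM Studio server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# /v1/models is exposed through the client's models API
for model in client.models.list():
    print(model.id)
```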
POST /v1/chat/completions

Send a chat history and receive the model's response. Keep a terminal open with `lms log stream` to see exactly what input the model receives.

Example request:

```python
# Example: reuse your existing OpenAI setup
from openai import OpenAI

# Point to the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
    model="model-identifier",
    messages=[
        {"role": "system", "content": "Always answer in rhymes."},
        {"role": "user", "content": "Introduce yourself."}
    ],
    temperature=0.7,
)

print(completion.choices[0].message)
```
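Streaming works the same way as with OpenAI's API (`stream` is among the supported payload parameters listed below). A minimal sketch, reusing the same placeholder model identifier:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# stream=True yields chunks as the model generates tokens
stream = client.chat.completions.create(
    model="model-identifier",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    stream=True,
)

for chunk in stream:
    # Each chunk carries an incremental delta of the assistant's message
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```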
POST /v1/embeddings

Example request:

```python
# Make sure to `pip install openai` first
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

def get_embedding(text, model="model-identifier"):
    text = text.replace("\n", " ")
    return client.embeddings.create(input=[text], model=model).data[0].embedding

print(get_embedding("Once upon a time, there was a cat."))
```
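A common follow-up is comparing two embeddings. A small sketch of cosine similarity using the `get_embedding` helper defined above (plain Python, no extra dependencies):

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

v1 = get_embedding("Once upon a time, there was a cat.")
v2 = get_embedding("A tale about a little kitten.")
print(cosine_similarity(v1, v2))  # closer to 1.0 means more semantically similar
```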
POST /v1/completions

This OpenAI-style endpoint is no longer supported by OpenAI, but LM Studio continues to support it. Using it with chat-tuned models may result in unexpected behavior, such as extraneous role tokens being emitted by the model. For best results, use a base model.
Tip: keep a terminal open with `lms log stream` to see what input the model receives.
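Example request, as a minimal sketch using the openai Python client (the model identifier, prompt, and parameter values are placeholders):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# Legacy completions take a raw prompt instead of a messages array
completion = client.completions.create(
    model="model-identifier",
    prompt="The capital of France is",
    temperature=0.7,
    max_tokens=10,
)

print(completion.choices[0].text)
```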
Supported payload parameters

For an explanation of each parameter, see https://platform.openai.com/docs/api-reference/chat/create.

`model`, `top_p`, `top_k`, `messages`, `temperature`, `max_tokens`, `stream`, `stop`, `presence_penalty`, `frequency_penalty`, `logit_bias`, `repeat_penalty`, `seed`
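Note that `top_k` and `repeat_penalty` are not part of the OpenAI client's method signature. With the openai Python client they can be passed through `extra_body`, which merges extra fields into the request payload. A sketch under that assumption:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
    model="model-identifier",
    messages=[{"role": "user", "content": "Name three colors."}],
    temperature=0.7,
    # top_k and repeat_penalty are not OpenAI parameters, so they are
    # merged into the JSON payload via extra_body
    extra_body={"top_k": 40, "repeat_penalty": 1.1},
)

print(completion.choices[0].message.content)
```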
Chat with other LM Studio developers, discuss LLMs, hardware, and more on the LM Studio Discord server.