LM Studio 0.3.15: RTX 50-series GPUs and improved tool use in the API
LM Studio 0.3.15 is available now as a stable release. This version includes support for NVIDIA RTX 50-series GPUs (CUDA 12), UI touch-ups including a new system prompt editor UI, improved API support for tool use (the `tool_choice` parameter), and a new option to log each generated fragment to the API server logs.
LM Studio now supports RTX 50-series GPUs (CUDA 12.8) with our llama.cpp engines on Windows and Linux. This change makes first-time model load times on RTX 50-series GPUs dramatically faster. For machines with RTX 50-series GPUs, LM Studio will automatically upgrade to CUDA 12 if the NVIDIA driver version is compatible.
The minimum driver versions are:

- Windows: 551.61 or newer
- Linux: 550.54.14 or newer
If you have an RTX 50-series GPU and a compatible driver, LM Studio will automatically upgrade to CUDA 12. If your driver is not compatible, LM Studio will continue to use CUDA 11. Manage this in Ctrl + Shift + R.
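You can check your system's compatibility by running `nvidia-smi` in a terminal. For a scripted check, here's a minimal Python sketch (not an LM Studio tool; it assumes `nvidia-smi` is on your PATH) that compares the installed driver version against the minimums above:

```python
import platform
import subprocess

# Minimum driver versions for CUDA 12 in LM Studio 0.3.15 (Windows/Linux only)
MINIMUMS = {"Windows": "551.61", "Linux": "550.54.14"}

def installed_driver_version() -> str:
    # nvidia-smi prints one driver version per GPU with these query flags
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        text=True,
    )
    return out.strip().splitlines()[0]

def cuda12_ready() -> bool:
    minimum = MINIMUMS[platform.system()]
    # Compare dotted versions numerically, e.g. (555, 85) >= (551, 61)
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(installed_driver_version()) >= parse(minimum)

if __name__ == "__main__":
    print("CUDA 12 ready:", cuda12_ready())
```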
System prompts are a powerful way to customize the behavior of your models. They can be just a few words or sometimes multiple pages long. LM Studio 0.3.15 introduces a much larger visual space for editing longer prompts. You can still use the mini prompt editor in the sidebar.
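System prompts aren't only a UI concern: over the local API they're just a `system` message in an OpenAI-style request. A minimal sketch, assuming the server is running on its default port (1234), the `openai` Python package is installed, and a model is loaded (the model name below is a placeholder; the API key can be any string):

```python
from openai import OpenAI

# Point the OpenAI client at LM Studio's local server (default port 1234)
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="your-loaded-model",  # placeholder: use whatever model you have loaded
    messages=[
        # The system prompt can be a few words or several pages long
        {"role": "system", "content": "You are a terse assistant. Answer in one sentence."},
        {"role": "user", "content": "What does the tool_choice parameter do?"},
    ],
)
print(response.choices[0].message.content)
```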
The OpenAI-like REST API now supports the `tool_choice` parameter, which allows you to control how the model uses tools. The `tool_choice` parameter can take three values:
"tool_choice": "none"
- Model will not call any tools"tool_choice": "auto"
- Model decides whether or not to call tools"tool_choice": "required"
- Forces model to only output tools (llama.cpp engines only)We've also fixed a bug in LM Studio's OpenAI-compatibility mode where the chunk "finish_reason" was not set to "tool_calls" when appropriate.
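Here's a sketch of using `tool_choice` against the local server via the OpenAI Python client. The tool definition and model name are placeholders, and the server is assumed to be at its default address, http://localhost:1234/v1:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# A placeholder tool definition for illustration
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="your-loaded-model",  # placeholder: use whatever model you have loaded
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
    # "auto": model decides; "none": never call tools;
    # "required": only output tool calls (llama.cpp engines only)
    tool_choice="required",
)

choice = response.choices[0]
# With the 0.3.15 fix, finish_reason is "tool_calls" when tools were called
print(choice.finish_reason)
print(choice.message.tool_calls)
```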
Presets are a convenient way to package together system prompts and model parameters.
Starting in LM Studio 0.3.15, you can share your presets with the community and download presets made by other users via the web ☁️. You can also like and fork presets made by others.
Enable this feature in Settings > General > Enable publishing and downloading presets. Once turned on, you'll find a new "Publish" button when you right-click on a preset in the sidebar. This will allow you to publish your preset to the community.
Snag your username at https://lmstudio.ai/login and start sharing your presets! You do not need an account to download presets, just to publish.
This feature is currently in Preview, and we are looking for feedback from everyone. If you have any suggestions or issues, please let us know: [email protected].
**0.3.15 - Full Change Log**

**Build 11**

- Llama 4 prompt template fixes to improve tool call reliability

**Build 10**

- Preview: Add the ability to publish and download presets from the community (head to Settings to enable)
- Add `tool_choice` parameter support to OpenAI-like REST API
  - `"tool_choice": "none"` - Model will not call any tools
  - `"tool_choice": "auto"` - Model decides whether or not to call tools
  - `"tool_choice": "required"` - Forces model to only output tools (llama.cpp engines only)
- Added an option to log each generated fragment to API server logs
- Fixed the erroneous "Client disconnected. Stopping generation..." message when using the API server
- Fixed a front end error when using the preset selection in the developer page
- Fix for GLM prompt template
- Fix Llama 4 prompt template bug "Unknown ArrayValue filter: trim" when using tools

**Build 9**

- Fix: Ensure OpenAI-like REST API chunk "finish_reason" is "tool_calls" when appropriate
- Fixes "N/A" token count in system prompt editor when model is loaded

**Build 8**

- Experimental feature behind flag in Chat Appearance: smooth autoscroll latest chat message to top

**Build 7**

- [CUDA12] Fix incorrect VRAM capacity showing on Hardware page on some machines
- Fix Llama 4 crashes when using GPU settings: priority order, limit offload to dedicated GPU memory
- [GGUF] Fixed bug where top-k sampling parameter could not be set to 0
- [MLX] Removed the checkbox from top-k sampling parameter

**Build 6**

- Chat terminal message styling updates
- Conversation font scale introduced in chat Appearance tab
- Conversation font weight introduced in chat Appearance tab

**Build 5**

- [CUDA] CUDA 12 engine auto-upgrade if driver is compatible and *any* GPU is 50-series and above
- [MLX] Add top-k sampler

**Build 4**

- New: CUDA 12 support in LM Studio's llama.cpp engines (Windows/Linux)
  - Dramatically faster first-time model load times on RTX 50-series GPUs
  - Initial compatibility requirements:
    - NVIDIA driver version:
      - Windows: 551.61 or newer
      - Linux: 550.54.14 or newer
    - At least one GPU of the following:
      - GeForce RTX 5090, RTX 5080, RTX 5070 Ti, or RTX 5070
      - Datacenter GPU with Hopper or Blackwell micro-architecture
  - App will automatically upgrade you if your machine is compatible
  - Check your system compatibility by running `nvidia-smi` in terminal
- Added support for sorting models by last load time in the model loader (the new default)
- Adds new system prompt editor UI
- Adds a toggle to hide/show advanced settings while loading models
- Fix Cogito jinja parsing error "Unexpected character: ~"
- Fixes downloads pane resize bug

**Build 3**

- Fixed lms CLI sometimes not initializing properly on MacOS

**Build 2**

- Fixes bug where the chat sidebar labels would overflow
- Fixes bug where the downloads pane would open at wrong position

**Build 1**

- UI touchups:
  - New and improved chat input box
  - Neatened up app action bar layout
  - Slimmer app sidebar
- Chat sidebar segments: Context and Model