LM Studio 0.3.2

2024-08-27

Release Notes

What's new in 0.3.2

  • New: Ability to pin models to the top is back!
    • Right-click on a model in My Models and select "Pin to top" to pin it to the top of the list.
  • Chat migration dialog now appears in the chat sidebar.
    • You can migrate your chats from pre-0.3.0 versions there.
    • As of v0.3.1, system prompts are migrated as well.
    • Your old chats are NOT deleted.
  • The downloads button no longer shows a numbered badge when there are no downloads
  • Added a button to collapse the FAQ sidebar in the Discover tab
  • Reduced the default context size from 8K to 4K tokens to alleviate out-of-memory issues
    • You can still configure any context size you want in the model load settings
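For a rough sense of why halving the default context helps with memory: the KV cache grows linearly with context length, so 8K context needs twice the cache of 4K. A minimal sketch of the arithmetic, using illustrative Llama-style model dimensions (32 layers, 8 KV heads, head dim 128 — assumptions for this example, not LM Studio's actual defaults):

```python
# Rough KV-cache size estimate: memory grows linearly with context length.
# Model dimensions below are illustrative (Llama-3-8B-like assumptions),
# not tied to any specific model LM Studio ships.

N_LAYERS = 32      # transformer layers (assumed)
N_KV_HEADS = 8     # key/value heads (assumed, grouped-query attention)
HEAD_DIM = 128     # dimension per head (assumed)
FP16_BYTES = 2     # bytes per cache element at FP16

def kv_cache_bytes(n_ctx: int, bytes_per_elt: float = FP16_BYTES) -> float:
    # 2x for the separate K and V tensors
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * n_ctx * bytes_per_elt

gib = 1024 ** 3
print(f"4K context: {kv_cache_bytes(4096) / gib:.2f} GiB")  # 0.50 GiB
print(f"8K context: {kv_cache_bytes(8192) / gib:.2f} GiB")  # 1.00 GiB
```

With these assumed dimensions, dropping the default from 8K to 4K halves the KV cache from about 1 GiB to 0.5 GiB; larger models scale the same way.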
  • Added a warning next to the Flash Attention option in model load settings
    • Flash Attention is experimental and may not be suitable for all models
  • Updated the bundled llama.cpp engine to 3246fe84d78c8ccccd4291132809236ef477e9ea (Aug 27)

Bug Fixes

  • Bug fix: My Models model size aggregation was incorrect when you had multi-part model files
  • Bug fix: (Linux) RAG would fail due to a missing bundled embedding model
  • Bug fix: Flash Attention - KV cache quantization is back to FP16 by default
    • In 0.3.0, both the K and V caches were set to Q8, which introduced large latencies in some cases
      • You might notice increased memory consumption when FA is ON compared with 0.3.1, but on par with 0.2.31
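The memory difference comes from the per-element size of the cache. A back-of-the-envelope comparison, assuming llama.cpp's Q8_0 block layout (32 int8 values plus one FP16 scale, i.e. 34 bytes per block of 32 — an assumption about the format used here):

```python
# Bytes per cached element: FP16 vs Q8_0.
# Q8_0 layout assumed: blocks of 32 int8 values + one fp16 scale
# = 34 bytes per 32 elements.
FP16_BPE = 2.0
Q8_0_BPE = 34 / 32          # = 1.0625 bytes per element

# Reverting K and V to FP16 roughly doubles the per-element cache size
# relative to Q8_0 -- hence the higher memory use with FA on.
growth = FP16_BPE / Q8_0_BPE
print(f"FP16 cache is ~{growth:.2f}x the size of a Q8_0 cache")
```

So the FP16 default trades roughly 1.9x the KV-cache memory for the latency regression being fixed, matching the 0.2.31 behavior.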
  • Bug fix: On some setups the app would hang at startup (fix + mitigation)
  • Bug fix: Fixed an issue where the downloads panel could be dragged over the app's top bar
  • Bug fix: Fixed typos in built-in code snippets (server tab)

Even More