[LM Studio AppImage] 2 tokens per second on Meteor Lake U-series iGPU with Llama 3.2

I can’t figure out why Llama 3.2 3B runs so slowly on the iGPU: I’m getting only 2 tokens per second. This is on openSUSE Tumbleweed 20250626 with KDE Plasma 6.4.1 on Wayland.

Oof, I think I figured out why. I was comparing it to another machine running Ollama that also has only an iGPU, but I just remembered that Ollama doesn’t fully use the iGPU there, only a portion of it (I believe I saw that in the output of the ollama ps command). LM Studio, on the other hand, offloads everything to the iGPU, which actually makes it run slower, with fewer tokens per second.
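
For anyone who wants to make the same comparison, here is a rough Python sketch for reading the CPU/GPU split. It assumes the PROCESSOR column of ollama ps prints values shaped like "100% GPU" or "48%/52% CPU/GPU"; that format is from memory and may differ between Ollama versions. The function name parse_processor_split is my own helper, not part of Ollama or LM Studio.

```python
import re


def parse_processor_split(processor: str) -> dict[str, int]:
    """Parse a PROCESSOR-style value into a {device: percent} mapping.

    Assumed input shapes (not verified against every Ollama version):
      "100% GPU"          -> {"GPU": 100}
      "48%/52% CPU/GPU"   -> {"CPU": 48, "GPU": 52}
    """
    text = processor.strip()

    # Single-device form, e.g. "100% GPU"
    single = re.fullmatch(r"(\d+)%\s+(CPU|GPU)", text)
    if single:
        return {single.group(2): int(single.group(1))}

    # Split form, e.g. "48%/52% CPU/GPU": percentages and device names
    # are listed in the same order.
    split = re.fullmatch(r"(\d+)%/(\d+)%\s+(CPU|GPU)/(CPU|GPU)", text)
    if split:
        return {
            split.group(3): int(split.group(1)),
            split.group(4): int(split.group(2)),
        }

    raise ValueError(f"unrecognized processor value: {processor!r}")


if __name__ == "__main__":
    # Hypothetical values, only to show the shape of the result.
    print(parse_processor_split("48%/52% CPU/GPU"))
    print(parse_processor_split("100% GPU"))
```

If the Ollama machine reports a split rather than a full GPU load, the two setups are not offloading the same amount, so the tokens-per-second numbers are not directly comparable. Lowering the GPU offload setting in LM Studio to a similar share would be the fairer test.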
