I don’t know why Llama 3.2 3B runs so slowly on the iGPU: I only get 2 tokens per second on openSUSE Tumbleweed 20250626 with KDE 6.4.1 on Wayland.
Oof, I think I found out why. I was comparing it to another machine running Ollama that also has only an iGPU, but I just remembered that Ollama doesn’t fully utilize the iGPU there (I believe I saw that in the output of the ollama ps command), only a portion of it. LM Studio, on the other hand, offloads everything to the iGPU, which actually makes it run slower, with fewer tokens per second.
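For anyone wanting to check the same thing: as far as I know, the PROCESSOR column of ollama ps shows values like "100% GPU", "100% CPU", or "48%/52% CPU/GPU". Here is a rough Python sketch that turns such a value into a CPU/GPU split. The function name parse_processor is made up for illustration, and the accepted formats are an assumption based on those three examples, so treat it as a starting point rather than a reference parser.

```python
import re


def parse_processor(value: str) -> dict:
    """Parse an `ollama ps` PROCESSOR value into CPU/GPU percentages.

    Assumed input formats (hypothetical helper, not part of Ollama):
      "100% GPU", "100% CPU", "48%/52% CPU/GPU"
    """
    value = value.strip()

    # Split case, e.g. "48%/52% CPU/GPU": first number is CPU, second is GPU.
    split = re.fullmatch(r"(\d+)%/(\d+)%\s+CPU/GPU", value)
    if split:
        return {"cpu": int(split.group(1)), "gpu": int(split.group(2))}

    # Single-device case, e.g. "100% GPU" or "100% CPU".
    single = re.fullmatch(r"(\d+)%\s+(CPU|GPU)", value)
    if single:
        pct = int(single.group(1))
        device = single.group(2).lower()
        other = "gpu" if device == "cpu" else "cpu"
        return {device: pct, other: 100 - pct}

    raise ValueError(f"unrecognized PROCESSOR value: {value!r}")


if __name__ == "__main__":
    # A partial offload: only part of the model sits on the (i)GPU.
    print(parse_processor("48%/52% CPU/GPU"))
    # A full offload, which is what LM Studio was doing in my case.
    print(parse_processor("100% GPU"))
```

If the GPU share is below 100, the model is only partially offloaded, which matches what I remembered seeing on the Ollama machine.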