DeepSeek AI can run on openSUSE with 16 GB of RAM

Dear all!

What HW do you have for running local AI? A dedicated GPU/NPU?
How fast is your current setup?

I have this:

~> inxi -v 1
System:
  Host: tulicube Kernel: 6.4.0-150600.23.33-default arch: x86_64 bits: 64
  Desktop: Xfce v: 4.18.1 Distro: openSUSE Leap 15.6
CPU:
  Info: quad core Intel Core i3-9100 [MCP] speed (MHz): avg: 800
    min/max: 800/4200
Graphics:
  Device-1: Intel CoffeeLake-S GT2 [UHD Graphics 630] driver: i915 v: kernel
  Display: x11 server: X.org v: 1.21.1.11 driver: X: loaded: modesetting
    unloaded: fbdev,vesa dri: iris gpu: i915 resolution: 1920x1080~60Hz
  API: OpenGL v: 4.6 vendor: intel mesa v: 23.3.4 renderer: Mesa Intel UHD
    Graphics 630 (CFL GT2)
  Info: Tools: api: glxinfo de: xfce4-display-settings gpu: gputop,
    intel_gpu_top, lsgpu x11: xprop,xrandr
Drives:
  Local Storage: total: 476.94 GiB used: 270.22 GiB (56.7%)
Info:
  Memory: total: 16 GiB available: 15.47 GiB used: 3.49 GiB (22.5%)
  Processes: 245 Uptime: 5h 23m Shell: Bash inxi: 3.3.37

and my LM Studio (basic install) is pretty slow, almost unusable! (For plain office use, my system is fine.)

I’m using Chatbox + Ollama and have downloaded all the DeepSeek models.
r1:70b can be a bit slow because it is big :smiley:
The other models run just fine.
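Side note for anyone following along: Ollama itself can show you what is already downloaded (standard Ollama CLI; the model name in the remove example is just that, an example):

# list locally downloaded models with their sizes
ollama list
# remove one you no longer need
ollama rm deepseek-r1:70b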

System:
  Host: ... Kernel: 6.4.0-150600.23.33-default arch: x86_64 bits: 64
    Console: pty pts/1 Distro: openSUSE Leap 15.6
CPU:
  Info: 2x 18-core Intel Xeon Gold 6154 [MT MCP SMP] 
Graphics:
  Device-1: NVIDIA GP107GL [Quadro P1000] driver: nvidia v: 570.86.16
  Display: server: X.Org v: 1.21.1.11 with: Xwayland v: 24.1.1 driver: X:
    loaded: nvidia gpu: nvidia,nvidia-nvswitch resolution: 1: 2560x1440
    2: 2560x1440
  API: OpenGL v: 4.6.0 NVIDIA 570.86.16 renderer: Quadro P1000/PCIe/SSE2
Info:
  Memory: available: 187.51 GiB  used: 7.14 GiB (3.8%) 

I need to rebuild my setup… but I was using Kubernetes and the NVIDIA container runtime, running Ollama and Open WebUI.
Low-power CPU and RAM, but with an NVIDIA Tesla P4 added for compute…



Only took seconds to run queries…
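If someone wants to reproduce a similar GPU setup without full Kubernetes, both projects document a plain Docker route. A sketch based on the documented Docker commands (requires the NVIDIA container toolkit; ports and volume names are the documented defaults):

# Ollama with GPU access
docker run -d --gpus=all -v ollama:/root/.ollama \
  -p 11434:11434 --name ollama ollama/ollama

# Open WebUI, pointed at the Ollama container on the host
docker run -d -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data --name open-webui \
  ghcr.io/open-webui/open-webui:main

The UI is then reachable on http://localhost:3000.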

@ru1marante How did you do the download? I only find “use API mode”. (Maybe I’m just not seeing it, but so far I really don’t get it…)

@malcolmlewis @ru1marante

Well, I just have the i3-9100 AND no dedicated graphics at all. Therefore, your systems are much more powerful than mine. I wish you “Happy using!”; here it’s just not possible.

Have a look at the documentation for how to download models.

so, step by step:

with your normal user console:
1 - cd Downloads
2 - curl https://ollama.com/install.sh > install.sh
3 - chmod +x ./install.sh

then with root user:
4 - # sh install.sh

… it should install just fine… wait for it to finish.
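A quick sanity check at this point doesn’t hurt (the installer sets up a systemd service, so both of these should answer):

# confirm the binary landed on PATH
ollama --version
# confirm the service came up
systemctl status ollama.service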

then get the DeepSeek AI models that you want:
5 - # ollama run deepseek-r1:1.5b

other available models (that you might want to try; see the pull example after the list):

  • deepseek-r1:70b (~40GB, a BIG model… use it only if you have a decently powerful machine :slight_smile: )
  • deepseek-r1:8b
  • deepseek-r1:14b
  • deepseek-r1:32b
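If you just want to fetch a model without opening a chat prompt right away, the usual Ollama flow is pull first, run later (model name is just an example):

# download only
ollama pull deepseek-r1:8b
# run it later; this reuses the local copy and opens an interactive prompt
ollama run deepseek-r1:8b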

6 - download an app so that you can query the AI models in a simple user interface (Chatbox)
(there are for sure other tools available; you can also hit the Ollama API directly, as sketched below)
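For the curious: Chatbox and similar apps all talk to the same local Ollama HTTP API, so you can test a model straight from the console (documented Ollama endpoint; the model name assumes you pulled the 1.5b variant from step 5):

# one-shot generation against the local Ollama API
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'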

side note:
you might want to change the ollama service so that it does not start at boot… maybe you want to start it manually only (check systemctl status ollama.service) and change the startup mode if you want, for example:
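(standard systemd commands; run them as root or prefix with sudo)

# keep it installed, but stop it starting at boot
sudo systemctl disable ollama.service
# then start/stop it by hand only when you actually need it
sudo systemctl start ollama.service
sudo systemctl stop ollama.service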

that’s it!
enjoy!!

I can tell you that even with a decently powerful machine… querying r1:70b can sometimes heat the office :smiley: :smiley:

@C7NhtpnK
you can also check this little video:

Thank you!

(Full answer below in the post to ru1marante.)

Thank you very much!

Yes, this is clear now.

It was my fault, sorry: I had been trying LM Studio and Jan. There, downloads (or imports of already downloaded files) are done from within the GUI. I assumed the same held for Chatbox AI, which it actually does not; with Chatbox it has to be done in the underlying Ollama.
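(For anyone hitting the same confusion: the GUI just talks to the local Ollama server, so you can check what it will be able to offer with a one-liner against the documented API, default port 11434:)

# list the models the local Ollama server exposes
curl http://localhost:11434/api/tags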

Impressive! :slight_smile:

What kind of top is that in your screenshot, @ru1marante?

@C7NhtpnK
It was custom-created by me (a bash script)… the screenshot is a crop of part of it… it is used to monitor several HW indicators/sensors, and it can run in the background, without a user interface, to report indicators and alarms to another system (a database).
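The core idea is nothing fancy; a minimal sketch of the loop (not the real script; log path, threshold, and sensor node here are just made-up examples):

#!/bin/bash
# minimal HW-monitor sketch: sample a temperature, log it, flag alarms
LOG=/tmp/hw-monitor.log                       # example log target
LIMIT=85                                      # example alarm threshold, degrees C
ZONE=/sys/class/thermal/thermal_zone0/temp    # pick your own sensor node
while true; do
    temp=$(( $(cat "$ZONE") / 1000 ))         # sysfs reports millidegrees
    line="$(date -Is) cpu_temp=${temp}"
    [ "$temp" -ge "$LIMIT" ] && line="$line ALARM"
    echo "$line" >> "$LOG"
    sleep 30
done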

BUT, I like to use glances or bpytop (though bpytop’s interface is not really great with 2 CPUs / 72 cores… it gets messy)

Glances is very nice, in my opinion.
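If anyone wants to try it: glances is easy to get on Leap, either from the repos (if it’s packaged for your release) or from PyPI:

# from the distro repos, if available
sudo zypper install glances
# or via pip
pip install --user glances
# then just run it in a terminal
glances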