How to run Ollama on specific GPU(s)

mlapi · Aug 2, 2024

In this tutorial we will see how to make Ollama use a specific GPU, or a chosen set of GPUs.


This tutorial is for Linux machines only.

This is very simple: all we need to do is set CUDA_VISIBLE_DEVICES to the index (or indices) of the GPU(s) we want Ollama to use.
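
If you are not sure which index belongs to which card, nvidia-smi can list the GPUs the driver sees; the number at the start of each line is the value that goes into CUDA_VISIBLE_DEVICES. The card names below are only placeholder output, your machine will print its own:

nvidia-smi -L
# GPU 0: NVIDIA GeForce RTX 3090 (UUID: GPU-...)
# GPU 1: NVIDIA GeForce RTX 4090 (UUID: GPU-...)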

Head over to /etc/systemd/system

cd /etc/systemd/system
cat ollama.service

Printing the file shows something like this:

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH="

[Install]
WantedBy=default.target

To pin the GPUs, we add an Environment line to the [Service] section, just before the ExecStart line that serves Ollama. The file above will then look like this.

For single GPU

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
Environment="CUDA_VISIBLE_DEVICES=0"
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH="

[Install]
WantedBy=default.target
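
Note that edits made directly to /etc/systemd/system/ollama.service can be overwritten when Ollama is reinstalled or upgraded. As an alternative sketch, systemd's standard drop-in mechanism keeps the same setting in a separate override file:

sudo systemctl edit ollama.service
# in the editor that opens, add:
[Service]
Environment="CUDA_VISIBLE_DEVICES=0"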

For multiple GPUs

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
Environment="CUDA_VISIBLE_DEVICES=0,1,2"
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH="

[Install]
WantedBy=default.target
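
CUDA_VISIBLE_DEVICES also accepts GPU UUIDs instead of indices, which can be safer because indices may shift after hardware or driver changes. The UUID below is a made-up placeholder; use the ones printed by nvidia-smi -L:

Environment="CUDA_VISIBLE_DEVICES=GPU-11111111-2222-3333-4444-555555555555"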

This is it. Now reload systemd and restart the service:

sudo systemctl daemon-reload
sudo systemctl restart ollama.service
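
To verify that the setting reached the service, ask systemd for the unit's environment, then watch nvidia-smi while a model is running. The llama3 model name is just an example; any model you have pulled locally will do:

systemctl show ollama.service --property=Environment
# the output should include CUDA_VISIBLE_DEVICES=0

ollama run llama3 "hello"   # in one terminal
nvidia-smi                  # in another: only the selected GPU(s) should show Ollama activity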
