How to run Ollama on specific GPU(s)
1 min read · Aug 2, 2024
In this tutorial we will see how to run Ollama on a specific GPU, or on multiple GPUs. This tutorial is for Linux machines only.
The fix is simple: set the CUDA_VISIBLE_DEVICES environment variable in the Ollama systemd service, so Ollama only sees the GPU(s) you choose.
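Before picking a value, you need to know the GPU indices on your machine. On a real host you would run `nvidia-smi -L` and read the indices off its output; the sketch below hard-codes a made-up sample of that output (the GPU names and UUIDs are placeholders, not real devices) so it is self-contained, and shows how the indices map to a `CUDA_VISIBLE_DEVICES` value.

```shell
# Sample `nvidia-smi -L` output, hard-coded so the snippet is
# self-contained; on a real machine run `nvidia-smi -L` instead.
sample_output='GPU 0: NVIDIA A100 (UUID: GPU-aaaa)
GPU 1: NVIDIA A100 (UUID: GPU-bbbb)'

# Extract just the numeric indices -- these are the values that
# CUDA_VISIBLE_DEVICES expects, comma-separated.
indices=$(printf '%s\n' "$sample_output" \
  | sed -n 's/^GPU \([0-9]*\):.*/\1/p' \
  | paste -sd, -)
echo "$indices"   # → 0,1
```

With two GPUs present you could then use `CUDA_VISIBLE_DEVICES=0`, `CUDA_VISIBLE_DEVICES=1`, or `CUDA_VISIBLE_DEVICES=0,1`.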
Head over to /etc/systemd/system and print the Ollama service file:
cd /etc/systemd/system
cat ollama.service
The output will look something like this:
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH="
[Install]
WantedBy=default.target
In the [Service] section, just before the ExecStart line that starts the Ollama server, add an Environment line selecting the GPU device(s). The file above will then look like this:
For a single GPU
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
Environment="CUDA_VISIBLE_DEVICES=0"
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH="
[Install]
WantedBy=default.target
For multiple GPUs
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
Environment="CUDA_VISIBLE_DEVICES=0,1,2"
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH="
[Install]
WantedBy=default.target
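Rather than editing the installed unit file directly (which a future Ollama upgrade may overwrite), systemd also supports drop-in overrides. Running `sudo systemctl edit ollama.service` opens an override file where you only add the setting you want to change; the fragment below is a sketch of that override for the multi-GPU case.

```ini
# Created via: sudo systemctl edit ollama.service
# (saved as /etc/systemd/system/ollama.service.d/override.conf)
[Service]
Environment="CUDA_VISIBLE_DEVICES=0,1,2"
```

systemd merges this fragment with the original unit, so the rest of the service definition stays untouched.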
That's it. Now reload systemd and restart the Ollama service:
sudo systemctl daemon-reload
sudo systemctl restart ollama.service
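After restarting, you can confirm the variable was picked up with `systemctl show ollama --property=Environment`. The sketch below hard-codes a sample of that command's output (the PATH value is a placeholder) so it is self-contained, and checks that CUDA_VISIBLE_DEVICES is present in the service environment.

```shell
# On a real machine you would run:
#   systemctl show ollama --property=Environment
# Sample output is hard-coded here so the snippet is self-contained.
show_output='Environment=CUDA_VISIBLE_DEVICES=0,1,2 PATH=/usr/local/bin'

# Check that CUDA_VISIBLE_DEVICES appears in the service environment.
case "$show_output" in
  *CUDA_VISIBLE_DEVICES=*) echo "GPU pinning active" ;;
  *) echo "CUDA_VISIBLE_DEVICES not set" ;;
esac
```

If the variable is missing, re-check the [Service] section of the unit file and run `sudo systemctl daemon-reload` again before restarting.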