Audrey M. Roy Greenfeld

Quietly building the future.

Installing Ollama on Vast.ai

2024-07-08

If my notes are helpful to you, sign up for Vast.ai using my referral link.

Add your SSH key to https://cloud.vast.ai/account/
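
If you don't have an SSH key yet, you can generate one and print the public half to paste into that page (the key type and comment here are just my suggestions):

ssh-keygen -t ed25519 -C "vastai"
cat ~/.ssh/id_ed25519.pub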

Choose ollama/ollama:0.1.38 as the Docker image. (This succeeded for me, whereas ollama/ollama:latest failed.)

Create a new instance. I picked a $0.111/hr RTX 2080 Ti with 11 GB of VRAM. A faster GPU with more VRAM would probably help, but I wanted to keep costs low to start.

That one failed when I ran ollama run mixtral, so I tried a $0.144/hr RTX A4000 with 16 GB of VRAM.

Once it started up, I clicked the Connect button, copied the SSH command, and pasted it into my terminal.

ssh -p <port> root@<ip> -L 8080:localhost:8080
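
Once connected, nvidia-smi is a quick sanity check that the GPU and VRAM match what I rented (it's normally available on Vast.ai GPU instances):

nvidia-smi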

I couldn't get ollama to run with systemd, so I ran it in the background with:

(curl -fsSL https://ollama.com/install.sh | sh && ollama serve > ollama.log 2>&1) &
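
To confirm the server actually came up, tail the log or hit ollama's default local port, 11434 (the tags endpoint just lists downloaded models):

tail -n 20 ollama.log
curl http://localhost:11434/api/tags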

Then I was able to run ollama from the terminal. Rather than trying to run mixtral, I ran a tiny model, orca-mini, which worked fine:

ollama run orca-mini

Then I was able to chat with the model in the terminal!
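
For a quick one-off test, ollama run also accepts a prompt as an argument (the prompt below is just an example):

ollama run orca-mini "Why is the sky blue?"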

I tried to run it via the API as well, but that didn't work:

curl http://<ip>:11434/api/chat -d '{
  "model": "orca-mini",
  "messages": [
    { "role": "user", "content": "hey there, how are you doing?" }
  ]
}'
curl: (7) Failed to connect to <ip> port 11434 after 88 ms: Couldn't connect to server
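
That's expected if the host doesn't expose port 11434 externally, and ollama also binds to localhost by default. A workaround I haven't tried yet on Vast.ai would be to forward ollama's port over the same SSH connection and point curl at localhost instead:

ssh -p <port> root@<ip> -L 11434:localhost:11434
curl http://localhost:11434/api/chat -d '{
  "model": "orca-mini",
  "messages": [
    { "role": "user", "content": "hey there, how are you doing?" }
  ]
}'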

Next up, I'll repeat roughly these steps on another host with better support for external IPs, so I can try the API again.