Data Engineering at the University of Florida
This guide walks you through moving a NavigatorAI workflow (the one exercised by cegme/navigator-cli) onto a local Ollama server running on a HiPerGator compute node.
The end state is an OpenAI-compatible endpoint at http://localhost:11434/v1 that your existing navigator-cli and MCP code can hit with one URL change.
NavigatorAI is the right default for assignments because it handles hosting, routing, and auth. Local Ollama becomes attractive once you need more control over the models and hardware than the hosted endpoint offers.
Ollama speaks the same chat/completions schema as NavigatorAI, so every --system, --model, and --mcp-server flag in navigator-cli keeps working.
Only the base URL and API key change.
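As a preview, assuming the environment-variable patch described later in this guide, the entire swap is two exports:

# Hosted NavigatorAI (the default)
export NAVIGATOR_BASE_URL=https://api.ai.it.ufl.edu/v1
export NAVIGATOR_API_KEY=<your NavigatorAI key>

# Local Ollama through the SSH tunnel set up later in this guide
export NAVIGATOR_BASE_URL=http://localhost:11434/v1
export NAVIGATOR_API_KEY=ollama   # any non-empty string; Ollama ignores it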
You will need:
- Access to the cis6930 account and QoS on HiPerGator.
- SSH access to hpg.rc.ufl.edu from your laptop.
- A working cegme/navigator-cli and familiarity with its options (see the NavigatorAI Setup guide).

Home directories on HiPerGator are capped near 40 GB and throttled for I/O.
Ollama models are large: Llama 3.1 8B is about 4.7 GB, Qwen2.5 14B is about 8 GB, and Llama 3.1 70B is about 40 GB on disk.
Store both the binary and the model cache under the class blue allocation at /blue/cis6930, which has the quota and the throughput for this.
# Run once from a login node
mkdir -p /blue/cis6930/$USER/ollama/bin
mkdir -p /blue/cis6930/$USER/ollama/models
Add the following to ~/.bashrc on HiPerGator so every shell picks up the right paths:
export OLLAMA_HOME=/blue/cis6930/$USER/ollama
export OLLAMA_MODELS=$OLLAMA_HOME/models
export PATH=$OLLAMA_HOME/bin:$PATH
Reload the shell with source ~/.bashrc and confirm echo $OLLAMA_MODELS points into /blue/cis6930.
If this variable is unset when Ollama runs, it will drop models into ~/.ollama and blow past your home quota.
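A small guard in the same shell catches the unset case before it costs you quota. This is only a sanity check, not part of the official setup:

# Warn if the model cache would fall back to ~/.ollama
if [[ "${OLLAMA_MODELS:-}" != /blue/cis6930/* ]]; then
  echo "OLLAMA_MODELS is not pointing at /blue/cis6930 -- fix ~/.bashrc first" >&2
else
  echo "Model cache: $OLLAMA_MODELS"
fi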
The official curl | sh installer writes into /usr/local, which regular HiPerGator users cannot touch.
Grab the static Linux tarball directly instead:
cd $OLLAMA_HOME/bin
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama.tgz
tar -xzf ollama.tgz
rm ollama.tgz
./ollama --version
The extracted binary lands in $OLLAMA_HOME/bin, which the earlier export already placed on your PATH.
Run which ollama from a fresh shell to confirm the login shell resolves it from /blue/cis6930.
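Given the layout above, the expected resolution looks like this (your GatorLink replaces <gatorlink>):

which ollama
# /blue/cis6930/<gatorlink>/ollama/bin/ollama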
Ollama listens on TCP port 11434. Running it on a login node is against HiPerGator policy, so grab an interactive GPU session first. This example asks for one A100 for two hours, which is plenty for 7B-14B models:
srun --account=cis6930 \
--qos=cis6930 \
--partition=gpu \
--gres=gpu:a100:1 \
--ntasks=1 \
--cpus-per-task=4 \
--mem=32gb \
--time=02:00:00 \
--pty bash -i
Once the prompt lands on a compute node, capture the hostname. You will need it to build the SSH tunnel:
hostname # e.g. c0907a-s23.ufhpc
echo $HOSTNAME # same value
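While you are on the node, it is worth confirming the GPU allocation before starting the server; assuming the usual NVIDIA tooling on GPU nodes, a quick check is:

nvidia-smi --query-gpu=name,memory.total --format=csv
# Expect a single A100 line; an error here means the session has no GPU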
Start the server bound to all interfaces on the node so that the login node and your laptop can reach it:
OLLAMA_HOST=0.0.0.0:11434 ollama serve > ~/ollama.log 2>&1 &
The & backgrounds the server and redirects logs to ~/ollama.log.
Leave the shell alive as long as you want the server up.
When you are done, stop it with kill %1 or by exiting the srun session.
The default bind is 127.0.0.1:11434, which blocks the SSH tunnel. Always set OLLAMA_HOST=0.0.0.0:11434 on the compute node.
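Before building the tunnel, confirm from the compute node itself that the server started and is answering; a rough check might be:

# Watch the startup log for bind errors
tail -n 20 ~/ollama.log

# The API should answer locally on the node
curl -s http://localhost:11434/api/version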
From your laptop, open an SSH tunnel that forwards localhost:11434 to the compute node you just captured:
ssh -N -L 11434:c0907a-s23.ufhpc:11434 <gatorlink>@hpg.rc.ufl.edu
Leave that terminal running.
Any request to http://localhost:11434 on your laptop now hits the Ollama server on the HiPerGator GPU node.
Verify the connection with:
curl http://localhost:11434/api/tags
An empty JSON list ({"models":[]}) means the tunnel is good and no models are pulled yet.
Run ollama pull from inside the srun session so the download lands on the compute node and into $OLLAMA_MODELS:
ollama pull llama3.1:8b
ollama pull qwen2.5:7b
ollama pull mistral-nemo:12b
Confirm the cache is writing to the blue allocation and not your home directory:
ollama list
du -sh $OLLAMA_MODELS
realpath $OLLAMA_MODELS # should start with /blue/cis6930
If du reports anything under ~/.ollama, stop the server, fix OLLAMA_MODELS, and rerun the pulls.
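If the cache did land under ~/.ollama, one recovery sketch, assuming the server is the background job %1 started earlier in this shell, is:

# Stop the server, repoint the cache, and clear the misplaced copy
kill %1
export OLLAMA_MODELS=/blue/cis6930/$USER/ollama/models
rm -rf ~/.ollama

# Restart so the new OLLAMA_MODELS takes effect, then pull again
OLLAMA_HOST=0.0.0.0:11434 ollama serve > ~/ollama.log 2>&1 &
ollama pull llama3.1:8b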
MCP tool calling through navigator-cli requires the model to emit OpenAI-style tool_calls.
On the Ollama side, this shows up as the tools capability.
Pulling a model that lacks this capability will run fine for text but will silently ignore --mcp-server, leaving your MCP tools unused.
The table below lists models that currently expose tools and fit on a single A100:
| Model tag | Disk | Context | Good for |
|---|---|---|---|
| llama3.1:8b | ~4.7 GB | 128k | General default, solid tool use |
| llama3.1:70b | ~40 GB | 128k | Large reasoning, needs 80 GB GPU |
| qwen2.5:7b | ~4.4 GB | 128k | Strong tool use, multilingual |
| qwen2.5:14b | ~8.2 GB | 128k | Better reasoning, one A100 |
| mistral-nemo:12b | ~7.1 GB | 128k | Long documents, tool use |
| command-r:35b | ~20 GB | 128k | Long context, RAG-friendly |
Models that do not support tools and will not drive an MCP loop: gemma:2b, gemma2:2b, phi3:mini, llama2, codellama, deepseek-coder:6.7b.
You can still chat with them through navigator-cli, just skip --mcp-server.
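For example, once navigator-cli is pointed at Ollama as described in the next section, and assuming gemma2:2b has been pulled, a plain chat still works; just leave off the MCP flag:

# Non-tool model: plain chat is fine, --mcp-server would be ignored
uv run python -m navigator_cli --model gemma2:2b "Explain a star schema in one paragraph."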
To check whether a model exposes the tools capability, ask Ollama directly through the CLI:
ollama show llama3.1:8b | grep -i capabilities
A tool-capable model prints something like capabilities: completion, tools.
The same information is available over HTTP, which is useful from a script on your laptop:
curl -s http://localhost:11434/api/show \
-d '{"name":"llama3.1:8b"}' | jq '.capabilities'
If tools is missing from the list, the model will not round-trip MCP calls.
Switch to one of the tool-capable tags above before wiring it into navigator-cli.
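If you would rather see the capability exercised end to end, a hedged smoke test is to send a request with a tools array to the OpenAI-compatible endpoint and check that the reply carries tool_calls. The get_weather function below is a made-up example, not part of navigator-cli:

curl -s http://localhost:11434/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "llama3.1:8b",
    "messages": [{"role": "user", "content": "What is the weather in Gainesville?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }' | jq '.choices[0].message.tool_calls'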
cegme/navigator-cli currently hardcodes the NavigatorAI base URL near the top of navigator_cli.py:
NAVIGATOR_BASE_URL = "https://api.ai.it.ufl.edu/v1"
You have two ways to redirect it at your local Ollama, depending on whether you want to change the CLI itself.
Make the base URL come from an environment variable so the change is reversible:
import os  # make sure this import exists near the top of navigator_cli.py

NAVIGATOR_BASE_URL = os.environ.get(
    "NAVIGATOR_BASE_URL",
    "https://api.ai.it.ufl.edu/v1",
)
Then drive it from the shell where you want Ollama instead of Navigator:
export NAVIGATOR_BASE_URL=http://localhost:11434/v1
export NAVIGATOR_API_KEY=ollama # any non-empty string; Ollama ignores it
uv run python -m navigator_cli --model llama3.1:8b "Summarize RAG in two sentences."
# MCP still works, because both endpoints speak the same chat.completions schema
uv run python -m navigator_cli \
--model qwen2.5:7b \
--mcp-server mcp_servers/csv_tools.py \
"What is the average score in mcp_servers/sample_data.csv?"
This is the preferred option for the class.
Opening a PR against cegme/navigator-cli with exactly this change helps future students too.
If you do not want to fork navigator-cli, skip the CLI and use the OpenAI SDK the same way the navigatorai-setup guide shows:
from openai import OpenAI
client = OpenAI(
api_key="ollama",
base_url="http://localhost:11434/v1",
)
resp = client.chat.completions.create(
model="llama3.1:8b",
messages=[{"role": "user", "content": "Name three steps in an ETL pipeline."}],
)
print(resp.choices[0].message.content)
Your MCP tool-call loop from navigator-cli/mcp_client.py will run unchanged against this client.
Run these from your laptop with the SSH tunnel open:
# 1. The tunnel sees the server
curl -s http://localhost:11434/api/tags | jq '.models[].name'
# 2. A plain completion works
curl -s http://localhost:11434/v1/chat/completions \
-H 'Content-Type: application/json' \
-d '{
"model": "llama3.1:8b",
"messages": [{"role":"user","content":"Reply with the single word OK."}]
}' | jq -r '.choices[0].message.content'
# 3. navigator-cli reaches it
NAVIGATOR_BASE_URL=http://localhost:11434/v1 \
NAVIGATOR_API_KEY=ollama \
uv run python -m navigator_cli --model llama3.1:8b "Reply OK."
All three steps succeeding means the swap is clean and MCP-ready.
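If you prefer running the three checks as one script, a rough wrapper, assuming the tunnel is open and jq is installed on your laptop, could look like this:

#!/usr/bin/env bash
# Fail fast on the first broken step
set -euo pipefail

echo "1. Tunnel and server:"
curl -sf http://localhost:11434/api/tags | jq -r '.models[].name'

echo "2. Plain completion:"
curl -sf http://localhost:11434/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"llama3.1:8b","messages":[{"role":"user","content":"Reply with the single word OK."}]}' \
  | jq -r '.choices[0].message.content'

echo "3. navigator-cli:"
NAVIGATOR_BASE_URL=http://localhost:11434/v1 \
NAVIGATOR_API_KEY=ollama \
uv run python -m navigator_cli --model llama3.1:8b "Reply OK."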
Troubleshooting

- ollama: command not found inside srun. The compute node did not re-read ~/.bashrc. Run source ~/.bashrc or invoke $OLLAMA_HOME/bin/ollama by its full path.
- connect: connection refused from the tunnel. Ollama bound to 127.0.0.1. Restart it with OLLAMA_HOST=0.0.0.0:11434 ollama serve.
- Disk quota exceeded while pulling a model. OLLAMA_MODELS is not set in the current shell, so Ollama wrote into ~/.ollama. Export the variable, delete ~/.ollama, and pull again.
- The model does not fit on the GPU. Pull a smaller tag (llama3.1:8b instead of :70b) or a quantized tag like llama3.1:8b-instruct-q4_K_M.
- MCP tools are silently ignored. Check ollama show <model> | grep tools. If tools is absent, switch to a tool-capable tag from the table above.
- The srun session expired and the node changed. Request a new GPU session, restart the server with OLLAMA_HOST=0.0.0.0:11434 ollama serve &, and reopen the tunnel with the new hostname.

See also

- cegme/navigator-cli reference CLI and MCP client
- HiPerGator documentation on srun and partition options