Ollama - Latitude Docs

This integration is only available in the Python SDK.

Overview

This guide shows you how to integrate Latitude Telemetry into an existing application that uses the official Ollama SDK. After completing these steps:

Every Ollama call (e.g. chat, generate) can be captured as a log in Latitude.
Logs are grouped under a prompt, identified by a path, inside a Latitude project.
You can inspect inputs/outputs, measure latency, and debug Ollama-powered features from the Latitude dashboard.

You’ll keep calling Ollama exactly as you do today — Telemetry simply observes and enriches those calls.

Requirements

Before you start, make sure you have:

A Latitude account and API key
A Latitude project ID
A Python-based project that uses the Ollama SDK
A running Ollama instance (local or remote)

That’s it — prompts do not need to be created ahead of time.
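A small preflight check can surface missing configuration before the first Ollama call is made. The sketch below assumes the API key lives in LATITUDE_API_KEY, as in the examples in this guide, and that the project ID is stored in LATITUDE_PROJECT_ID. That second variable name and the missing_settings helper are illustrative choices for this sketch, not part of the Latitude SDK.

```python
import os

# LATITUDE_API_KEY is the variable used in this guide's examples;
# LATITUDE_PROJECT_ID is a hypothetical name chosen for this sketch.
REQUIRED_SETTINGS = ("LATITUDE_API_KEY", "LATITUDE_PROJECT_ID")


def missing_settings(env=None):
    """Return the names of required settings that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

Running this at startup fails fast with a readable message instead of a KeyError deep inside the application.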

Steps

Install requirements

Add the Latitude Telemetry package to your project:

pip install latitude-telemetry

Wrap your Ollama-powered feature

Initialize Latitude Telemetry and wrap the code that calls Ollama using telemetry.capture. You can use the capture method as a decorator (recommended) or as a context manager:

Using decorator (recommended)

import os
import ollama
from latitude_telemetry import Telemetry, Instrumentors, TelemetryOptions

telemetry = Telemetry(
    os.environ["LATITUDE_API_KEY"],
    TelemetryOptions(instrumentors=[Instrumentors.Ollama]),
)

@telemetry.capture(
    project_id=123,  # The ID of your project in Latitude
    path="generate-support-reply",  # Add a path to identify this prompt in Latitude
)
def generate_support_reply(input: str) -> str:
    response = ollama.chat(
        model="llama3.2",
        messages=[{"role": "user", "content": input}],
    )
    return response["message"]["content"]

Using context manager

import os
import ollama
from latitude_telemetry import Telemetry, Instrumentors, TelemetryOptions

telemetry = Telemetry(
    os.environ["LATITUDE_API_KEY"],
    TelemetryOptions(instrumentors=[Instrumentors.Ollama]),
)

def generate_support_reply(input: str) -> str:
    with telemetry.capture(
        project_id=123,  # The ID of your project in Latitude
        path="generate-support-reply",  # Add a path to identify this prompt in Latitude
    ):
        response = ollama.chat(
            model="llama3.2",
            messages=[{"role": "user", "content": input}],
        )
        return response["message"]["content"]

The path:

Identifies the prompt in Latitude
Can be new or existing
Should not contain spaces or special characters (use letters, numbers, - _ / .)
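If paths are built dynamically, it can help to check them against the character guidance above before passing them to telemetry.capture. The helper below is a minimal sketch: is_valid_prompt_path is a hypothetical name, and it only encodes the characters listed here (letters, numbers, - _ / .). Latitude may apply additional rules that this check does not cover.

```python
import re

# Allowed characters per the guidance above: letters, numbers, - _ / .
_PATH_PATTERN = re.compile(r"[A-Za-z0-9_./-]+")


def is_valid_prompt_path(path: str) -> bool:
    """Return True if the path uses only the recommended characters."""
    return _PATH_PATTERN.fullmatch(path) is not None
```

For example, is_valid_prompt_path("generate-support-reply") is True, while a path containing a space, such as "generate support reply", is rejected.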

Streaming responses

When using streaming (stream=True), use a generator function with the decorator. The SDK keeps the span open until all chunks are yielded:

@telemetry.capture(project_id=123, path="generate-support-reply")
def stream_support_reply(input: str):
    stream = ollama.chat(
        model="llama3.2",
        messages=[{"role": "user", "content": input}],
        stream=True,
    )
    for chunk in stream:
        if chunk["message"]["content"]:
            yield chunk["message"]["content"]
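To make the "span stays open until all chunks are yielded" behavior concrete, the following toy decorator shows one way a wrapper can treat generator functions differently from regular functions. It is a stand-in written for illustration only: toy_capture, fake_stream, and the events list are hypothetical names, and this is not the Latitude SDK's implementation.

```python
import functools
import inspect

events = []  # records the order of operations for illustration


def toy_capture(path):
    """Hypothetical stand-in for a capture decorator; not the Latitude SDK."""

    def decorator(fn):
        if inspect.isgeneratorfunction(fn):
            @functools.wraps(fn)
            def generator_wrapper(*args, **kwargs):
                events.append(f"open {path}")
                try:
                    # Re-yield each chunk; the "span" stays open meanwhile.
                    yield from fn(*args, **kwargs)
                finally:
                    # Runs once the generator is exhausted, closed, or fails.
                    events.append(f"close {path}")

            return generator_wrapper

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            events.append(f"open {path}")
            try:
                return fn(*args, **kwargs)
            finally:
                events.append(f"close {path}")

        return wrapper

    return decorator


@toy_capture("generate-support-reply")
def fake_stream(chunks):
    # Stands in for iterating over a streamed Ollama response.
    for chunk in chunks:
        events.append(f"chunk {chunk}")
        yield chunk


reply = "".join(fake_stream(["Hel", "lo"]))
```

After this runs, events records the "open" entry first, then one entry per chunk, and the "close" entry only after the last chunk has been consumed. The practical consequence is that a streamed response is fully captured only if the caller iterates the generator to the end.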

Seeing your logs in Latitude

Once your feature is wrapped, logs will appear automatically.

Open the prompt in your Latitude dashboard (identified by path)
Go to the Traces section
Each execution will show:
- Input and output messages
- Model and token usage
- Latency and errors
- One trace per feature invocation

Each Ollama call appears as a child span under the captured prompt execution, giving you a full, end-to-end view of what happened.

That’s it

No changes to your Ollama calls, no special return values, and no extra plumbing — just wrap the feature you want to observe.
