André Cipriani Bandarra d845103679 Fix Display impl for OllamaError: use write! instead of writeln!

The Display trait should not append trailing newlines, as callers
like println! and error chaining libraries (anyhow, eyre) add their
own. Also cleaned up the display strings to use readable names
instead of raw variant names.

2026-01-30 19:29:29 +00:00

examples

More logging

2026-01-15 20:47:19 +00:00

src

Fix Display impl for OllamaError: use write! instead of writeln!

2026-01-30 19:29:29 +00:00

.gitignore

Adds tags and ps methods

2025-12-23 22:04:38 +00:00

Cargo.lock

Update request to 0.13

2026-01-30 19:16:15 +00:00

Cargo.toml

Update request to 0.13

2026-01-30 19:16:15 +00:00

README.md

Adds README.md

2026-01-30 19:13:00 +00:00

README.md

ollama-rs

An async Rust client library for the Ollama API. Provides a streaming-first interface for text generation, multi-turn chat, model management, and advanced features like structured output and tool calling.

Features

Fully async with tokio and streaming responses via futures::Stream
Text generation and multi-turn chat conversations
Structured JSON output with schema validation
Tool calling / function calling support
Model management (list, pull, inspect running models)
Builder pattern for constructing requests
Configurable generation parameters (temperature, top-k, top-p, and more)
Thinking / reasoning mode support

Installation

Add ollama-rs to your Cargo.toml:

[dependencies]
ollama-rs = { git = "https://github.com/andreban/ollama-rs.git" }
tokio = { version = "1", features = ["full"] }
futures-util = "0.3"

Prerequisites

ollama-rs needs a running Ollama server to talk to. By default, Ollama listens on http://localhost:11434.

Quick Start

Text Generation

use std::io::Write;
use futures_util::StreamExt;
use ollama_rs::{OllamaClient, types::generate::GenerateRequest};

#[tokio::main]
async fn main() {
    let client = OllamaClient::new("http://localhost:11434");
    let request = GenerateRequest::builder("llama3:8b")
        .prompt("Why is the sky blue?")
        .build();

    let mut stream = client.generate(request);
    while let Some(response) = stream.next().await {
        match response {
            Ok(token) => {
                print!("{}", token.response);
                std::io::stdout().flush().unwrap();
                if token.done {
                    break;
                }
            }
            Err(e) => {
                // Report the error and stop reading from the stream.
                eprintln!("Error: {}", e);
                break;
            }
        }
    }
}

Chat

use std::io::Write;
use futures_util::StreamExt;
use ollama_rs::{OllamaClient, types::chat::{ChatRequest, Message}};

#[tokio::main]
async fn main() {
    let client = OllamaClient::new("http://localhost:11434");
    let messages = vec![
        Message::system("You are a helpful assistant."),
        Message::user("What is the capital of France?"),
    ];
    let request = ChatRequest::builder("llama3:8b")
        .messages(messages)
        .build();

    let mut stream = client.chat(request);
    while let Some(response) = stream.next().await {
        let response = response.unwrap();
        print!("{}", response.message.content);
        std::io::stdout().flush().unwrap();
        if response.done {
            break;
        }
    }
}

Structured Output

Force the model to respond with JSON matching a specific schema:

use ollama_rs::{OllamaClient, types::generate::GenerateRequest};
use serde_json::json;

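// JSON Schema describing the shape the response must follow.
let schema = json!({
    "type": "object",
    "properties": {
        "answer": { "type": "string" },
        "confidence": { "type": "number" }
    }
});

let request = GenerateRequest::builder("llama3:8b")
    .prompt("What is 2 + 2?")
    .stream(false) // ask for one complete response instead of chunks
    .format(schema) // constrain the output to the schema above
    .build();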

Tool Calling

Define tools the model can invoke during a chat conversation:

use ollama_rs::types::chat::{ChatRequest, Function, Message, Tool, ToolType};
use serde_json::json;

let tools = vec![Tool {
    tool_type: ToolType::Function,
    function: Function {
        name: "get_weather".to_string(),
        description: "Get the current weather for a city.".to_string(),
        parameters: json!({
            "type": "object",
            "properties": {
                "city": { "type": "string", "description": "The name of the city" }
            },
            "required": ["city"]
        }),
    },
}];

let request = ChatRequest::builder("llama3:8b")
    .messages(vec![Message::user("What is the weather in Paris?")])
    .stream(false)
    .tools(tools)
    .build();

When the model decides to call a tool, the response's message.tool_calls field contains the tool name and arguments. Your code is responsible for executing the matching function; send its result back to the model in a follow-up chat request via Message::tool_response(...).
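
The "execute the function" step can be sketched as a dispatch on the tool name. This is a minimal, std-only illustration: `ToolCall`, `get_weather`, and `dispatch` are hypothetical stand-ins and not types from ollama-rs. In the library the arguments arrive as JSON, so check the crate source for the actual layout of `message.tool_calls`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one entry of `message.tool_calls`.
/// The real ollama-rs type may be shaped differently.
struct ToolCall {
    name: String,
    arguments: HashMap<String, String>,
}

/// Illustrative tool body; a real one would query a weather service.
fn get_weather(city: &str) -> String {
    format!("It is sunny in {city}.")
}

/// Routes a tool call to the matching function and returns the text that
/// would be sent back to the model as the tool response.
fn dispatch(call: &ToolCall) -> Result<String, String> {
    match call.name.as_str() {
        "get_weather" => {
            let city = call
                .arguments
                .get("city")
                .ok_or_else(|| "missing required argument: city".to_string())?;
            Ok(get_weather(city))
        }
        other => Err(format!("unknown tool: {other}")),
    }
}

fn main() {
    let call = ToolCall {
        name: "get_weather".to_string(),
        arguments: HashMap::from([("city".to_string(), "Paris".to_string())]),
    };
    match dispatch(&call) {
        Ok(result) => println!("{result}"),
        Err(e) => eprintln!("tool error: {e}"),
    }
}
```

Returning an error for unknown tool names or missing arguments, rather than panicking, lets the caller decide whether to report the problem back to the model or abort the conversation.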

API Reference

`OllamaClient`

Method	Description
`new(server_address)`	Create a new client pointing at an Ollama server
`version()`	Get the Ollama server version
`tags()`	List all available models
`ps()`	List currently running/loaded models
`generate(request)`	Generate text (streaming)
`chat(request)`	Chat conversation (streaming)
`pull(request)`	Pull/download a model (streaming)

Request Builders

GenerateRequest::builder(model) -- .prompt(), .system_prompt(), .format(), .options(), .stream(), .think(), .images(), .suffix()

ChatRequest::builder(model) -- .messages(), .tools(), .format(), .options(), .stream()

PullRequest::builder(model) -- .stream()

Generation Options

Configure sampling parameters via Options::builder():

Option	Description
`temperature(f32)`	Controls randomness (0.0 - 2.0)
`top_k(u32)`	Top-K sampling
`top_p(f32)`	Nucleus sampling threshold
`min_p(f32)`	Minimum probability filter
`seed(u64)`	Random seed for reproducibility
`num_ctx(u32)`	Context window size
`num_predict(u32)`	Maximum tokens to generate
`stop(Stop)`	Stop sequences
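
To make the three filtering options more concrete, the toy program below applies top-k, top-p, and min-p filters to a small hand-written token distribution. It is a conceptual sketch of the general sampling techniques, not Ollama's implementation; the `Dist` alias, the function names, and the example probabilities are all made up for illustration.

```rust
/// A toy distribution: (token, probability) pairs.
type Dist = Vec<(&'static str, f32)>;

/// Sorts tokens from most to least likely.
fn sorted_desc(mut dist: Dist) -> Dist {
    dist.sort_by(|a, b| b.1.total_cmp(&a.1));
    dist
}

/// Top-K: keep only the `k` most likely tokens.
fn top_k(dist: Dist, k: usize) -> Dist {
    sorted_desc(dist).into_iter().take(k).collect()
}

/// Top-P (nucleus): keep the smallest prefix of the sorted tokens whose
/// cumulative probability reaches `p`.
fn top_p(dist: Dist, p: f32) -> Dist {
    let mut cumulative = 0.0_f32;
    let mut kept = Vec::new();
    for entry in sorted_desc(dist) {
        kept.push(entry);
        cumulative += entry.1;
        if cumulative >= p {
            break;
        }
    }
    kept
}

/// Min-P: drop tokens whose probability is below `ratio` times the
/// probability of the most likely token.
fn min_p(dist: Dist, ratio: f32) -> Dist {
    let sorted = sorted_desc(dist);
    let threshold = sorted.first().map_or(0.0, |top| top.1 * ratio);
    sorted.into_iter().filter(|entry| entry.1 >= threshold).collect()
}

/// Extracts just the token names, for printing.
fn names(dist: &Dist) -> Vec<&'static str> {
    dist.iter().map(|entry| entry.0).collect()
}

fn main() {
    let dist: Dist = vec![("blue", 0.5), ("red", 0.3), ("green", 0.15), ("pink", 0.05)];
    println!("top_k(2):   {:?}", names(&top_k(dist.clone(), 2)));
    println!("top_p(0.7): {:?}", names(&top_p(dist.clone(), 0.7)));
    println!("min_p(0.2): {:?}", names(&min_p(dist.clone(), 0.2)));
}
```

A real sampler would renormalize the surviving probabilities and then draw from them, with `temperature` reshaping the distribution and `seed` fixing the random draw.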

Examples

The examples/ directory contains runnable programs:

Example	Description
`generate`	Basic text generation
`chat`	Interactive multi-turn chat
`structured_output`	JSON structured output with schema
`tool_call`	Function calling / tool use
`pull`	Download a model
`tags`	List available models
`ps`	List running models
`version`	Query server version

Run an example:

OLLAMA_SERVER=http://localhost:11434 cargo run --example chat

Configuration

Environment Variable	Description
`OLLAMA_SERVER`	Ollama server address (e.g., `http://localhost:11434`)
`RUST_LOG`	Log level filter (e.g., `ollama_rs=debug`)
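
A program that honors `OLLAMA_SERVER` might resolve the address as in the sketch below. The fallback to `http://localhost:11434` is an assumption based on Ollama's default address; the bundled examples may instead require the variable to be set. `resolve_server` and `DEFAULT_SERVER` are hypothetical names, not part of ollama-rs.

```rust
use std::env;

/// Ollama's default listen address, used here as an assumed fallback.
const DEFAULT_SERVER: &str = "http://localhost:11434";

/// Hypothetical helper: picks the configured address, falling back to the
/// default when the variable is unset or blank. Trailing slashes are
/// trimmed so the address has one consistent form.
fn resolve_server(configured: Option<String>) -> String {
    configured
        .map(|s| s.trim().trim_end_matches('/').to_string())
        .filter(|s| !s.is_empty())
        .unwrap_or_else(|| DEFAULT_SERVER.to_string())
}

fn main() {
    let server = resolve_server(env::var("OLLAMA_SERVER").ok());
    println!("Using Ollama server at {server}");
    // The resolved address would then be handed to `OllamaClient::new`.
}
```

Taking the value as an `Option<String>` instead of reading the environment inside the helper keeps the logic easy to test without mutating process-wide state.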