diff --git a/README.md b/README.md
new file mode 100644
index 0000000..d7d965f
--- /dev/null
+++ b/README.md
@@ -0,0 +1,211 @@
+# ollama-rs
+
+An async Rust client library for the [Ollama](https://ollama.com/) API. Provides a streaming-first interface for text generation, multi-turn chat, model management, and advanced features like structured output and tool calling.
+
+## Features
+
+- Fully async with [tokio](https://tokio.rs/) and streaming responses via `futures::Stream`
+- Text generation and multi-turn chat conversations
+- Structured JSON output with schema validation
+- Tool calling / function calling support
+- Model management (list, pull, inspect running models)
+- Builder pattern for constructing requests
+- Configurable generation parameters (temperature, top-k, top-p, and more)
+- Thinking / reasoning mode support
+
+## Installation
+
+Add `ollama-rs` to your `Cargo.toml`:
+
+```toml
+[dependencies]
+ollama-rs = { git = "https://github.com/andreban/ollama-rs.git" }
+tokio = { version = "1", features = ["full"] }
+futures-util = "0.3"
+```
+
+## Prerequisites
+
+A running [Ollama](https://ollama.com/) server. By default, Ollama listens on `http://localhost:11434`.
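Before running the examples, it can help to confirm that something is listening on the default address. The sketch below is not part of `ollama-rs`; it uses only the Rust standard library, and the helper names `host_and_port` and `server_is_reachable` are hypothetical. It checks only that a TCP connection can be opened, not that the listener is really an Ollama server.

```rust
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Strip an optional `http://` or `https://` scheme and any path, leaving `host:port`.
/// Hypothetical helper for illustration; not part of ollama-rs.
fn host_and_port(server: &str) -> &str {
    let without_scheme = server
        .strip_prefix("http://")
        .or_else(|| server.strip_prefix("https://"))
        .unwrap_or(server);
    // Drop anything after the first '/', such as a trailing slash or a path.
    without_scheme.split('/').next().unwrap_or(without_scheme)
}

/// Return true if a TCP connection to the server can be opened within `timeout`.
/// An address without a port cannot be resolved here and is reported as unreachable.
fn server_is_reachable(server: &str, timeout: Duration) -> bool {
    let Ok(addrs) = host_and_port(server).to_socket_addrs() else {
        return false;
    };
    addrs
        .into_iter()
        .any(|addr| TcpStream::connect_timeout(&addr, timeout).is_ok())
}

fn main() {
    let server = "http://localhost:11434";
    if server_is_reachable(server, Duration::from_secs(2)) {
        println!("Something is listening at {server}");
    } else {
        eprintln!("Nothing reachable at {server}; is `ollama serve` running?");
    }
}
```

A plain TCP probe keeps the check free of extra dependencies; calling the client's `version()` method would be a stronger check, since it confirms the peer speaks the Ollama API.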
+
+## Quick Start
+
+### Text Generation
+
+```rust
+use std::io::Write;
+use futures_util::StreamExt;
+use ollama_rs::{OllamaClient, types::generate::GenerateRequest};
+
+#[tokio::main]
+async fn main() {
+    let client = OllamaClient::new("http://localhost:11434");
+    let request = GenerateRequest::builder("llama3:8b")
+        .prompt("Why is the sky blue?")
+        .build();
+
+    let mut stream = client.generate(request);
+    while let Some(response) = stream.next().await {
+        match response {
+            Ok(token) => {
+                print!("{}", token.response);
+                std::io::stdout().flush().unwrap();
+                if token.done {
+                    break;
+                }
+            }
+            Err(e) => eprintln!("Error: {}", e),
+        }
+    }
+}
+```
+
+### Chat
+
+```rust
+use std::io::Write;
+use futures_util::StreamExt;
+use ollama_rs::{OllamaClient, types::chat::{ChatRequest, Message}};
+
+#[tokio::main]
+async fn main() {
+    let client = OllamaClient::new("http://localhost:11434");
+    let messages = vec![
+        Message::system("You are a helpful assistant."),
+        Message::user("What is the capital of France?"),
+    ];
+    let request = ChatRequest::builder("llama3:8b")
+        .messages(messages)
+        .build();
+
+    let mut stream = client.chat(request);
+    while let Some(response) = stream.next().await {
+        let response = response.unwrap();
+        print!("{}", response.message.content);
+        std::io::stdout().flush().unwrap();
+        if response.done {
+            break;
+        }
+    }
+}
+```
+
+### Structured Output
+
+Force the model to respond with JSON matching a specific schema:
+
+```rust
+use ollama_rs::{OllamaClient, types::generate::GenerateRequest};
+use serde_json::json;
+
+let schema = json!({
+    "type": "object",
+    "properties": {
+        "answer": { "type": "string" },
+        "confidence": { "type": "number" }
+    }
+});
+
+let request = GenerateRequest::builder("llama3:8b")
+    .prompt("What is 2 + 2?")
+    .stream(false)
+    .format(schema)
+    .build();
+```
+
+### Tool Calling
+
+Define tools the model can invoke during a chat conversation:
+
+```rust
+use ollama_rs::types::chat::{ChatRequest, Function, Message, Tool, ToolType};
+use serde_json::json;
+
+let tools = vec![Tool {
+    tool_type: ToolType::Function,
+    function: Function {
+        name: "get_weather".to_string(),
+        description: "Get the current weather for a city.".to_string(),
+        parameters: json!({
+            "type": "object",
+            "properties": {
+                "city": { "type": "string", "description": "The name of the city" }
+            },
+            "required": ["city"]
+        }),
+    },
+}];
+
+let request = ChatRequest::builder("llama3:8b")
+    .messages(vec![Message::user("What is the weather in Paris?")])
+    .stream(false)
+    .tools(tools)
+    .build();
+```
+
+When the model decides to call a tool, the response `message.tool_calls` field will contain the tool name and arguments. You can then execute the function and send the result back via `Message::tool_response(...)`.
+
+## API Reference
+
+### `OllamaClient`
+
+| Method | Description |
+|--------|-------------|
+| `new(server_address)` | Create a new client pointing at an Ollama server |
+| `version()` | Get the Ollama server version |
+| `tags()` | List all available models |
+| `ps()` | List currently running/loaded models |
+| `generate(request)` | Generate text (streaming) |
+| `chat(request)` | Chat conversation (streaming) |
+| `pull(request)` | Pull/download a model (streaming) |
+
+### Request Builders
+
+**`GenerateRequest::builder(model)`** -- `.prompt()`, `.system_prompt()`, `.format()`, `.options()`, `.stream()`, `.think()`, `.images()`, `.suffix()`
+
+**`ChatRequest::builder(model)`** -- `.messages()`, `.tools()`, `.format()`, `.options()`, `.stream()`
+
+**`PullRequest::builder(model)`** -- `.stream()`
+
+### Generation Options
+
+Configure sampling parameters via `Options::builder()`:
+
+| Option | Description |
+|--------|-------------|
+| `temperature(f32)` | Controls randomness (0.0 - 2.0) |
+| `top_k(u32)` | Top-K sampling |
+| `top_p(f32)` | Nucleus sampling threshold |
+| `min_p(f32)` | Minimum probability filter |
+| `seed(u64)` | Random seed for reproducibility |
+| `num_ctx(u32)` | Context window size |
+| `num_predict(u32)` | Maximum tokens to generate |
+| `stop(Stop)` | Stop sequences |
+
+## Examples
+
+The `examples/` directory contains runnable programs:
+
+| Example | Description |
+|---------|-------------|
+| `generate` | Basic text generation |
+| `chat` | Interactive multi-turn chat |
+| `structured_output` | JSON structured output with schema |
+| `tool_call` | Function calling / tool use |
+| `pull` | Download a model |
+| `tags` | List available models |
+| `ps` | List running models |
+| `version` | Query server version |
+
+Run an example:
+
+```sh
+OLLAMA_SERVER=http://localhost:11434 cargo run --example chat
+```
+
+## Configuration
+
+| Environment Variable | Description |
+|----------------------|-------------|
+| `OLLAMA_SERVER` | Ollama server address (e.g., `http://localhost:11434`) |
+| `RUST_LOG` | Log level filter (e.g., `ollama_rs=debug`) |
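A program that follows the same `OLLAMA_SERVER` convention might resolve the server address as sketched below. This is an assumption about usage, not code taken from the crate: the fallback to `http://localhost:11434` and the helper name `resolve_server` are hypothetical, and the bundled examples may handle a missing variable differently.

```rust
use std::env;

/// Default address Ollama listens on, per the Prerequisites section.
const DEFAULT_SERVER: &str = "http://localhost:11434";

/// Pick the server address from an optional override.
/// Missing, empty, or whitespace-only values fall back to the default.
/// Hypothetical helper for illustration; not part of ollama-rs.
fn resolve_server(value: Option<String>) -> String {
    match value {
        Some(v) if !v.trim().is_empty() => v.trim().to_string(),
        _ => DEFAULT_SERVER.to_string(),
    }
}

fn main() {
    // `env::var` returns Err when the variable is unset or not valid Unicode;
    // `.ok()` turns both cases into `None`.
    let server = resolve_server(env::var("OLLAMA_SERVER").ok());
    println!("Using Ollama server at {server}");
    // The resolved address would then be handed to `OllamaClient::new`.
}
```

Taking an `Option<String>` instead of reading the environment inside the helper keeps the fallback logic testable without mutating process-wide state.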