Implement the Ollama POST /api/embed endpoint for generating vector
embeddings from text input.

- Add EmbedInput, EmbedRequest, EmbedResponse types in src/types/embed.rs
- Add OllamaClient::embed() async method in src/lib.rs
- Register embed module in src/types/mod.rs
- Add usage example in examples/embed.rs
- Update README with embed endpoint documentation
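
As a rough illustration of the request shape described above: the /api/embed endpoint takes a model name and an input that is either one string or a list of strings. The sketch below is dependency-free and hand-rolls the JSON. The real types in src/types/embed.rs presumably use serde instead, and the variant names, the json_string helper, and the embed_request_body function are assumptions made for this example, not the crate's actual API.

```rust
/// Hypothetical stand-in for `EmbedInput`: one string or many.
#[derive(Debug, Clone, PartialEq)]
pub enum EmbedInput {
    Single(String),
    Multiple(Vec<String>),
}

impl From<&str> for EmbedInput {
    fn from(s: &str) -> Self {
        EmbedInput::Single(s.to_string())
    }
}

impl From<Vec<&str>> for EmbedInput {
    fn from(items: Vec<&str>) -> Self {
        EmbedInput::Multiple(items.into_iter().map(String::from).collect())
    }
}

/// Minimal JSON string escaping (quotes, backslashes, control characters).
/// Illustrative helper only; a real implementation would rely on serde_json.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Builds the JSON body for POST /api/embed (hypothetical helper).
pub fn embed_request_body(model: &str, input: &EmbedInput) -> String {
    let input_json = match input {
        EmbedInput::Single(s) => json_string(s),
        EmbedInput::Multiple(items) => {
            let parts: Vec<String> = items.iter().map(|s| json_string(s)).collect();
            format!("[{}]", parts.join(","))
        }
    };
    format!("{{\"model\":{},\"input\":{}}}", json_string(model), input_json)
}
```

Modeling the input as a two-variant enum lets callers pass a single string or a batch through the same method while the serialized field stays a plain string or array.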

OllamaClient now applies a 30-second connection timeout by default,
so requests to an unreachable server fail fast instead of blocking
indefinitely. No
request timeout is set since LLM responses can legitimately run for
minutes during model loading or long generations.

Added OllamaClient::builder() for custom configuration:

    OllamaClient::builder("http://localhost:11434")
        .connection_timeout(Duration::from_secs(60))
        .build();
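
The builder behaviour described above can be sketched roughly as follows. This is a standalone model of the configuration only, not the crate's code: ClientConfig, ClientConfigBuilder, and the request_timeout field are hypothetical names, and the real OllamaClient presumably hands these values to its underlying HTTP client.

```rust
use std::time::Duration;

/// Hypothetical configuration produced by the builder.
#[derive(Debug, Clone, PartialEq)]
pub struct ClientConfig {
    pub base_url: String,
    /// Upper bound on establishing the TCP connection.
    pub connection_timeout: Duration,
    /// `None` means no overall request timeout, so long generations
    /// and model loading are not cut off.
    pub request_timeout: Option<Duration>,
}

/// Hypothetical builder mirroring the call chain in this commit message.
pub struct ClientConfigBuilder {
    base_url: String,
    connection_timeout: Duration,
}

impl ClientConfigBuilder {
    /// Starts from the default described above: a 30-second connection timeout.
    pub fn new(base_url: impl Into<String>) -> Self {
        Self {
            base_url: base_url.into(),
            connection_timeout: Duration::from_secs(30),
        }
    }

    /// Overrides the connection timeout.
    pub fn connection_timeout(mut self, timeout: Duration) -> Self {
        self.connection_timeout = timeout;
        self
    }

    /// Finalizes the configuration; the request timeout is left unset.
    pub fn build(self) -> ClientConfig {
        ClientConfig {
            base_url: self.base_url,
            connection_timeout: self.connection_timeout,
            request_timeout: None,
        }
    }
}
```

Keeping the connection timeout separate from any request timeout is what allows a dead server to be detected quickly without capping how long a healthy server may take to respond.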
Also updated README.md to document the builder API, default()
constructor, tool_response return type change, and think support
in ChatRequest.