OllamaFlow.Core
1.0.2
Install with any of the following:

- .NET CLI: dotnet add package OllamaFlow.Core --version 1.0.2
- Package Manager: NuGet\Install-Package OllamaFlow.Core -Version 1.0.2
- PackageReference: <PackageReference Include="OllamaFlow.Core" Version="1.0.2" />
- Central Package Management: <PackageVersion Include="OllamaFlow.Core" Version="1.0.2" /> plus <PackageReference Include="OllamaFlow.Core" />
- Paket CLI: paket add OllamaFlow.Core --version 1.0.2
- Script & Interactive: #r "nuget: OllamaFlow.Core, 1.0.2"
- File-based apps: #:package OllamaFlow.Core@1.0.2
- Cake Addin: #addin nuget:?package=OllamaFlow.Core&version=1.0.2
- Cake Tool: #tool nuget:?package=OllamaFlow.Core&version=1.0.2
OllamaFlow
<div align="center"> <img src="https://github.com/jchristn/ollamaflow/blob/main/assets/icon.png?raw=true" width="200" height="184" alt="OllamaFlow">
Intelligent Load Balancing and Model Orchestration for Ollama
🚀 Scale Your Ollama Infrastructure
OllamaFlow is a lightweight, intelligent orchestration layer that transforms multiple Ollama instances into a unified, high-availability AI inference cluster. Whether you're scaling AI workloads across multiple GPUs or ensuring zero-downtime model serving, OllamaFlow has you covered.
Why OllamaFlow?
- 🎯 Multiple Virtual Endpoints: Create multiple frontend endpoints, each mapping to their own set of Ollama backends
- ⚖️ Smart Load Balancing: Distribute requests intelligently across healthy backends
- 🔄 Automatic Model Sync: Ensure all backends have the required models - automatically
- ❤️ Health Monitoring: Real-time health checks with configurable thresholds
- 📊 Zero Downtime: Seamlessly handle backend failures without dropping requests
- 🛠️ RESTful Admin API: Full control through a comprehensive management API
🎨 Key Features
Load Balancing
- Round-robin and random distribution strategies (see the selection sketch after this list)
- Request routing based on backend health and capacity
- Automatic failover for unhealthy backends
- Configurable rate limiting per backend
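As a mental model only (this is not OllamaFlow's internal code), health-aware round-robin and random selection look roughly like the sketch below; the Backend record and its field names are assumptions made for illustration.

```csharp
// Illustrative sketch of health-aware backend selection; not OllamaFlow's implementation.
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public sealed record Backend(string Identifier, string Hostname, int Port, bool Healthy);

public sealed class BackendSelector
{
    private int _counter = -1;

    // Round-robin: rotate through the currently healthy backends.
    public Backend? NextRoundRobin(IReadOnlyList<Backend> backends)
    {
        List<Backend> healthy = backends.Where(b => b.Healthy).ToList();
        if (healthy.Count == 0) return null;   // no healthy backend: the caller must fail the request

        int index = Interlocked.Increment(ref _counter) & int.MaxValue;
        return healthy[index % healthy.Count];
    }

    // Random: pick any healthy backend.
    public Backend? NextRandom(IReadOnlyList<Backend> backends)
    {
        List<Backend> healthy = backends.Where(b => b.Healthy).ToList();
        return healthy.Count == 0 ? null : healthy[System.Random.Shared.Next(healthy.Count)];
    }
}
```

Because unhealthy backends are filtered out before selection, failover falls out of the same loop: a request never reaches a backend that is currently marked unhealthy.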
Model Management
- Automatic model discovery across all backends
- Intelligent synchronization - pulls missing models automatically (sketched after this list)
- Dynamic model requirements - update required models on the fly
- Parallel downloads with configurable concurrency
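Conceptually, synchronization is a diff between a frontend's required models and what each backend already reports. The sketch below is illustrative rather than OllamaFlow's own code; it talks directly to the standard Ollama /api/tags and /api/pull endpoints (both listed in the API compatibility section below), and the JSON field names are those of the public Ollama API.

```csharp
// Conceptual model-sync sketch against the standard Ollama HTTP API; not OllamaFlow's internals.
using System;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Json;
using System.Text.Json;
using System.Threading.Tasks;

public static class ModelSync
{
    // Pull every required model that the backend does not already list in /api/tags.
    public static async Task SyncAsync(string backendBaseUrl, string[] requiredModels)
    {
        using var http = new HttpClient
        {
            BaseAddress = new Uri(backendBaseUrl),
            Timeout = System.Threading.Timeout.InfiniteTimeSpan   // model pulls can take a long time
        };

        using JsonDocument tags = JsonDocument.Parse(await http.GetStringAsync("/api/tags"));

        var installed = tags.RootElement.GetProperty("models")
            .EnumerateArray()
            .Select(m => m.GetProperty("name").GetString() ?? string.Empty)
            .ToHashSet(StringComparer.OrdinalIgnoreCase);

        foreach (string model in requiredModels)
        {
            // Ollama usually reports tags as "name:tag", e.g. "llama3:latest".
            if (installed.Contains(model) || installed.Contains(model + ":latest")) continue;

            // "stream": false makes /api/pull return a single response instead of progress lines.
            using HttpResponseMessage resp = await http.PostAsJsonAsync(
                "/api/pull", new { name = model, stream = false });
            resp.EnsureSuccessStatusCode();
        }
    }
}
```

Parallel downloads then amount to running this kind of sync for several backends or models concurrently, bounded by the configured concurrency.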
High Availability
- Real-time health monitoring with customizable check intervals (see the probe sketch after this list)
- Automatic failover for unhealthy backends
- Request queuing during high load
- Connection pooling for optimal performance
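The interplay of check interval and failure threshold (see UnhealthyThreshold in the backend configuration below) can be pictured as a probe loop that marks a backend unhealthy only after a configurable number of consecutive failed checks. A minimal sketch, not the actual monitor:

```csharp
// Minimal health-probe sketch: a backend becomes unhealthy after N consecutive failed checks.
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public sealed class HealthMonitor
{
    private readonly HttpClient _http = new() { Timeout = TimeSpan.FromSeconds(5) };

    public bool IsHealthy { get; private set; } = true;

    public async Task RunAsync(Uri healthCheckUrl, int unhealthyThreshold,
                               TimeSpan interval, CancellationToken token)
    {
        int consecutiveFailures = 0;

        while (!token.IsCancellationRequested)
        {
            bool ok;
            try
            {
                using HttpResponseMessage resp = await _http.GetAsync(healthCheckUrl, token);
                ok = resp.IsSuccessStatusCode;
            }
            catch (HttpRequestException) { ok = false; }   // connection refused, DNS failure, etc.
            catch (TaskCanceledException) when (!token.IsCancellationRequested) { ok = false; }   // timeout

            consecutiveFailures = ok ? 0 : consecutiveFailures + 1;
            IsHealthy = consecutiveFailures < unhealthyThreshold;

            try { await Task.Delay(interval, token); }
            catch (OperationCanceledException) { break; }
        }
    }
}
```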
Enterprise Ready
- Bearer token authentication for admin APIs
- Comprehensive logging with syslog support
- Docker and Docker Compose ready
- SQLite database for configuration persistence
🏃 Quick Start
Using Docker (Recommended)
```bash
# Pull the image
docker pull jchristn/ollamaflow

# Run with default configuration
docker run -d \
  -p 43411:43411 \
  -v $(pwd)/ollamaflow.json:/app/ollamaflow.json \
  jchristn/ollamaflow
```
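Once the container is running, point any Ollama client at port 43411 instead of 11434. As a quick smoke test from .NET, the snippet below sends a non-streaming request to /api/generate through the frontend; it assumes a model named llama3 is already available on (or being synced to) the backends.

```csharp
// Smoke test: send a standard Ollama generate request through the OllamaFlow frontend.
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

class QuickCheck
{
    static async Task Main()
    {
        using var http = new HttpClient { BaseAddress = new Uri("http://localhost:43411") };

        using HttpResponseMessage resp = await http.PostAsJsonAsync("/api/generate", new
        {
            model = "llama3",                  // assumed to exist on the backends
            prompt = "Why is the sky blue?",
            stream = false                     // return a single JSON object instead of a stream
        });

        Console.WriteLine(await resp.Content.ReadAsStringAsync());
    }
}
```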
Using .NET
```bash
# Clone the repository
git clone https://github.com/jchristn/ollamaflow.git
cd ollamaflow/src

# Build and run
dotnet build
cd OllamaFlow.Server/bin/Debug/net8.0
dotnet OllamaFlow.Server.dll
```
⚙️ Configuration
OllamaFlow uses a simple JSON configuration file. Here's a minimal example:
```json
{
  "Webserver": {
    "Hostname": "localhost",
    "Port": 43411
  },
  "Logging": {
    "MinimumSeverity": "Info",
    "ConsoleLogging": true
  }
}
```
Frontend Configuration
Frontends define your virtual Ollama endpoints:
```json
{
  "Identifier": "main-frontend",
  "Name": "Production Ollama Frontend",
  "Hostname": "*",
  "LoadBalancing": "RoundRobin",
  "Backends": ["gpu-1", "gpu-2", "gpu-3"],
  "RequiredModels": ["llama3", "mistral", "codellama"]
}
```
Backend Configuration
Backends represent your actual Ollama instances:
```json
{
  "Identifier": "gpu-1",
  "Name": "GPU Server 1",
  "Hostname": "192.168.1.100",
  "Port": 11434,
  "MaxParallelRequests": 4,
  "HealthCheckUrl": "/",
  "UnhealthyThreshold": 2
}
```
📡 API Compatibility
OllamaFlow is fully compatible with the Ollama API (a drop-in client example follows this list), supporting:

- ✅ /api/generate - Text generation
- ✅ /api/chat - Chat completions
- ✅ /api/pull - Model pulling
- ✅ /api/push - Model pushing
- ✅ /api/show - Model information
- ✅ /api/tags - List models
- ✅ /api/ps - Running models
- ✅ /api/embed - Embeddings
- ✅ /api/delete - Model deletion
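In practice this means an existing Ollama client only needs its base URL switched from a single Ollama instance (port 11434) to the OllamaFlow frontend (port 43411 in the examples above). For example, a standard /api/chat request is unchanged:

```csharp
// The same Ollama /api/chat request works against a single instance or an OllamaFlow frontend.
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

class ChatExample
{
    static async Task Main()
    {
        // Only the base URL differs from talking to Ollama directly.
        using var http = new HttpClient { BaseAddress = new Uri("http://localhost:43411") };

        using HttpResponseMessage resp = await http.PostAsJsonAsync("/api/chat", new
        {
            model = "llama3",
            stream = false,
            messages = new[]
            {
                new { role = "user", content = "Summarize what a load balancer does." }
            }
        });

        Console.WriteLine(await resp.Content.ReadAsStringAsync());
    }
}
```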
🔧 Advanced Features
Multi-Node Testing
Test with multiple Ollama instances using Docker Compose:
```bash
cd Docker
docker compose -f compose-ollama.yaml up -d
```
This spins up 4 Ollama instances on ports 11435-11438 for testing.
Admin API
Manage your cluster programmatically:
```bash
# List all backends
curl -H "Authorization: Bearer your-token" \
     http://localhost:43411/v1.0/backends

# Add a new backend
curl -X PUT \
     -H "Authorization: Bearer your-token" \
     -H "Content-Type: application/json" \
     -d '{"Identifier": "gpu-4", "Hostname": "192.168.1.104", "Port": 11434}' \
     http://localhost:43411/v1.0/backends
```
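The same administration can be done from .NET. The snippet below simply mirrors the two curl calls above; the bearer token is a placeholder, and the backend fields follow the backend configuration shown earlier.

```csharp
// Mirrors the curl examples above: list backends, then register a new one via the admin API.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Net.Http.Json;
using System.Threading.Tasks;

class AdminApiExample
{
    static async Task Main()
    {
        using var http = new HttpClient { BaseAddress = new Uri("http://localhost:43411") };
        http.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", "your-token");   // placeholder token

        // List all backends
        Console.WriteLine(await http.GetStringAsync("/v1.0/backends"));

        // Add a new backend (same body as the curl example)
        using HttpResponseMessage resp = await http.PutAsJsonAsync("/v1.0/backends", new
        {
            Identifier = "gpu-4",
            Hostname = "192.168.1.104",
            Port = 11434
        });
        Console.WriteLine((int)resp.StatusCode);
    }
}
```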
A complete Postman collection (OllamaFlow.postman_collection.json) is included in the repository root, with examples for every endpoint, covering both the Ollama-compatible API and the administrative API.
🤝 Contributing
We welcome contributions! Whether it's:
- 🐛 Bug fixes
- ✨ New features
- 📚 Documentation improvements
- 💡 Feature requests
Please check out our Contributing Guidelines and feel free to:
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
📊 Performance
OllamaFlow adds minimal overhead to your Ollama requests:
- < 1ms routing decision time
- Negligible memory footprint (~50MB)
- High throughput - handles thousands of requests per second
- Efficient streaming support for real-time responses
🛡️ Security
- Bearer token authentication for administrative APIs
- Request source IP forwarding for audit trails
- Configurable request size limits
- No external dependencies for core functionality
🌟 Use Cases
- GPU Cluster Management: Distribute AI workloads across multiple GPU servers
- CPU Infrastructure: Perfect for dense CPU systems like Ampere processors
- High Availability: Ensure your AI services stay online 24/7
- Development & Testing: Easily switch between different model configurations
- Cost Optimization: Maximize hardware utilization across your infrastructure
- Multi-Tenant Scenarios: Isolate workloads while sharing infrastructure
📜 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- The Ollama team for creating an amazing local AI runtime
- All our contributors and users who make this project possible
<div align="center"> <b>Ready to scale your AI infrastructure?</b><br> Get started with OllamaFlow today! </div>
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
Dependencies (net8.0):

- ExpressionTree (>= 1.1.2)
- RestWrapper (>= 3.1.5)
- SyslogLogging (>= 2.0.8)
- Watson (>= 6.3.10)
- WatsonORM.Sqlite (>= 3.0.14)
Release notes: Initial release.