🤖 Self-Hosted AI Platform

iNdex.ai

Your data never leaves your servers.
Your intelligence. Your infrastructure. Your rules.

Llama 3.1
CodeLlama
Mistral
Phi-3
🔒 100% LOCAL — ZERO CLOUD
llama3.1:8b
Models Supported
Avg First Token
Context Window
100% Data Privacy

Neural Network Ecosystem

Swap between frontier open-source models in real time. No API keys. No cloud dependencies.

Clean Architecture

Four-layer design. Blazor frontend, CQRS application layer, pure domain, and Ollama-powered infrastructure.

01
Presentation
Blazor Server + MudBlazor UI components and REST API endpoints
Blazor MudBlazor JWT
02
Application
CQRS pattern. Commands, queries, handlers and business logic via MediatR
CQRS MediatR DTOs
03
Domain
Entities, interfaces, pure domain logic. Zero external dependencies
Entities Interfaces .NET 8
04
Infrastructure
Ollama AI engine, PostgreSQL, file system. Repository implementations
Ollama PostgreSQL Docker
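The command/query split in the Application layer can be illustrated with a minimal mediator. This is purely a sketch of the pattern: iNdex.ai itself uses C# records and MediatR handlers, so every name below (ChatQuery, ChatQueryHandler, Mediator) is hypothetical, shown in Python only for brevity.

```python
# Minimal CQRS-style mediator illustrating the Application-layer flow
# described above. Illustrative only -- iNdex.ai's real implementation
# is C# + MediatR; all names here are hypothetical.
from dataclasses import dataclass


@dataclass
class ChatQuery:
    """A read-side request: ask a model for a completion."""
    model: str
    prompt: str


class ChatQueryHandler:
    """Handles ChatQuery by delegating to an LLM client.
    In the real stack, the Infrastructure layer would supply an
    Ollama-backed client behind a Domain-layer interface."""

    def __init__(self, llm_client):
        self.llm_client = llm_client

    def handle(self, query: ChatQuery) -> str:
        return self.llm_client.generate(query.model, query.prompt)


class Mediator:
    """Routes each request type to its registered handler, keeping the
    Presentation layer decoupled from the handlers themselves."""

    def __init__(self):
        self._handlers = {}

    def register(self, request_type, handler):
        self._handlers[request_type] = handler

    def send(self, request):
        return self._handlers[type(request)].handle(request)
```

The point of the indirection is the same as MediatR's: the Presentation layer only ever calls `send()`, so handlers and infrastructure can change without touching the UI or API endpoints.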

Zero External Calls

Air-Gapped Operation
Runs fully offline. No internet required after installation.
Your Infrastructure
Deploy on-premise, private cloud, or bare metal GPU server.
GDPR & HIPAA Ready
Full compliance. Data never crosses organizational boundaries.
No Vendor Lock-in
Open standards. Swap models freely. Own your AI stack forever.

Raw Performance

Measured on a consumer-grade NVIDIA GPU. Results vary by hardware configuration.

Response Latency: avg first-token latency on a local GPU
Tokens / Second: avg generation speed with an 8B model
Context Window: max context tokens in extended mode

Real-Time Token Streaming

iNdex.ai — Streaming API Demo
# POST /api/chat/stream
curl -X POST http://localhost:5000/api/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.1:8b","prompt":"Explain neural networks"}'
↓ streaming response ...
Server-Sent Events
See each token as it's generated. No waiting for full completion. Native SSE protocol.
SSE
REST + WebSocket
Both sync and streaming endpoints. Integrate with any client in any language.
REST
Semantic Kernel
Microsoft's AI orchestration framework. Build agents, chains, and plugins natively in .NET.
SK

One Command Deploy

Three paths to production. Docker, native .NET, or GPU-accelerated bare metal.

DOCKER
Docker Compose
Spin up the full stack in one command. Postgres, Ollama, and the API fully orchestrated.
$ docker-compose up -d
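A compose file for the stack described above might look like the following sketch. Service names, images, ports, and volumes are illustrative assumptions, not the project's actual docker-compose.yml.

```yaml
# Illustrative only -- not the actual iNdex.ai compose file.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # placeholder credential
    volumes:
      - pgdata:/var/lib/postgresql/data
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama       # model weights persist here
  api:
    build: .
    ports:
      - "5000:5000"
    depends_on: [postgres, ollama]
volumes:
  pgdata:
  ollama:
```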
.NET 8
Native .NET
ASP.NET Core Minimal APIs. Fast, modern, cross-platform across Linux, Windows, and macOS.
$ dotnet run --project iNdex.Api
NVIDIA GPU
GPU Accelerated
NVIDIA CUDA via Ollama + LLamaSharp. 16GB RAM minimum. GPU optional but recommended.
$ ollama pull llama3.1:8b

Own Your AI.
Own Your Future.

Self-hosted. Privacy-first. Enterprise-ready. iNdex.ai puts the power of large language models entirely within your control.