Serverless · Cloudflare Workers · D1

One API. Seven AI models.
Zero complexity.

A production-ready REST orchestration layer that unifies Claude, Gemini, GPT-4.1, Mistral, Grok, Llama, and Perplexity behind a single endpoint — with fallback, memory, search, and cost tracking built in.

Claude Gemini GPT-4.1 Mistral Grok Llama / Groq Perplexity

Everything you need. Nothing you don't.

Built on Cloudflare Workers and D1 — globally distributed, serverless, and ready for production from day one.

Unified endpoint

One POST /chat call to reach any of the seven models. Switch providers by changing a single parameter.

🔄

Automatic fallback

If the primary model fails, the system transparently retries with the next provider in the chain, so a single provider outage does not reach the caller as an error.
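
For readers curious how a fallback chain of this kind can work, here is a minimal sketch. It is not the orchestrator's actual source: the `ProviderCall` type, the `callWithFallback` function, and the result fields are hypothetical names chosen for illustration. It tries each provider in order and reports whether a fallback was needed.

```typescript
// Hypothetical signature for "send a prompt to one provider".
// The real orchestrator's internals are not shown on this page.
type ProviderCall = (prompt: string) => Promise<string>;

interface Provider {
  name: string;
  call: ProviderCall;
}

interface FallbackResult {
  text: string;
  modelUsed: string;
  fallbackTriggered: boolean; // true when an earlier provider failed
  errors: string[];           // one entry per failed provider
}

// Try each provider in chain order and return the first successful reply.
async function callWithFallback(
  prompt: string,
  chain: readonly Provider[],
): Promise<FallbackResult> {
  const errors: string[] = [];
  for (const provider of chain) {
    try {
      const text = await provider.call(prompt);
      return {
        text,
        modelUsed: provider.name,
        fallbackTriggered: errors.length > 0,
        errors,
      };
    } catch (err) {
      // Record the failure and move on to the next provider.
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${message}`);
    }
  }
  // Every provider failed, so the caller has to see an error.
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

Note that when every provider in the chain fails, an error is still the only honest outcome; the chain reduces the chance of that, it does not remove it.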

🧠

Session memory

Each conversation session persists its last 10 turns in Cloudflare D1. Context follows the user across requests.
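
The 10-turn window can be pictured as a simple sliding buffer. The sketch below is illustrative only: it assumes one "turn" means a user prompt plus the model's reply, and the `Turn` type and `rememberTurn` function are hypothetical names. It leaves out the D1 storage layer and shows just the trimming logic.

```typescript
// Assumption: one "turn" is a user prompt plus the model's reply.
interface Turn {
  prompt: string;
  reply: string;
}

// Window size stated on this page.
const MAX_TURNS = 10;

// Return a new history with the turn appended,
// keeping only the newest `maxTurns` entries.
function rememberTurn(
  history: readonly Turn[],
  turn: Turn,
  maxTurns: number = MAX_TURNS,
): Turn[] {
  if (maxTurns <= 0) return [];
  return [...history, turn].slice(-maxTurns);
}
```

Because the function returns a new array instead of mutating its input, the same logic can run before a write to storage without side effects.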

🔍

Web search

Enable real-time search on any request with search: true. Powered by Tavily — up to 5 agentic tool calls per response.

💰

Cost tracking

Every call logs token counts and USD cost per model, per session, and per API key. Query totals via GET /usage.
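
As a rough illustration of how per-call cost can be derived from token counts, consider the sketch below. The record shape, the function names, and the price table are all assumptions made for this example; they are not the product's schema, and real provider prices must be supplied by the operator.

```typescript
// Hypothetical shape of one logged call.
interface UsageRecord {
  apiKey: string;
  sessionId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// USD per one million tokens. Values are supplied by the caller;
// nothing here reflects real provider pricing.
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

type PriceTable = Record<string, ModelPrice>;

// Cost of a single call in USD.
function costUsd(record: UsageRecord, prices: PriceTable): number {
  const price = prices[record.model];
  if (!price) {
    throw new Error(`No price configured for model "${record.model}"`);
  }
  return (
    (record.inputTokens / 1_000_000) * price.inputPerMTok +
    (record.outputTokens / 1_000_000) * price.outputPerMTok
  );
}

type Dimension = "apiKey" | "sessionId" | "model";

// Sum USD cost grouped by model, session, or API key.
function totalsBy(
  records: readonly UsageRecord[],
  dimension: Dimension,
  prices: PriceTable,
): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const record of records) {
    const key = record[dimension];
    totals[key] = (totals[key] ?? 0) + costUsd(record, prices);
  }
  return totals;
}
```

Grouping by a single `dimension` argument mirrors the three views named above (per model, per session, per API key) with one function.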

🛡️

EU / US data sovereignty

All integrated providers are hosted in the United States or European Union. No Chinese-jurisdiction infrastructure in the chain.

One request.
Any model.

Switch between providers without changing your integration. The orchestrator handles routing, fallback, and context automatically.

curl · aiorchestrator.gntkh.com
curl -X POST \
  https://aiorchestrator.gntkh.com/chat \
  -H "Authorization: Bearer orc_..." \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What happened in markets today?",
    "modelTarget": "perplexity",
    "search": true,
    "session_id": "my-session-001"
  }'

# Response includes citations, token counts,
# USD cost, and fallback info if triggered.
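
The same call can be made from TypeScript. The sketch below only builds the request, using the field names from the curl example above (`prompt`, `modelTarget`, `search`, `session_id`). The `buildChatRequest` helper is a hypothetical name, and the response shape is not documented on this page, so it is left untyped.

```typescript
// Request body fields taken from the curl example above.
interface ChatRequest {
  prompt: string;
  modelTarget: string;
  search?: boolean;
  session_id?: string;
}

interface BuiltRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

// Hypothetical helper: assemble the URL and fetch options for POST /chat.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  body: ChatRequest,
): BuiltRequest {
  return {
    // Strip trailing slashes so the path is always exactly /chat.
    url: `${baseUrl.replace(/\/+$/, "")}/chat`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Usage (in an async context):
//   const { url, init } = buildChatRequest(
//     "https://aiorchestrator.gntkh.com",
//     "orc_...",
//     { prompt: "What happened in markets today?", modelTarget: "perplexity", search: true },
//   );
//   const response = await fetch(url, init);
//   const data: unknown = await response.json();
```

Switching providers then means changing only the `modelTarget` value; accepted values other than `"perplexity"` are not listed on this page.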

Pay once. Own your infrastructure.

No subscriptions. You get the source code — deploy it yourself on Cloudflare with your own API keys.

Enterprise

Multi-tenant · custom branding · SLA

Custom
Contact us for a quote
  • Everything in Starter
  • Multi-tenant architecture (one worker, multiple clients)
  • Admin dashboard with usage-based billing
  • White-label branding
  • SLA + priority support
  • Custom model integrations on request