aurora-3.1-fast · ships today · read paper →

Inference, at the edge of light.

Run frontier models 4× faster than the open APIs. Stream tokens in 80ms. Pay only for what you spend, never what we provision.

Get started — free $50 credit $ curl aurora.dev/install.sh | sh

TRUSTED BY

LINEAR vercel FIGMA notion RAYCAST

Live · global

Tokens / sec +12%

142,847

P50 latency healthy

82ms

Active regions

us-east eu-west ap-se

PROTOCOL

Five lines of code. Production-ready.

OpenAI-compatible API. Drop-in replacement. Streaming, function calls, vision — all the things you'd expect, none of the rate limits.

stream.ts · aurora-sdk — 14:23

streaming

// 1. Import the SDK

import { Aurora } from 'aurora-sdk' ;

// 2. One client, every model

const ai = new Aurora ({ apiKey: process.env.AURORA_KEY });

// 3. Stream tokens — same shape as OpenAI

const stream = await ai.chat.stream({

model: 'aurora-3.1-fast',

messages: [{ role: 'user', content: 'Write a haiku about caching.' }]

});

// 4. Token by token, ~80ms first-byte

for await ( const chunk of stream) {

process.stdout.write(chunk.text);

}

→ output stream

Memory keeps the page
Quietly between two thoughts —
Cache hit, breathing slow.

82ms ttft 847 tok/s $0.00012 total

AT SCALE · LAST 30 DAYS

The numbers we're proud of.

requests

2.4B

served · 99.999% SLA

latency

82ms

median ttft · global

savings

74%

cheaper vs OpenAI direct

regions

14

data centers · 6 continents

PLATFORM

Everything you need. Nothing you don't.

01 / MODEL CATALOG

Every frontier model. One endpoint.

FAST

aurora-3.1

847 tok/s

SMART

aurora-3.1-pro

312 tok/s

VISION

aurora-3.1-v

multimodal

EMBED

aurora-embed-2

1536 dim

CODE

aurora-code

256k ctx

+24 more →

02 / PRICING

from $0.20 / 1M tokens

74% under open API price

03 / COMPATIBLE

Drop-in OpenAI replacement.

Change base_url. Ship.

04 / CACHING

Automatic prompt caching

90% cheaper · 4× faster on hits

05 / OBSERVABILITY

Per-request traces

06 / GET STARTED

Free $50 credit.

No card. 30 days.

FROM A USER, UNPROMPTED

"We migrated 14B monthly tokens to Aurora in a single afternoon. Bill dropped 71%. Latency fell to a third. I keep waiting for the catch."

Eleni Vassilakis

CTO · Mantle · ex-OpenAI

Stop renting GPUs. Start shipping intelligence.

Most teams move their first request through Aurora in under 4 minutes. Some do it in 90 seconds.

Get API key — free $50 credit Read the docs first →