FREYA API

Cybernethicc FREYA

freya.cybernethicc.com

FREYA is an AI inference API that provides access to multiple model tiers through a single, OpenAI-compatible endpoint. Use it to add AI capabilities to your applications with simple HTTP requests.

OpenAI-compatible

Drop-in replacement using the standard /v1/chat/completions format.

Three model tiers

Nano, Pro, and Big Brain for different use cases and budgets.

Pay-per-token

Usage is deducted from your billing account balance, with no minimum commitment.

Streaming support

Server-sent events (SSE) for real-time token delivery.

Base URL

https://freya.cybernethicc.com/v1

Content Type

application/json

Getting Started

Set up your account and make your first AI request.

Create an account

Top up your balance

Add funds to your billing account. FREYA usage is deducted from your account balance in real time.

Generate an API key

Navigate to Services > FREYA > API Keys in the portal. Click Create Key and save the key securely -- it is shown only once.

Make your first request

Use the API key as a Bearer token and call the chat completions endpoint.

Authentication

All inference requests are authenticated with your FREYA API key. Include it as a Bearer token in the Authorization header.

http

Authorization: Bearer FREYA-YOUR_API_KEY

Keep your API key secret

Do not expose API keys in client-side code or public repositories. Each key is shown only once, when it is created. If a key is compromised, revoke it immediately in the portal and create a new one.

API keys are prefixed with FREYA-. Requests with missing, invalid, or revoked keys receive a 401 Unauthorized response.
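
As a minimal sketch of the header described above, the helper below builds the request headers and rejects keys that lack the documented FREYA- prefix before any request is sent. The environment variable name FREYA_API_KEY and the function name auth_headers are assumptions made for illustration; they are not part of the FREYA API.

```python
import os

KEY_PREFIX = "FREYA-"  # prefix stated in the documentation


def auth_headers(api_key: str) -> dict:
    """Build request headers; reject keys without the documented prefix."""
    if not api_key.startswith(KEY_PREFIX):
        raise ValueError("FREYA API keys start with 'FREYA-'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# FREYA_API_KEY is a hypothetical variable name; reading the key from the
# environment keeps it out of source control.
api_key = os.environ.get("FREYA_API_KEY", "FREYA-YOUR_API_KEY")
headers = auth_headers(api_key)
```

A local prefix check only catches obvious mistakes such as pasting the wrong secret. The server still decides whether a key is valid and returns 401 otherwise.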

Model Tiers

FREYA offers three model tiers. Choose the right tier based on your task complexity and budget requirements.

FREYA Nano

freya-nano

Fast and affordable. Optimized for high-throughput tasks like chat, Q&A, summarization, and simple generation.

Lowest latency (~200ms)
Cost-effective for high volume
Great for chatbots and simple Q&A

Rp 10,000 / 1M tokens

FREYA Pro

freya-pro

Balanced quality and speed. Great for coding assistance, content generation, data analysis, and structured output.

Balanced speed and quality
Excellent for code generation
Strong analytical capabilities

Rp 25,000 / 1M tokens

FREYA Big Brain

freya-bigbrain

Most capable model. Designed for complex reasoning, deep analysis, research, and tasks requiring nuanced understanding.

Superior reasoning ability
Best for complex analysis
Highest output quality

Rp 750,000 / 1M tokens

List Models

Retrieve the list of available models. No authentication required.

http

GET https://freya.cybernethicc.com/v1/models
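
The request above can be sketched in Python with only the standard library. The function names are illustrative. The shape of the response body is not documented here, so the sketch returns the decoded JSON without assuming any fields.

```python
import json
import urllib.request

BASE_URL = "https://freya.cybernethicc.com/v1"


def build_models_request() -> urllib.request.Request:
    """Prepare a GET request for /models; no Authorization header is set."""
    return urllib.request.Request(f"{BASE_URL}/models", method="GET")


def fetch_models() -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(build_models_request(), timeout=10) as resp:
        return json.load(resp)
```

Calling fetch_models() performs the actual HTTP request. build_models_request() alone is enough to inspect the URL and method offline.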

Chat Completions

Create a chat completion by sending a conversation history to the model. The API follows the OpenAI chat completions format.

POST https://freya.cybernethicc.com/v1/chat/completions

Request Body

Parameter	Type	Required	Description
model	string	Required	Model tier: `freya-nano`, `freya-pro`, or `freya-bigbrain`
messages	array	Required	Array of message objects with `role` (system, user, assistant) and `content`
temperature	number	Optional	Sampling temperature between 0 and 2. Default: 1.0
max_tokens	integer	Optional	Maximum number of tokens to generate. Default: 4096
stream	boolean	Optional	If true, returns a stream of server-sent events. Default: false
top_p	number	Optional	Nucleus sampling. Use either temperature or top_p, not both.
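
To illustrate how the parameters in the table fit together, here is a hedged sketch of a client-side payload builder. The function name and the validation rules are assumptions of this example. In particular, it treats "temperature or top_p, not both" as a hard error and requires at least one message, which the API itself may not enforce.

```python
VALID_MODELS = {"freya-nano", "freya-pro", "freya-bigbrain"}
VALID_ROLES = {"system", "user", "assistant"}


def build_chat_payload(model, messages, temperature=None, top_p=None,
                       max_tokens=None, stream=False):
    """Validate arguments against the parameter table and return a JSON-ready dict."""
    if model not in VALID_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if message.get("role") not in VALID_ROLES or "content" not in message:
            raise ValueError(f"invalid message: {message!r}")
    if temperature is not None and top_p is not None:
        raise ValueError("set temperature or top_p, not both")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload = {"model": model, "messages": list(messages), "stream": stream}
    # Optional fields are omitted when unset so the server applies its defaults.
    for name, value in (("temperature", temperature),
                        ("top_p", top_p),
                        ("max_tokens", max_tokens)):
        if value is not None:
            payload[name] = value
    return payload


payload = build_chat_payload(
    "freya-pro",
    [{"role": "user", "content": "Explain quantum computing in simple terms."}],
    temperature=0.7,
    max_tokens=1024,
)
```

Serialize the result with json.dumps(payload) and send it as the POST body.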

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1711680000,
  "model": "freya-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum bits (qubits) that can exist in multiple states simultaneously..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 32,
    "completion_tokens": 128,
    "total_tokens": 160
  }
}
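
A short sketch of reading the fields most callers need from a response like the one above: the assistant text, the finish reason, and the token usage. The helper name summarize_completion is hypothetical, and the sample body is a shortened copy of the documented response.

```python
import json

raw_body = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1711680000,
  "model": "freya-pro",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Quantum computing uses quantum bits (qubits)..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 32, "completion_tokens": 128, "total_tokens": 160}
}
"""


def summarize_completion(body: str) -> dict:
    """Extract the first choice's text, its finish reason, and total token usage."""
    data = json.loads(body)
    choice = data["choices"][0]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": data["usage"]["total_tokens"],
    }


summary = summarize_completion(raw_body)
```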

Streaming

Set stream: true to receive tokens as they are generated via server-sent events (SSE). Each event contains a JSON chunk with a delta field.

sse

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1711680000,"model":"freya-pro","choices":[{"index":0,"delta":{"role":"assistant","content":"Quantum"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1711680000,"model":"freya-pro","choices":[{"index":0,"delta":{"content":" computing"},"finish_reason":null}]}

data: [DONE]

The stream ends with a data: [DONE] message. The OpenAI Python and Node.js SDKs handle streaming automatically.
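
For clients that do not use an SDK, the event format above can be parsed by hand. The sketch below is a minimal example that assumes one JSON chunk per data: line, as in the sample. It skips other SSE fields and stops at [DONE]. The function name is illustrative.

```python
import json


def iter_sse_content(lines):
    """Yield the content fragments carried by 'data:' lines until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators and other SSE fields are ignored
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        delta = json.loads(data)["choices"][0]["delta"]
        if delta.get("content"):
            yield delta["content"]


sample_stream = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1711680000,"model":"freya-pro","choices":[{"index":0,"delta":{"role":"assistant","content":"Quantum"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1711680000,"model":"freya-pro","choices":[{"index":0,"delta":{"content":" computing"},"finish_reason":null}]}',
    "",
    "data: [DONE]",
]

text = "".join(iter_sse_content(sample_stream))
print(text)  # prints "Quantum computing"
```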

Code Examples

Basic Request

bash

curl https://freya.cybernethicc.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer FREYA-YOUR_API_KEY" \
  -d '{
    "model": "freya-pro",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

The Python and Node.js examples use the official openai SDK. Install with pip install openai or npm install openai. The Go example uses github.com/openai/openai-go.

Streaming Example

python

from openai import OpenAI

client = OpenAI(
    api_key="FREYA-YOUR_API_KEY",
    base_url="https://freya.cybernethicc.com/v1",
)

stream = client.chat.completions.create(
    model="freya-pro",
    messages=[
        {"role": "user", "content": "Write a short poem about the ocean."},
    ],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Rate Limits

Rate limits are applied per API key. The default limit is 60 requests per minute. When exceeded, the API returns a 429 Too Many Requests response.

Header	Description
X-RateLimit-Limit	Maximum requests per minute for this key
X-RateLimit-Remaining	Remaining requests in the current window
X-RateLimit-Reset	Unix timestamp when the rate limit resets

Need higher limits? Contact support or upgrade to GPU Dedicated.
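
One way a client might use these headers is to compute how long to pause before the next request. This is a sketch, not a prescribed retry policy. It assumes the headers arrive as a plain dict with the exact names from the table and that X-RateLimit-Reset is expressed in seconds. The function name is hypothetical.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return the wait time in seconds, or 0.0 if requests remain in the window."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


exhausted = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1711680030",
}
wait = seconds_until_reset(exhausted, now=1711680000)  # 30.0
```

Passing now explicitly keeps the function deterministic in tests. In production code, omit it so the current clock is used.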

Pricing

FREYA uses token-based pricing, billed per 1 million tokens. Usage is deducted from your billing account balance in real time after each request.

Model Tier	Model ID	Price per 1M Tokens
FREYA Nano	freya-nano	Rp 10,000
FREYA Pro	freya-pro	Rp 25,000
FREYA Big Brain	freya-bigbrain	Rp 750,000

How billing works

Both input (prompt) and output (completion) tokens are counted toward the total. After each request, the cost is calculated and deducted from your billing account balance. If your balance reaches zero, requests will return a 402 Payment Required error.
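
The billing arithmetic can be sketched from the price table and the usage object in a response. Rounding behavior is not documented, so this estimate keeps exact decimal values. The function name is illustrative.

```python
from decimal import Decimal

# Prices in rupiah per 1 million tokens, from the pricing table.
PRICE_PER_MILLION_RP = {
    "freya-nano": Decimal(10000),
    "freya-pro": Decimal(25000),
    "freya-bigbrain": Decimal(750000),
}


def estimate_cost_rp(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the cost of one request; prompt and completion tokens both count."""
    total_tokens = prompt_tokens + completion_tokens
    return PRICE_PER_MILLION_RP[model] * total_tokens / Decimal(1_000_000)


# 32 prompt + 128 completion tokens on freya-pro: 160 * 25,000 / 1,000,000 = Rp 4
cost = estimate_cost_rp("freya-pro", 32, 128)
```

The same 160 tokens would come to Rp 1.6 on freya-nano and Rp 120 on freya-bigbrain.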

GPU Dedicated

Dedicated GPU Instances

For high-volume or latency-sensitive workloads, FREYA offers dedicated GPU instances. Get guaranteed throughput, lower latency, and custom model deployments on reserved hardware.

Guaranteed compute with no shared contention
Custom model fine-tuning and deployment
Higher rate limits and priority support
Monthly billing with volume discounts


Error Handling

The API returns standard HTTP status codes and a JSON error body.

Error Response Format

Response

{
  "error": {
    "message": "Invalid or revoked API key",
    "type": "authentication_error"
  }
}

Status	Type	Description
400	invalid_request	Missing required parameters or invalid model name
401	authentication_error	Missing, invalid, or revoked API key
402	insufficient_balance	Billing account balance is zero. Top up to continue.
429	rate_limit_exceeded	Too many requests. Wait and retry.
500	provider_error	Internal server or upstream error. Retry with exponential backoff.
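
As a sketch of how a client might act on this table, the helpers below parse the error body and produce an exponential backoff schedule for the statuses the table marks as worth retrying (429 and 500). The function names, the retry count, and the delay values are assumptions of this example. Adding random jitter to each delay is a common refinement.

```python
import json

RETRYABLE_STATUSES = {429, 500}  # the table suggests retrying these


def parse_error(status: int, body: str) -> dict:
    """Extract type and message from the JSON error body and flag retryable statuses."""
    error = json.loads(body).get("error", {})
    return {
        "status": status,
        "type": error.get("type"),
        "message": error.get("message"),
        "retryable": status in RETRYABLE_STATUSES,
    }


def backoff_delays(max_retries=4, base=1.0, cap=30.0):
    """Return delays in seconds that double on each attempt, capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


info = parse_error(
    401,
    '{"error": {"message": "Invalid or revoked API key", "type": "authentication_error"}}',
)
delays = backoff_delays()  # [1.0, 2.0, 4.0, 8.0]
```

Errors such as 400, 401, and 402 will not succeed on retry until the request, key, or balance is fixed, so they are reported as not retryable.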


Cybernethicc FREYA API v1
