API Documentation

Complete reference for integrating with alAPI's LLM, OCR, and Retrieval services

v1 Base URL: https://alapi.deep.sa/v1

OpenAI Compatible: Use the official OpenAI SDK with our base URL. It is a drop-in replacement for existing applications.

LLM API

OpenAI-compatible API for chat completions and embeddings. Use your favorite models through a unified interface.

Authentication

All API requests require authentication using a Bearer token in the Authorization header.

Request Header:

header
Authorization: Bearer YOUR_API_KEY

API Key: Generate API keys from your Dashboard.

SDK Setup Example:

main.py
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://alapi.deep.sa/v1"
)
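The SDK attaches the Authorization header for you. If you are calling the API with a plain HTTP client instead, the same Bearer header works on every endpoint; a minimal stdlib sketch (the /models endpoint used here is documented later in this reference):

```python
import json
import urllib.request

BASE_URL = "https://alapi.deep.sa/v1"

def auth_headers(api_key: str) -> dict:
    """Headers required on every alAPI request."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

def list_models(api_key: str) -> dict:
    """Example: GET /v1/models with the bearer token attached."""
    req = urllib.request.Request(f"{BASE_URL}/models",
                                 headers=auth_headers(api_key))
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Any HTTP library works the same way; only the Authorization header matters.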

Chat Completions

Create Chat Completion

Creates a model response for the given chat conversation

POST

Endpoint:

endpoint
https://alapi.deep.sa/v1/chat/completions

Request Body:

request.json
{
  "model": "llama-3.3-70b-versatile",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the model to use |
| messages | array | Yes | Array of message objects with role and content |
| temperature | number | No | Sampling temperature (0-2). Default: 1 |
| max_tokens | integer | No | Maximum tokens to generate |
| stream | boolean | No | If true, returns a stream of events |
| top_p | number | No | Nucleus sampling parameter. Default: 1 |
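Only model and messages are required; the rest fall back to the defaults above. A small helper that assembles a request body with those defaults (a sketch for illustration, not part of any SDK):

```python
def build_chat_request(model, messages, temperature=1, max_tokens=None,
                       stream=False, top_p=1):
    """Assemble a /chat/completions request body, omitting unset optionals."""
    body = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "stream": stream,
        "top_p": top_p,
    }
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body
```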

Response (200 OK):

response.json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1706745000,
  "model": "llama-3.3-70b-versatile",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 10,
    "total_tokens": 30
  }
}

Code Examples:

main.py
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://alapi.deep.sa/v1"
)

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of Saudi Arabia?"}
    ],
    temperature=0.7,
    max_tokens=1024
)

print(response.choices[0].message.content)

Streaming Responses

Server-Sent Events (SSE)

Stream responses token by token for real-time output

SSE

How it works: Set stream: true in your request. The response will be sent as Server-Sent Events, with each chunk containing a delta of the response content.

Code Examples:

main.py
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://alapi.deep.sa/v1"
)

stream = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[
        {"role": "user", "content": "Write a short poem about coding"}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
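Under the hood, the response follows the conventional OpenAI-style SSE wire format: each event line is `data: <json chunk>`, terminated by a `data: [DONE]` sentinel (stated here as the standard convention; verify against your own responses). If you parse the stream without the SDK, a helper like this pulls the content delta out of one line:

```python
import json

def delta_from_sse_line(line: str):
    """Extract the content delta from one SSE data line, or None."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload.strip() == "[DONE]":
        return None  # end-of-stream sentinel
    chunk = json.loads(payload)
    return chunk["choices"][0]["delta"].get("content")
```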

Embeddings

Create Embeddings

Creates an embedding vector representing the input text

POST

Endpoint:

endpoint
https://alapi.deep.sa/v1/embeddings

Request Body:

request.json
{
  "model": "deep-sa/alEmbedding",
  "input": "The quick brown fox jumps over the lazy dog"
}

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the embedding model to use |
| input | string \| array | Yes | Text to embed. Can be a string or array of strings |
| encoding_format | string | No | Format for the embeddings: 'float' or 'base64'. Default: float |
| dimensions | integer | No | Number of dimensions for the output embeddings (model-dependent) |

Response (200 OK):

response.json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023064255, -0.009327292, ...]
    }
  ],
  "model": "deep-sa/alEmbedding",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}

Code Examples:

main.py
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://alapi.deep.sa/v1"
)

response = client.embeddings.create(
    model="deep-sa/alEmbedding",
    input="The quick brown fox jumps over the lazy dog"
)

embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
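Embedding vectors are typically compared with cosine similarity, e.g. for semantic search over the vectors returned above. A minimal pure-Python sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

In practice, compute embeddings for your documents once, then rank them by cosine similarity against the embedding of each incoming query.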

Available Models

List Models

Returns the list of currently available models

GET

Endpoint:

endpoint
https://alapi.deep.sa/v1/models

Required Scope: This endpoint requires an API key with the models scope.

Response (200 OK):

response.json
{
  "object": "list",
  "data": [
    {
      "id": "llama-3.3-70b-versatile",
      "object": "model",
      "created": 1706745000,
      "owned_by": "groq",
      "type": "llm"
    },
    {
      "id": "deep-sa/alEmbedding",
      "object": "model",
      "created": 1706745000,
      "owned_by": "openai",
      "type": "embedding"
    }
  ]
}

Code Examples:

main.py
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://alapi.deep.sa/v1"
)

models = client.models.list()
for model in models.data:
    print(f"{model.id} ({model.type})")

The following models are currently available through alAPI. Use the model name in your API requests.

| Model Name | Type | Provider | Avg Latency |
|---|---|---|---|
| deep-sa/alEmbedding | embedding | deepcloud | ~1717ms |
| deep-sa/alLLM | llm | deepcloud | ~2796ms |
| google/gemini-3-flash | llm | google_gemini | ~4841ms |
| google/gemini-3-pro | llm | google_gemini | ~18360ms |
| gpt-oss-120b | llm | groq | ~4264ms |
| gpt-oss-20b | llm | groq | ~536ms |
| llama-3.3-70b | llm | groq | ~194ms |
| llama-4-maverick-17b | llm | groq | ~791ms |
| opanai/gpt-5-mini | llm | openai | ~2187ms |
| qwen3-32b | llm | groq | ~2559ms |

Error Handling

The API uses standard HTTP status codes to indicate success or failure of requests.

Error Response Format:

error.json
{
  "error": {
    "message": "Error description",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
400 Bad Request - Invalid request parameters
401 Unauthorized - Missing or invalid API key
403 Forbidden - API key lacks required permissions
404 Not Found - Resource doesn't exist
429 Too Many Requests - Rate limit exceeded
500 Internal Server Error - Server-side error
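Clients typically treat 429 and 5xx as retryable (back off and try again) and everything else as fatal. A sketch of that policy, plus parsing the error envelope above (the retry set is a common convention, not an API guarantee):

```python
import json

RETRYABLE_STATUSES = {429, 500}

def should_retry(status_code: int) -> bool:
    """Retry on rate limits and server errors; fail fast otherwise."""
    return status_code in RETRYABLE_STATUSES

def parse_error(body: str) -> str:
    """Pull the human-readable message out of an alAPI error response."""
    return json.loads(body)["error"]["message"]
```

Pair `should_retry` with exponential backoff so repeated 429s do not hammer the rate limiter.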

Ready to Get Started?

Generate an API key from your dashboard and start building
