SaQL Reference

SaQL — SapixDB Query Language

SaQL is how you ask SapixDB for data. It comes in two tiers: structured queries (explicit JSON — deterministic, fast) and semantic queries (natural language — parsed by a heuristic planner, optionally backed by an LLM).

Tier 1
Semantic / Natural Language
POST /v1/query/semantic

Send plain English. The heuristic planner converts it to a Tier-2 plan. Optionally enhanced by an OpenAI-compatible LLM.

Tier 2
Structured SaQL
POST /v1/query

Explicit JSON query — you specify the type and parameters. No ambiguity, deterministic results, lowest latency.

Structured SaQL (Tier 2)

Send a POST request to /v1/query with a JSON body. Every query has a type field that selects the query shape.

type | Required fields | What it returns
latest | limit (default 10, max 1000) | The N most recently written records, newest first.
chain_head | (none) | The content hash of the current chain tip.
hash | content_hash (64-hex BLAKE3) | Point read: the single record with this hash.
time_range | from_ts, to_ts (u64 HLC) | All records whose HLC timestamp falls within the range, sorted ascending.
as_of | timestamp_hlc (u64 HLC), limit (default 10) | Time travel: up to limit records written at or before timestamp_hlc, oldest first.
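
The table above can be turned into a small client-side check that runs before a request is sent. The following is a minimal sketch, not SapixDB's own validation: the function name validate_query is hypothetical, and the lower bound of 1 on limit, the lowercase-only hex check, and the from_ts <= to_ts rule are assumptions the reference does not state.

```python
import re

HEX64 = re.compile(r"[0-9a-f]{64}")  # assumption: lowercase hex only
U64_MAX = 2**64 - 1


def _is_u64(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= U64_MAX


def validate_query(q: dict) -> dict:
    """Return a normalized copy of a structured query, or raise ValueError."""
    qtype = q.get("type")
    if qtype == "chain_head":
        return {"type": "chain_head"}
    if qtype == "latest":
        limit = q.get("limit", 10)  # documented default
        if not (_is_u64(limit) and 1 <= limit <= 1000):  # documented max; min is assumed
            raise ValueError("limit must be an integer in 1..1000")
        return {"type": "latest", "limit": limit}
    if qtype == "hash":
        content_hash = q.get("content_hash")
        if not (isinstance(content_hash, str) and HEX64.fullmatch(content_hash)):
            raise ValueError("content_hash must be 64 hex characters")
        return {"type": "hash", "content_hash": content_hash}
    if qtype == "time_range":
        lo, hi = q.get("from_ts"), q.get("to_ts")
        if not (_is_u64(lo) and _is_u64(hi) and lo <= hi):  # ordering rule is assumed
            raise ValueError("from_ts and to_ts must be u64 values with from_ts <= to_ts")
        return {"type": "time_range", "from_ts": lo, "to_ts": hi}
    if qtype == "as_of":
        ts, limit = q.get("timestamp_hlc"), q.get("limit", 10)  # documented default
        if not _is_u64(ts):
            raise ValueError("timestamp_hlc must be a u64 value")
        if not (_is_u64(limit) and limit >= 1):
            raise ValueError("limit must be a positive integer")
        return {"type": "as_of", "timestamp_hlc": ts, "limit": limit}
    raise ValueError(f"unknown query type: {qtype!r}")
```

Calling validate_query({"type": "latest"}) fills in the documented default limit, while an unknown type or an out-of-range limit raises ValueError before any network call is made.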

latest — fetch recent records

POST /v1/query
{
  "type": "latest",
  "limit": 20
}
response
{
  "records": [
    {
      "record_id": "a1b2c3d4-...",
      "content_hash": "e7f2a1b3...",
      "parent_hash":  "c4d5e6f7...",
      "timestamp_hlc": 1716400005120000,
      "payload_b64": "gqRuYW1lpUFsaWNlpGFnZRs=",
      "flags": 0
    },
    ...
  ]
}
HLC timestamps: Timestamps are stored as Hybrid Logical Clock (HLC) values, 64-bit integers whose upper 44 bits encode milliseconds since the Unix epoch. To convert to Unix milliseconds, use Math.floor(hlc / 2**20) in JavaScript or hlc >> 20 in most other languages.
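
A record from the response above can be unpacked client-side using only the standard library. This is a sketch under assumptions: decode_record is a hypothetical helper name, the field names are taken from the sample response, and the payload is left as raw bytes because decoding MessagePack requires a third-party library.

```python
import base64
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_record(rec: dict) -> dict:
    """Split a record into payload bytes and a wall-clock timestamp."""
    hlc = rec["timestamp_hlc"]
    wall_ms = hlc >> 20  # upper 44 bits: milliseconds since the Unix epoch
    return {
        "content_hash": rec["content_hash"],
        # Raw MessagePack bytes; decode them with a MessagePack library of your choice.
        "payload": base64.b64decode(rec["payload_b64"]),
        "wall_ms": wall_ms,
        # Integer arithmetic via timedelta avoids float rounding on the timestamp.
        "written_at": EPOCH + timedelta(milliseconds=wall_ms),
    }
```

The returned written_at value is a timezone-aware UTC datetime, which makes it straightforward to compare against application timestamps.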

hash — point read by content hash

POST /v1/query
{
  "type": "hash",
  "content_hash": "e7f2a1b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0f1"
}

Returns an array with one record if found, or an empty array if not. The content hash is the BLAKE3 hash of the raw MessagePack payload — identical bytes always produce the same hash.

time_range — scan a time window

POST /v1/query
{
  "type": "time_range",
  "from_ts": 1716400000000000,
  "to_ts":   1716400060000000
}

Returns all records written between from_ts and to_ts (inclusive), sorted by timestamp_hlc ascending. Both bounds are HLC values — multiply a Unix millisecond timestamp by 2^20 to convert: ms * 1048576.
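
Building the two bounds from wall-clock times might look like the sketch below. The helper names to_hlc and time_range_query are hypothetical. One design choice is an assumption rather than documented behaviour: because the bounds are inclusive and the low 20 bits of an HLC value carry a counter, the upper bound sets all counter bits so that every record written during the final millisecond is still covered.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
COUNTER_MASK = (1 << 20) - 1  # low 20 bits of an HLC value


def to_hlc(dt: datetime) -> int:
    """Convert a timezone-aware datetime to an HLC value with a zero counter."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    unix_ms = (dt - EPOCH) // timedelta(milliseconds=1)
    return unix_ms << 20  # same as unix_ms * 1048576


def time_range_query(start: datetime, end: datetime) -> dict:
    """Build a time_range body covering [start, end] at millisecond granularity."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    return {
        "type": "time_range",
        "from_ts": to_hlc(start),
        # Assumption: fill the counter bits so the whole last millisecond is included.
        "to_ts": to_hlc(end) | COUNTER_MASK,
    }
```

For example, time_range_query(now - timedelta(minutes=1), now) with a UTC-aware now produces a body that can be posted to /v1/query as-is.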

as_of — time travel

POST /v1/query
{
  "type": "as_of",
  "timestamp_hlc": 1716400030000000,
  "limit": 50
}

Returns up to limit records written at or before timestamp_hlc, oldest first. This lets you reconstruct the state of the strand at any past moment with no extra tooling.

chain_head — current tip

POST /v1/query
{
  "type": "chain_head"
}

Returns one record containing only content_hash — the BLAKE3 hash of the most recently written nucleotide. Use this to check synchronization between agents or to anchor a verification proof.
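
A synchronization check between two agents could compare their chain heads as sketched below. This assumes, without confirmation from the reference, that chain_head responses use the same {"records": [...]} envelope as the latest response; head_hash and in_sync are hypothetical helper names.

```python
from typing import Optional


def head_hash(response: dict) -> Optional[str]:
    """Extract content_hash from a chain_head response, or None if it is empty."""
    records = response.get("records", [])
    return records[0].get("content_hash") if records else None


def in_sync(response_a: dict, response_b: dict) -> bool:
    """Two agents are treated as in sync when both report the same chain tip."""
    a, b = head_hash(response_a), head_hash(response_b)
    return a is not None and a == b
```

Two empty strands report no head at all, so this sketch deliberately treats that case as not in sync rather than guessing.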

Semantic Queries (Tier 1)

The POST /v1/query/semantic endpoint accepts a plain-English string and converts it to a Tier-2 SaQL plan before executing it. The response includes the compiled plan so clients can cache or inspect it.

POST /v1/query/semantic
{
  "query": "show me the last 10 records"
}
response
{
  "source": "heuristic",
  "plan": { "type": "latest", "limit": 10 },
  "records": [ ... ]
}

The source field tells you how the plan was produced:

  • cache — exact same query was seen before; plan reused
  • heuristic — keyword pattern matched; no LLM call
  • model — OpenAI-compatible LLM produced the plan (requires SAPIX_LLM_URL)
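
Since the response carries the compiled plan, a client can keep its own plan cache and replay known phrases through the structured endpoint. The class below is a minimal sketch of that idea; PlanCache and its whitespace and case normalization are assumptions made for illustration, not part of any SapixDB SDK.

```python
from typing import Optional


class PlanCache:
    """Client-side map from natural-language phrases to compiled SaQL plans."""

    def __init__(self) -> None:
        self._plans: dict = {}

    @staticmethod
    def _key(text: str) -> str:
        # Assumption: phrases differing only in case or spacing share a plan.
        return " ".join(text.lower().split())

    def remember(self, query_text: str, response: dict) -> None:
        """Store the plan from a semantic response, whatever its source was."""
        plan = response.get("plan")
        if isinstance(plan, dict) and "type" in plan:
            self._plans[self._key(query_text)] = dict(plan)

    def lookup(self, query_text: str) -> Optional[dict]:
        """Return a copy of the cached plan, or None if the phrase is unknown."""
        plan = self._plans.get(self._key(query_text))
        return dict(plan) if plan is not None else None
```

A cached plan returned by lookup can be posted directly to /v1/query, which skips the planner entirely on repeat queries.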

Heuristic patterns (no LLM required)

The built-in heuristic planner understands the most common query shapes without any AI inference. These work with or without an LLM configured:

Natural language phrase | Compiled SaQL plan
"chain head" / "head" | {"type":"chain_head"}
"latest 20" / "last 20" / "recent 20" / "top 20" / "newest 20" | {"type":"latest","limit":20}
"recent records" (no number) | {"type":"latest","limit":10}
"as of 1716400005000" / "at timestamp 1716400005000" | {"type":"as_of","timestamp_hlc":1716400005000,"limit":100}
"records between 100 and 200" / "from 100 to 200" | {"type":"time_range","from_ts":100,"to_ts":200}
"hash e7f2a1..." (64-char hex after the word) | {"type":"hash","content_hash":"e7f2a1..."}
Unrecognized queries: If the heuristic planner cannot confidently interpret a phrase and no LLM is configured, the endpoint returns HTTP 422 Unprocessable Entity with a message suggesting you use the explicit POST /v1/query endpoint instead.
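
To make the phrase-to-plan mapping concrete, here is a toy keyword planner that reproduces the rows of the table above. It is an illustration only and not SapixDB's actual planner: the function name plan_heuristic, the regular expressions, and the order in which they are tried are all assumptions. Returning None stands in for the case where the endpoint would answer with HTTP 422.

```python
import re
from typing import Optional

RECENCY = r"(?:latest|last|recent|top|newest)"


def plan_heuristic(text: str) -> Optional[dict]:
    """Map a phrase to a SaQL plan, or None when no pattern matches."""
    t = " ".join(text.lower().split())

    m = re.search(r"\bhash\s+([0-9a-f]{64})\b", t)
    if m:
        return {"type": "hash", "content_hash": m.group(1)}

    m = re.search(r"\b(?:as of|at timestamp)\s+(\d+)\b", t)
    if m:
        return {"type": "as_of", "timestamp_hlc": int(m.group(1)), "limit": 100}

    m = (re.search(r"\bbetween\s+(\d+)\s+and\s+(\d+)\b", t)
         or re.search(r"\bfrom\s+(\d+)\s+to\s+(\d+)\b", t))
    if m:
        return {"type": "time_range", "from_ts": int(m.group(1)), "to_ts": int(m.group(2))}

    m = re.search(rf"\b{RECENCY}\s+(\d+)\b", t)
    if m:
        return {"type": "latest", "limit": int(m.group(1))}

    # No number given: fall back to the default limit of 10.
    if re.search(rf"\b{RECENCY}\s+records\b", t):
        return {"type": "latest", "limit": 10}

    if re.search(r"\bhead\b", t):
        return {"type": "chain_head"}

    return None  # caller would surface this as "unrecognized query"
```

Note that in this sketch a phrase such as "records from the last hour" matches none of the patterns, so it would need either the explicit endpoint or an LLM-backed planner.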

LLM mode (optional)

Set SAPIX_LLM_URL in your environment to enable an OpenAI-compatible language model as a fallback planner. When the heuristic fails, the agent sends the query to the model and validates the output against the SaQL schema — hallucinated structures are rejected.

docker-compose.yml — enable LLM planner
environment:
  SAPIX_LLM_URL: https://api.openai.com     # any OpenAI-compatible endpoint
  SAPIX_LLM_API_KEY: sk-...                 # optional — omit for open endpoints
  SAPIX_LLM_MODEL: gpt-4o-mini              # default if omitted
LLM is optional: SaQL structured queries and the heuristic planner work without any AI dependency. The LLM is used only when a natural-language query doesn't match a heuristic pattern. Production deployments typically handle 95% of queries with the heuristic alone.

Using SaQL from the SDKs

JavaScript / TypeScript

TypeScript
import { SapixClient } from "@sapixdb/sdk";

const db = new SapixClient({ url: "http://localhost:7475" });

// Structured — latest 5
const { records } = await db.query({ type: "latest", limit: 5 });

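// Structured: time range covering the last minute.
// HLC = Unix ms * 2^20. Read the clock once so both bounds share the same base.
const nowMs = Date.now();
const range = await db.query({
  type: "time_range",
  from_ts: (nowMs - 60_000) * 1048576,  // 1 minute ago
  to_ts:   nowMs * 1048576,
});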

// Semantic — natural language
const result = await db.querySemantic("show me the last 10 records");
console.log(result.source); // "heuristic"
console.log(result.plan);   // { type: "latest", limit: 10 }

Python

Python
import asyncio

from sapixdb_agent import SapixClient


async def main() -> None:
    async with SapixClient("http://localhost:7475") as db:
        # Structured: as_of (time travel)
        result = await db.query({
            "type": "as_of",
            "timestamp_hlc": 1_716_400_030_000_000,
            "limit": 50,
        })

        # Semantic
        sem = await db.query_semantic("last 20 records")
        print(sem.source)   # "heuristic"
        print(sem.plan)     # {"type": "latest", "limit": 20}
        print(sem.records)  # list[RecordView]


# "async with" is only valid inside a coroutine, so run it through asyncio.
asyncio.run(main())

Go

Go
ctx := context.Background()
client := sapixdb.New(sapixdb.Config{URL: "http://localhost:7475"})

// Structured: latest
res, err := client.Query(ctx, map[string]any{
    "type":  "latest",
    "limit": 10,
})
if err != nil {
    log.Fatal(err)
}

// Structured: hash
res, err = client.Query(ctx, map[string]any{
    "type":         "hash",
    "content_hash": "e7f2a1b3...",
})
if err != nil {
    log.Fatal(err)
}

// Semantic
sem, err := client.QuerySemantic(ctx, "show me recent records")
if err != nil {
    log.Fatal(err)
}

Direct HTTP (curl)

terminal — structured
curl -X POST http://localhost:7475/v1/query \
  -H "Content-Type: application/json" \
  -d '{"type":"latest","limit":5}'
terminal — semantic
curl -X POST http://localhost:7475/v1/query/semantic \
  -H "Content-Type: application/json" \
  -d '{"query":"records from the last hour"}'
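
The same two requests can be issued without an SDK from Python's standard library. The sketch below mirrors the curl commands above; build_request and run are hypothetical helper names, and the base URL is the localhost address used throughout this page.

```python
import json
import urllib.request


def build_request(base_url: str, body: dict, semantic: bool = False) -> urllib.request.Request:
    """Build a POST request for /v1/query or /v1/query/semantic."""
    path = "/v1/query/semantic" if semantic else "/v1/query"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and parse the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a running SapixDB agent):
#   run(build_request("http://localhost:7475", {"type": "latest", "limit": 5}))
#   run(build_request("http://localhost:7475", {"query": "last 20 records"}, semantic=True))
```

Non-2xx responses, including the 422 returned for unrecognized semantic queries, surface from urlopen as urllib.error.HTTPError, so callers that use the semantic endpoint may want to catch that exception.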

HLC Timestamp Conversion

SapixDB stores timestamps as Hybrid Logical Clock (HLC) values — a u64 where the upper 44 bits hold milliseconds since the Unix epoch and the lower 20 bits hold a monotonic counter for sub-ms ordering.

Conversion formulas
# Unix ms → HLC
hlc = unix_ms * 1_048_576          # multiply by 2^20

# HLC → Unix ms
unix_ms = hlc >> 20                 # right-shift by 20

# JavaScript (use BigInt: current HLC values exceed Number.MAX_SAFE_INTEGER)
const hlc = BigInt(Date.now()) << 20n;
const ms  = Number(hlc >> 20n);

# Python
hlc = int(time.time() * 1000) << 20
ms  = hlc >> 20

# Go
hlc := uint64(time.Now().UnixMilli()) << 20
ms  := hlc >> 20
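
The formulas above ignore the counter in the low 20 bits. A slightly fuller sketch that packs and unpacks both parts is shown here; pack_hlc and unpack_hlc are hypothetical names, and the bit widths follow the layout described in this section.

```python
COUNTER_BITS = 20
COUNTER_MASK = (1 << COUNTER_BITS) - 1  # 0xFFFFF
MAX_MS = (1 << 44) - 1                  # largest value the 44-bit millisecond field holds


def pack_hlc(unix_ms: int, counter: int = 0) -> int:
    """Combine Unix milliseconds and a sub-millisecond counter into one u64."""
    if not 0 <= unix_ms <= MAX_MS:
        raise ValueError("unix_ms does not fit in 44 bits")
    if not 0 <= counter <= COUNTER_MASK:
        raise ValueError("counter does not fit in 20 bits")
    return (unix_ms << COUNTER_BITS) | counter


def unpack_hlc(hlc: int) -> tuple:
    """Split an HLC value into (unix_ms, counter)."""
    return hlc >> COUNTER_BITS, hlc & COUNTER_MASK
```

Because the milliseconds occupy the high bits, comparing two HLC values as plain integers orders them first by wall-clock time and then by counter, which is why a sort on timestamp_hlc is sufficient.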