What this is

When your hosting platform doesn’t have a first-class connector (or you just want full control), Searchable accepts events directly from your application via two paths:

Middleware SDK

Drop-in Node.js package — @searchablehq/middleware. Wraps Next.js middleware with one line. The primitives also compose into Express, Fastify, or anything else on Node.

REST API

Direct HTTP POST to Searchable’s ingest endpoint. Use from any language — Go, Python, Ruby, PHP, a shell script, or a CDN worker we don’t yet support natively.
Both paths land in the same place. They share the same auth model (workspace API key + project site token) and feed the same dashboards.
No data is mocked. Events you POST flow through the same pipeline as our native Vercel, Cloudflare, and Netlify connectors, and the same server-side bot classifier filters non-AI user agents.

Which one should I use?

Next.js app: use the middleware SDK. One import, one config object, done. It plugs into middleware.ts and captures every non-static request automatically.
Express, Fastify, or another Node framework: use the middleware SDK’s primitives. The package exports buildEventPayload and sendEvent so you can wire it into any Node framework, typically as a request-completion hook or res.on("finish") listener.
Any other language or runtime: use the REST API. It’s a single POST /v1/events call with Authorization: Bearer sk_live_… and a JSON body. It is language-agnostic and needs no SDK.
Just testing the connection: start with a curl against the REST API using a fake AI user agent (e.g. GPTBot/1.0). The status strip in LLM Analytics → Setup flips to Connected within a few seconds. Once that’s green, switch to the middleware SDK or your real integration.
Custom CDN or edge worker: use the REST API from your worker. Cloudflare’s native /v1/cloudflare-logs endpoint has the same shape. If you’re forwarding NDJSON request logs from a Cloudflare-like worker, point them at /v1/cloudflare-logs instead and reuse the Cloudflare Worker setup.
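
The request-completion hook mentioned above can be sketched without any framework or SDK import. This is a minimal illustration, not the SDK's implementation: the RequestLike, ResponseLike, and CrawlEvent types and the trackOnFinish helper are hypothetical names, and the event field names are assumptions rather than the documented schema. In a real integration the emit callback would be where buildEventPayload and sendEvent come in.

```typescript
// Simplified structural types so the sketch stays framework-agnostic.
// Real Node/Express request objects are richer; adapt as needed.
interface RequestLike {
  method: string;
  url: string;
  headers: Record<string, string | undefined>;
}

interface ResponseLike {
  statusCode: number;
  on(event: "finish", listener: () => void): unknown;
}

// Hypothetical event shape; field names are illustrative only.
interface CrawlEvent {
  method: string;
  path: string;
  host: string;
  status: number;
  response_time_ms: number;
  user_agent: string;
  timestamp: string;
}

// Registers a "finish" listener that emits one event once the response
// has been sent, so status code and timing are final.
function trackOnFinish(
  req: RequestLike,
  res: ResponseLike,
  emit: (event: CrawlEvent) => void,
  now: () => number = Date.now,
): void {
  const startedAt = now();
  res.on("finish", () => {
    emit({
      method: req.method,
      path: req.url.split("?")[0] ?? "", // drop the query string
      host: req.headers["host"] ?? "",
      status: res.statusCode,
      response_time_ms: now() - startedAt,
      user_agent: req.headers["user-agent"] ?? "",
      timestamp: new Date(startedAt).toISOString(),
    });
  });
}
```

Emitting from the finish listener rather than at request start means the event is built only after the status code and response time are known.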

Common prerequisites

Both paths need the same two credentials.
Step 1: Site token (per project)

  1. Open your Searchable dashboard
  2. Go to LLM Analytics → Setup → Confirm your domain
  3. The private Site token for this project starts with st_… — copy it
The site token identifies which project events belong to. It’s tied to the project’s primary domain.
Searchable also issues a public site token (pst_…) for the browser beacon. That one is not the right one for server-side integrations — use the private st_… token here.
Step 2: Workspace API key

  1. In the same Setup page, click Custom as your crawler source
  2. Open the connector dialog → Generate API key
  3. Searchable creates a key with the log_events permission, scoped to the current project, and shows it once — copy it now
Keys start with sk_live_…. They’re signed JWT-style tokens: Searchable’s ingest worker verifies them at Cloudflare’s edge with no DB round-trip, so they don’t add per-request latency.
The API key is only shown once. If you lose it, revoke and regenerate from Settings → API Keys or the Custom connector dialog.
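
Because the three credential types differ only by prefix, a startup check can catch the common mistake of pasting the public pst_… token into a server-side integration. The sketch below relies only on the prefixes described above; classifyCredential and checkServerCredentials are hypothetical helper names, not part of the SDK.

```typescript
type CredentialKind =
  | "private_site_token"
  | "public_site_token"
  | "api_key"
  | "unknown";

// Classifies a credential purely by the prefixes named in this guide.
function classifyCredential(value: string): CredentialKind {
  if (value.startsWith("sk_live_")) return "api_key";
  if (value.startsWith("pst_")) return "public_site_token";
  if (value.startsWith("st_")) return "private_site_token";
  return "unknown";
}

// Returns a list of human-readable problems; an empty list means the
// pair looks usable for a server-side integration.
function checkServerCredentials(siteToken: string, apiKey: string): string[] {
  const problems: string[] = [];
  const siteKind = classifyCredential(siteToken);
  if (siteKind === "public_site_token") {
    problems.push("site token is the public pst_ browser token; use the private st_ token");
  } else if (siteKind !== "private_site_token") {
    problems.push("site token should start with st_");
  }
  if (classifyCredential(apiKey) !== "api_key") {
    problems.push("API key should start with sk_live_");
  }
  return problems;
}
```

A prefix check only guards against mix-ups. It says nothing about whether a key is valid or revoked, which only the ingest endpoint can decide.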

What ends up in Searchable

For each event you send, Searchable records:
  • HTTP method, path, host (query strings are stripped before storage)
  • Status code, response time, response bytes
  • User agent (used to classify the AI bot)
  • Referer and referrer domain
  • UTM parameters
  • Geo country (from the inbound IP, if available)
  • Timestamp
Cookies, request/response bodies, and full IP addresses are never stored. The middleware SDK anonymizes IPs by default (zeroes the last octet) before sending. The server-side classifier drops any user agent that isn’t a known AI crawler — so even if you POST every request from your app, the dashboard only shows GPTBot, ClaudeBot, PerplexityBot, and the rest of the AI-bot universe.
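
If you use the REST API rather than the SDK, you may want to apply the same two privacy steps yourself before sending. The following is a rough sketch of zeroing the last IPv4 octet and dropping the query string; anonymizeIpv4 and stripQuery are hypothetical helpers, and returning null for anything that is not a dotted-quad IPv4 address is an assumption of this sketch, not documented SDK behavior.

```typescript
// Zeroes the last octet of a dotted-quad IPv4 address.
// Returns null for anything else (including IPv6), so the caller can
// decide to omit the IP entirely rather than send it unmodified.
function anonymizeIpv4(ip: string): string | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part) || Number(part) > 255) return null;
  }
  return parts.slice(0, 3).join(".") + ".0";
}

// Removes the query string from a request path.
function stripQuery(pathWithQuery: string): string {
  const cut = pathWithQuery.indexOf("?");
  return cut === -1 ? pathWithQuery : pathWithQuery.slice(0, cut);
}
```

For example, anonymizeIpv4("203.0.113.57") yields "203.0.113.0", and stripQuery("/pricing?utm_source=chatgpt") yields "/pricing".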

How auth flows

Your app ──POST──→ searchable-tracker.workers.dev
                         │
                         │  Authorization: Bearer sk_live_…
                         │  Body: { site_token: "st_…", events: [...] }
                         ▼
                   Verify HMAC signature on sk_live_… at the edge
                   (no DB lookup; fast fail on 401 / 403)
                         │
                         ▼
                   Resolve site_token → workspace + domain
                         │
                         ▼
                   Forward to ingest → ClickHouse
Each sk_live_… key carries { workspace_id, key_id } in its signed payload. The worker verifies the HMAC at Cloudflare’s edge and tags every forwarded event with both values, so revoking a key in the dashboard immediately stops any in-flight POSTs that use it.
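
The request described in this flow can be assembled as plain data before it is sent. This sketch assumes that the /v1/events path lives on the searchable-tracker.workers.dev host; the text names both but does not state the full URL. The IngestEvent field names are illustrative guesses, and buildIngestRequest is a hypothetical helper. Check the REST API reference for the actual schema.

```typescript
// Hypothetical event shape; the real payload schema is defined in the
// REST API reference.
interface IngestEvent {
  method: string;
  path: string;
  host: string;
  status: number;
  user_agent: string;
  timestamp: string;
}

interface IngestRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the POST: the API key goes in the Authorization header,
// the site token and events go in the JSON body.
function buildIngestRequest(
  apiKey: string,
  siteToken: string,
  events: IngestEvent[],
  baseUrl: string = "https://searchable-tracker.workers.dev", // assumed host
): IngestRequest {
  return {
    url: `${baseUrl}/v1/events`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ site_token: siteToken, events }),
  };
}
```

The returned object separates the two credentials the way the diagram does, and its method, headers, and body fields can be passed to fetch alongside the url.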

Pick a path

Middleware SDK setup

Step-by-step for Next.js + a recipe for Express/Fastify.

REST API reference

Endpoint, headers, payload schema, and a curl you can run today.

Next steps

See the data

Open LLM Analytics to see which assistants are crawling your site.

Add Search Console

Layer in keyword data so you can correlate AI crawls with search demand.