Tokenist.

Track every LLM user. Enforce limits directly in your code.

Tokenist is a Node.js module + REST API that gives you per-user metering, guardrails, and AI-based intent detection. Spot ToS breaches or jailbreak attempts, then block or rate-limit in the same workflow.

See how it works

Free up to 25,000 requests/month · No credit card required

Tokenist SDK flow

  1. Import the @tokenist/guardrails Node module or call the REST endpoints from any backend.
  2. Attach user identity (userId, orgId, feature) before you send a request to your LLM provider.
  3. Call tokenist.check() to see if the user is allowed, then tokenist.record() after you get a response.
  4. Use intent labels (frustration, jailbreak, ToS breach, etc.) to trigger automatic blocks or throttle logic.

Works with Chat Completions, Responses, Realtime, embeddings — wherever you make LLM calls.
How it works

Drop-in SDK + REST guardrails for your LLM stack

Tokenist plugs into your server-side code. You keep calling OpenAI (or any other provider) exactly as you do today — we just wrap each request with usage tracking, policy checks, and intent detection.

  • Node-first, API everywhere. Use the TypeScript client when you're on Node, or hit the REST endpoints from any language/runtime.
  • Per-user accounting. Usage, cost, and rules scoped to userId, orgId, or feature. Perfect for tiered plans and chargebacks.
  • Intent-aware guardrails. Tokenist uses GPT-4o-mini to label every conversation — jailbreak attempt, ToS breach, frustration, win. Hook those signals into blocklists or custom automation.
  • One rules engine. Set limits, send alerts, or block users from a single dashboard. No more scattered scripts or cron jobs.
Features

The guardrails layer for every AI product

No proxies, no infra surgery. Just import the SDK (or call the REST API) and start tracking.

🧩

Node SDK + REST

Call `tokenist.check()` before your LLM request and `tokenist.record()` afterward. Works in Node via the TypeScript client or any stack via REST.

📊

Per-user cost tracking

Get live token + cost data per user, org, and feature across every OpenAI model you use. Perfect for tiered plans, internal chargebacks, and budgeting.

🧠

Intent-aware guardrails

GPT-4o-mini labels every conversation (jailbreak, ToS breach, frustration, win). Use those signals to trigger automatic blocks, extra auth, or throttling.
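The label-to-action wiring is yours to define. A minimal TypeScript sketch, using the label names above as string constants (the exact label strings and the mapping itself are illustrative assumptions, not part of the SDK):

```typescript
// Sketch: route Tokenist intent labels to an action. Label strings and
// the mapping below are assumptions for illustration only.
type IntentLabel = 'safe' | 'jailbreak' | 'tos_breach' | 'frustration' | 'win';
type Action = 'allow' | 'block' | 'escalate';

function actionFor(labels: IntentLabel[]): Action {
  // Hard violations get blocked outright
  if (labels.includes('jailbreak') || labels.includes('tos_breach')) return 'block';
  // Frustrated users might be handed to support instead of throttled
  if (labels.includes('frustration')) return 'escalate';
  return 'allow';
}
```

In practice this branch would live in your request handler, right after the check call returns its labels.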

⚙️

Limits & automation

Create cost or token rules per user/tier, send Slack/webhook alerts, or auto-upgrade plans when someone hits a threshold.

🏷️

Feature-level attribution

Tag each request with `feature` and see which product surfaces burn budget. Route expensive workflows to cheaper models before the bill arrives.

🚫

Block & throttle users

Add anyone to the blocklist (with optional expiry) or impose rolling 24h limits. Perfect for dealing with policy violators without killing the whole app.
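A rolling 24h limit is a sliding-window counter. Tokenist enforces this for you across instances; the self-contained sketch below just illustrates the concept:

```typescript
// Sketch of a rolling 24h request limit (concept only; in-memory,
// single-process — Tokenist tracks this server-side for real traffic).
class RollingLimit {
  private hits: number[] = []; // timestamps (ms) of allowed requests

  constructor(private max: number, private windowMs = 24 * 60 * 60 * 1000) {}

  allow(now = Date.now()): boolean {
    // Drop timestamps that have aged out of the window
    this.hits = this.hits.filter((t) => now - t < this.windowMs);
    if (this.hits.length >= this.max) return false;
    this.hits.push(now);
    return true;
  }
}
```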

📈

Dashboards + API parity

Everything you can do in the dashboard (usage, limits, alerts) is available through the API. Automate whatever you don’t want to click through manually.

🗃️

Audit-ready logs

Store request/response metadata for compliance and debugging. Opt-in to full payload logging when you need it, keep it lean when you don’t.

Integration

Use the SDK or hit the REST endpoints

Works with any backend stack. Wrap your OpenAI calls with Tokenist and you get metering, guardrails, and intent detection instantly.

Node / TypeScript
import { tokenist } from '@tokenist/guardrails';

tokenist.init({ apiKey: process.env.TOKENIST_API_KEY });

// 1. Before you call OpenAI
await tokenist.check({
  userId: 'user_abc123',
  feature: 'voice-assistant',
  model: 'gpt-4o-mini',
});

// 2. Call your provider as usual
const completion = await openai.responses.create({ ... });

// 3. Record usage + metadata (Tokenist handles cost calc)
await tokenist.record({
  userId: 'user_abc123',
  feature: 'voice-assistant',
  usage: completion.usage,
  sentiment: completion.output_text, // response text; Tokenist labels intent from it
});
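
The snippet above assumes the check passes. A hedged sketch of handling a denial, assuming check() resolves to the same `{ allowed, labels }` shape the REST response returns (the stub below stands in for tokenist.check() so the sketch is runnable):

```typescript
// Sketch: branch on the check result before calling the provider.
// The { allowed, labels } shape is inferred from the REST response below;
// this stub replaces tokenist.check() for illustration.
type CheckResult = { allowed: boolean; labels: string[] };

async function check(userId: string): Promise<CheckResult> {
  // Stub: pretend this user tripped the jailbreak guardrail
  return { allowed: false, labels: ['jailbreak'] };
}

async function handleRequest(userId: string): Promise<string> {
  const result = await check(userId);
  if (!result.allowed) {
    // Fail fast instead of silently burning tokens
    return `blocked: ${result.labels.join(', ')}`;
  }
  // ...call your LLM provider here, then tokenist.record()...
  return 'ok';
}
```
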
REST (any backend)
curl https://api.tokenist.dev/sdk/check \
  -H "Authorization: Bearer ug_..." \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "user_abc123",
    "feature": "support-bot",
    "model": "gpt-4o",
    "requestType": "chat"
  }'

# Response: { "allowed": true, "labels": ["safe"] }
# Use /sdk/record to log usage, /sdk/log for full payloads.

Need Python, Go, or another runtime? Hit the REST endpoints directly — same rules engine, same intent labels, same dashboard visibility.

Pricing

Simple pricing based on requests

Pay for requests tracked. Start free, no credit card needed. We'll alert you before you hit your limit.

Free

Side projects and early evaluation

$0

25,000 requests / mo

No credit card required

  • Full enforcement — rate limits & blocklist
  • AI intent labels (jailbreak, ToS breach)
  • Basic dashboard + CSV export
  • Community support
Most popular

Starter

Early commercial apps going to market

$49/mo

150,000 requests / mo

Then: $0.50 per 1,000 extra requests

  • Everything in Free
  • Email & Slack threshold alerts
  • Per-feature cost attribution
  • 30-day log retention
  • Email support

Growth

Growing products with real traffic

$199/mo

750,000 requests / mo

Then: $0.30 per 1,000 extra requests

  • Everything in Starter
  • Webhook automations (e.g. auto-block on intent label)
  • Cohort analysis + user segmentation
  • 90-day log retention
  • Priority support
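
The overage math is straightforward; a quick sketch using the Starter and Growth numbers above (assuming partial 1,000-request blocks are rounded up, which is an assumption, not a stated policy):

```typescript
// Sketch: monthly cost under request-based pricing with per-1k overage.
// Base/included/overage figures come from the plan cards above; rounding
// partial blocks up is an illustrative assumption.
function monthlyCost(
  base: number,       // plan price in $
  included: number,   // requests included per month
  perExtra1k: number, // $ per 1,000 extra requests
  requests: number    // requests actually tracked
): number {
  const extra = Math.max(0, requests - included);
  return base + Math.ceil(extra / 1000) * perExtra1k;
}

// Starter at 200k requests: $49 + 50 × $0.50
monthlyCost(49, 150_000, 0.5, 200_000); // 74
```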

Enterprise

1M+ requests / mo

Custom request volume, extended data retention, SSO + RBAC, dedicated SLA, and custom integrations — including on-prem options.

SSO + RBAC · Dedicated SLA · Custom retention · On-prem option · Dedicated Slack channel
Contact us

All paid plans include a 14-day free trial · Cancel any time · No credit card required to start

FAQ

Common questions

Everything you need to know before getting started.

Is Tokenist a proxy that sits between me and my LLM provider?

No. Tokenist sits in your server-side code. You keep using the OpenAI SDK (or any HTTP client) exactly as you do today. You just wrap requests with `tokenist.check()` and `tokenist.record()` (or call the REST endpoints) so we can track usage and apply guardrails.