Get your API key
After completing checkout, NodeGhost will automatically send your API key to the email address you used at signup. The key looks like this:
Keep this key private — it controls access to your plan's request quota. If you lose it, contact monitee@gmail.com and we'll reissue one.
Using the API
NodeGhost is a drop-in replacement for the OpenAI API. Most customers authenticate with an ng- key obtained via Stripe subscription or USDC payment; native POKT stakers use on-chain delegation instead (Path 3 below).
Path 1 — ng- key (Stripe or USDC)
If you signed up via Stripe or paid with USDC, you have an ng- API key. Register your model endpoint once, then use it like any OpenAI-compatible API:
X-Endpoint-Key header. Register your endpoint first at POST /v1/endpoint/register.
Path 2 — Python (ng- key)
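A minimal sketch of the Python path using only the standard library. It assumes the https://nodeghost.ai/v1 base URL and the X-Endpoint-Key header mentioned in Path 1; the function name build_chat_request and the placeholder keys are illustrative, not part of any NodeGhost SDK. The request is only built here, not sent.

```python
import json
import urllib.request

NODEGHOST_BASE_URL = "https://nodeghost.ai/v1"  # base URL named in these docs


def build_chat_request(ng_key, model, user_message, endpoint_key=None):
    """Build (but do not send) an OpenAI-style chat completion request."""
    headers = {
        "Authorization": f"Bearer {ng_key}",
        "Content-Type": "application/json",
    }
    if endpoint_key:
        # Model provider key, passed via the X-Endpoint-Key header.
        headers["X-Endpoint-Key"] = endpoint_key
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{NODEGHOST_BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_chat_request("ng-your-key-here", "gpt-4o-mini", "Hello!", "sk-yourmodelkey")
# To send it: urllib.request.urlopen(req), then json.load() the response.
```

Any OpenAI-compatible client that accepts a custom base URL should work the same way; the standard library is used here only to keep the sketch dependency-free.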
Path 3 — Native POKT stake
If you've staked a POKT application directly on Shannon mainnet, see the Stake your app wallet section below for the full flow — different endpoint, different headers, no ng- key required.
Supported model providers
| Provider | X-Endpoint value | Example model |
|---|---|---|
| OpenAI | https://api.openai.com | gpt-4o-mini |
| DeepSeek | https://api.deepseek.com | deepseek-chat |
| Groq | https://api.groq.com/openai | llama-3.1-70b-versatile |
| Anthropic | https://api.anthropic.com | claude-3-5-haiku-20241022 |
| Self-hosted (Ollama etc.) | https://your-server.com | llama3.2:3b |
Plans & CU metering
NodeGhost meters usage in Compute Units (CU). Each plan includes a monthly CU pool that resets each month. Different services consume CU at different rates — inference is the most expensive, web search and vector memory are much cheaper.
Plans
| Plan | Monthly CU | API keys | Price |
|---|---|---|---|
| Free | 200,000,000 | 2 | $0/mo |
| Starter | 9,000,000,000 | 5 | $9/mo |
| Pro | 29,000,000,000 | 20 | $29/mo |
| Agent | 99,000,000,000 | Unlimited | $99/mo |
CU per call
| Service | Endpoint | CU per call |
|---|---|---|
| Inference | /v1/chat/completions | 400,000 |
| Web search | /v1/tools/search | 50,000 |
| Vector memory | /v1/memory/* | 25,000 |
FREE 200M CU ≈ 500 INFERENCE CALLS · STARTER 9B ≈ 22,500 · PRO 29B ≈ 72,500 · AGENT 99B ≈ 247,500
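The per-plan call counts follow directly from dividing each monthly pool by the CU cost of a call. The short sketch below reproduces that arithmetic from the two tables above; the function names are illustrative.

```python
# CU cost per call, from the "CU per call" table.
CU_PER_CALL = {"inference": 400_000, "web_search": 50_000, "vector_memory": 25_000}

# Monthly CU pool per plan, from the "Plans" table.
PLAN_CU = {
    "Free": 200_000_000,
    "Starter": 9_000_000_000,
    "Pro": 29_000_000_000,
    "Agent": 99_000_000_000,
}


def calls_per_month(plan, service="inference"):
    """Whole number of calls a plan's monthly pool covers for one service."""
    return PLAN_CU[plan] // CU_PER_CALL[service]


def cu_used(inference=0, web_search=0, vector_memory=0):
    """Total CU consumed by a mixed workload."""
    return (inference * CU_PER_CALL["inference"]
            + web_search * CU_PER_CALL["web_search"]
            + vector_memory * CU_PER_CALL["vector_memory"])


for plan in PLAN_CU:
    print(plan, calls_per_month(plan))
```

For a mixed workload, compare cu_used(...) against the plan's pool: 100 inference calls, 200 searches, and 400 memory operations come to 60,000,000 CU.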
Home Assistant overview
NodeGhost works as the AI brain for Home Assistant — giving you a genuinely intelligent voice assistant that runs privately, without sending your conversations to Google, Amazon, or Apple.
The full private stack looks like this:
Local speech-to-text (Whisper)
Converts your voice to text entirely on your device. Nothing leaves your home at this stage.
NodeGhost AI (via POKT)
Your text is routed through the POKT decentralized network to your registered model endpoint. No logging of request content — ever.
Local text-to-speech (Piper)
The AI response is converted back to voice on your device. Your assistant speaks back to you.
Total cost: $9/month on the Starter plan. Smarter than Alexa, more private than everything.
Install the integration
NodeGhost uses the Extended OpenAI Conversation integration available through HACS (Home Assistant Community Store).
Step 1 — Install HACS
If you don't have HACS installed yet, follow the official guide at hacs.xyz. It adds a community add-on store to your Home Assistant instance.
Step 2 — Install Extended OpenAI Conversation
Open HACS in Home Assistant
Go to HACS → Integrations → search for "Extended OpenAI Conversation" → Download.
Restart Home Assistant
After downloading, restart Home Assistant to load the new integration.
Add the integration
Go to Settings → Devices & Services → Add Integration → search for "Extended OpenAI Conversation".
Configure NodeGhost credentials
When prompted, enter the following:
The model name must match your registered endpoint. Pass your model provider's API key in the X-Endpoint-Key header if your integration supports custom headers — otherwise register your endpoint via POST /v1/endpoint/register first.
Set as your conversation agent
Go to Settings → Voice Assistants → select your assistant → set Conversation Agent to "Extended OpenAI Conversation".
Add local voice (Whisper + Piper)
For fully private voice — no cloud at any step — install the local speech add-ons. These run entirely on your Home Assistant hardware.
Install Whisper (speech to text)
Go to Settings → Add-ons → Add-on Store
Search for "Whisper" — install the official "Whisper" add-on by Home Assistant.
Start the add-on and enable on boot
In the add-on settings, toggle "Start on boot" and "Watchdog" then hit Start.
Add the Wyoming integration
Go to Settings → Devices & Services → Add Integration → search "Wyoming Protocol" → configure it pointing to the Whisper add-on.
Install Piper (text to speech)
Repeat the same steps for the "Piper" add-on. Once both are installed, go to Settings → Voice Assistants and set:
Remote access
To use the Home Assistant app and your NodeGhost voice assistant when you're away from home, you need a way to reach your Home Assistant instance remotely. The free option is a Cloudflare Tunnel.
Option A — Cloudflare Tunnel (free)
Get a free domain or use one you own
Cloudflare Tunnels require a domain managed by Cloudflare. You can transfer an existing domain or register one at cloudflare.com.
Install the Cloudflare Tunnel add-on in HA
In HACS, search for "Cloudflare Tunnel" and install it. Configure it with your Cloudflare token from the Cloudflare Zero Trust dashboard.
Point the HA app at your tunnel URL
In the Home Assistant app settings, set your external URL to your Cloudflare tunnel address (e.g. https://home.yourdomain.com).
Option B — Nabu Casa ($6.50/month)
Nabu Casa provides an easy one-click remote access tunnel. It works alongside NodeGhost — Nabu Casa handles the network tunnel, NodeGhost handles the AI. Go to Settings → Home Assistant Cloud to subscribe.
Run your own AI model through NodeGhost
NodeGhost is a privacy-preserving inference gateway — not a managed model service. Register your own OpenAI-compatible endpoint and route all inference through NodeGhost's infrastructure, giving you complete control over which AI model processes your data while NodeGhost handles auth, rate limiting, and decentralized routing.
This means you can run an open source model like Llama, Mistral, or Qwen on your own server, point NodeGhost at it, and use the same https://nodeghost.ai/v1 endpoint you already know. Your model does the inference. NodeGhost handles everything else.
Why run your own model?
There are several reasons you might want to bring your own endpoint:
- You've fine-tuned a model on proprietary data and need inference for it
- You want complete privacy — no third party sees your prompts at any layer
- You want to use a specific open source model not available through NodeGhost's default backend
- You're building a product on top of your own model and need auth and rate limiting infrastructure
- You want to protect your model endpoint from being exposed publicly
How it works
When you register a custom endpoint, NodeGhost stores it linked to your API key. Every inference request you make is authenticated and rate limited as normal — then forwarded to your endpoint via the POKT network. Your model processes the request and returns the response through NodeGhost back to your application.
Clients never call your endpoint directly. All traffic enters through nodeghost.ai/v1, and NodeGhost proxies it to your server.
Register your endpoint
Your endpoint must be publicly accessible over HTTPS and OpenAI-compatible. Any server running Ollama, LM Studio, vLLM, or a custom FastAPI wrapper will work.
Step 1 — Run an OpenAI-compatible model server
The most common option is Ollama. Install it on any VPS or server and expose it publicly:
Step 2 — Register your endpoint with NodeGhost
Call the registration endpoint with your ng- API key:
NodeGhost will verify your endpoint is reachable and return a confirmation:
Step 3 — Use NodeGhost as normal
Nothing changes in your application. Keep using https://nodeghost.ai/v1 with your ng- key. NodeGhost silently routes to your registered model endpoint:
Check your routing status
Remove your custom endpoint
To remove your registered endpoint:
Full privacy stack — zero third parties
If complete privacy is your goal — where no third party ever sees the content of your AI requests — this is the architecture that achieves it.
The full stack
Here's every component and who controls it:
What this means for privacy
With this setup your prompts travel from your application to NodeGhost for authentication and routing through the POKT decentralized network, then directly to your own server. The request enters through a gateway with no single point of control — POKT's on-chain verification ensures the routing is transparent and auditable. The model that processes your request runs on hardware you control. No AI company, no cloud provider, no GPU farm ever sees what you're asking.
NodeGhost sees that a request happened — the timestamp and your API key — but never the content. That metadata is used only for rate limiting and is retained for 90 days.
Recommended models for self-hosting
These open source models run well on a modest VPS with 16GB+ RAM:
What is POKT?
POKT is the native token of Pocket Network — the decentralized infrastructure that NodeGhost is built on. Every AI request you make through NodeGhost is routed through the POKT Shannon network, verified on-chain, and settled between gateway operators and node suppliers.
This is what makes NodeGhost fundamentally different from other AI providers — the routing is on-chain and decentralized, so there's no single company that could log, monitor, or sell your requests even if they wanted to.
For most users, POKT is invisible — you just use your ng- API key and pay via Stripe. But if you're crypto-native and want to interact with the network directly, you can stake your own application wallet and pay per relay in POKT tokens.
Stake your app wallet
Advanced users can bypass the Stripe subscription entirely by staking a POKT application wallet directly on Shannon mainnet. This gives you pay-per-relay access at the rate set by each service — for example ai-inference is priced at $0.0004 per relay. Each service has its own relay price, so the cost depends on which service you stake against.
External native staking is currently supported only for the ai-inference service. Other POKT services (web-search, text-generation, vector-memory) are operated as NodeGhost-internal infrastructure and are not useful targets for external stakers.
What you'll need
A funded POKT wallet with enough tokens to stake an application. Current minimum is 1,001 POKT. You'll also need pocketd installed on your machine.
Stake your application
Your app-stake-config.yaml should specify the ai-inference service:
Once staked, delegate your application to NodeGhost's gateway address:
Gateway address for reference:
Using the gateway after staking
Once staked and delegated, hit the native POKT endpoint with your model provider's API key as Authorization, your provider URL as X-Endpoint, and your application address as X-App-Address:
Supported providers via X-Endpoint: OpenAI, DeepSeek, Groq, Anthropic, or any OpenAI-compatible self-hosted model. If X-Endpoint is omitted, it defaults to https://api.openai.com.
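A minimal sketch of a native-path request, assuming the https://nodeghost.ai/pokt/v1/chat/completions URL referenced in the FAQ and the three headers described above. The Bearer scheme on Authorization follows the usual OpenAI convention and is an assumption here; build_native_request and the placeholder values are illustrative. The request is built but not sent.

```python
import json
import urllib.request

POKT_CHAT_URL = "https://nodeghost.ai/pokt/v1/chat/completions"


def build_native_request(provider_key, app_address, model, user_message,
                         provider_url=None):
    """Build (but do not send) a chat request for the native POKT path."""
    headers = {
        # Your model provider's key, not an ng- key (Bearer scheme assumed).
        "Authorization": f"Bearer {provider_key}",
        "X-App-Address": app_address,
        "Content-Type": "application/json",
    }
    if provider_url:
        # When omitted, the gateway is documented to default to OpenAI.
        headers["X-Endpoint"] = provider_url
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        POKT_CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


native_req = build_native_request(
    "sk-yourmodelkey",
    "pokt1" + "a" * 38,  # placeholder address with the documented shape
    "deepseek-chat",
    "Hello!",
    provider_url="https://api.deepseek.com",
)
```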
The X-App-Address header
Every request must include your application's POKT address in the X-App-Address header. This tells the gateway which of your delegated applications to claim the relay under.
If you've staked a single application, this is just your one app address on every request. If you've staked multiple applications across different services, include the address matching the service you're calling.
The gateway validates the address (regex ^pokt1[a-z0-9]{38}$) and confirms on-chain that it delegates to the NodeGhost gateway. If missing or malformed, the request returns HTTP 400.
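Checking the address format client-side can catch typos before a request is sent. The sketch below applies the same regular expression quoted above; it only checks the shape of the address, since the on-chain delegation check happens at the gateway. The function name is illustrative.

```python
import re

# Pattern quoted in the docs: "pokt1" followed by 38 lowercase alphanumerics.
APP_ADDRESS_RE = re.compile(r"^pokt1[a-z0-9]{38}$")


def is_valid_app_address(address):
    """Return True if the value has the documented X-App-Address shape."""
    return isinstance(address, str) and APP_ADDRESS_RE.fullmatch(address) is not None


print(is_valid_app_address("pokt1" + "a" * 38))  # well-formed placeholder
print(is_valid_app_address("pokt1tooshort"))     # wrong length
```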
USDC on Base
USDC payments via the Base network are currently available for select customers. We're building out wallet-based payment association so USDC sends from your registered wallet credit automatically — without exposing your API key on-chain.
If you'd like to use USDC today while we finalize this, email monitee@gmail.com and we'll provision your account manually.
Hosted models
NodeGhost offers access to hosted open source models routed through the POKT decentralized network. No external API key required — just your ng- key. These models are served by independent node operators on Shannon mainnet, earning POKT relay rewards for every request.
| Model | Endpoint | Model name | Status |
|---|---|---|---|
| Llama 3.2 1B Instruct | /text-generation/v1/ | pocket_network | Live |
Text generation
Access Llama 3.2 1B Instruct via the POKT decentralized network. No model API key needed — requests are handled by independent POKT node operators. Use pocket_network as the model name.
Example request (curl)
curl https://nodeghost.ai/text-generation/v1/chat/completions \
  -H "Authorization: Bearer ng-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pocket_network",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 200
  }'

Example request (Python)
from openai import OpenAI

client = OpenAI(
    api_key="ng-your-key-here",
    base_url="https://nodeghost.ai/text-generation/v1"
)
response = client.chat.completions.create(
    model="pocket_network",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=200
)
print(response.choices[0].message.content)

Tools as POKT services
NodeGhost provides AI tools as first-class POKT services. Each tool call is a single relay — flat cost, no tokens consumed, no context window, completely stateless. Tools return structured data only, never raw web content, making them safe and prompt-injection resistant by design.
All tools require a valid ng- API key. Tool calls count against your relay balance — web search costs 1 relay at 50,000 CU, which is cheaper than inference (400,000 CU). This means your relay balance goes further when agents use search than when they run inference.
| Tool | Endpoint | CU per relay | Status |
|---|---|---|---|
| Web search | POST /v1/tools/search | 50,000 | Live |
| Tool discovery | GET /v1/tools | Free | Live |
Web search
Search the web privately via Brave Search. Returns titles, URLs, and snippets. Requests are routed through the POKT decentralized network, so the search engine never sees the agent's identity or IP address.
Request

POST /v1/tools/search
Authorization: Bearer ng-yourkey
Content-Type: application/json

{
  "query": "your search terms",
  "count": 5
}

Response

{
  "query": "your search terms",
  "count": 3,
  "results": [
    {
      "title": "Result title",
      "url": "https://example.com",
      "snippet": "Brief description of the result..."
    }
  ]
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search terms, max 400 characters |
| count | integer | No | Number of results, 1–20, default 5 |
Also supports GET requests: GET /v1/tools/search?q=your+query&count=5
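The sketch below builds both request forms in Python and applies the limits from the Parameters table before anything is sent. The client-side checks are a convenience assumed here (the server remains authoritative), and the function names are illustrative. Requests are built but not sent.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://nodeghost.ai/v1/tools/search"


def validate_search(query, count=5):
    """Apply the documented limits: query up to 400 chars, count 1 to 20."""
    if not query or len(query) > 400:
        raise ValueError("query must be 1-400 characters")
    if not 1 <= count <= 20:
        raise ValueError("count must be between 1 and 20")
    return {"query": query, "count": count}


def build_search_post(ng_key, query, count=5):
    """POST form: parameters travel as a JSON body."""
    body = validate_search(query, count)
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {ng_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def build_search_get(ng_key, query, count=5):
    """GET form: parameters travel in the query string as q and count."""
    params = validate_search(query, count)
    qs = urllib.parse.urlencode({"q": params["query"], "count": params["count"]})
    return urllib.request.Request(
        f"{SEARCH_URL}?{qs}",
        headers={"Authorization": f"Bearer {ng_key}"},
        method="GET",
    )


post_req = build_search_post("ng-yourkey", "your search terms")
get_req = build_search_get("ng-yourkey", "your query")
```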
Using web search as an agent tool
Web search is designed to be called by AI agents via the OpenAI tool calling format. Pass it as a tool definition in your inference request and the model will call it automatically when it needs current information:
curl https://nodeghost.ai/v1/chat/completions \
  -H "Authorization: Bearer ng-yourkey" \
  -H "X-Endpoint-Key: sk-yourmodelkey" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model",
    "messages": [{"role": "user", "content": "What is the current price of POKT?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "web_search",
        "description": "Search the web for current information",
        "parameters": {
          "type": "object",
          "properties": {
            "query": {"type": "string", "description": "Search terms"}
          },
          "required": ["query"]
        }
      }
    }]
  }'

Tool discovery
Query this endpoint to discover all available NodeGhost tools, their descriptions, parameters, and endpoints. Designed for autonomous agents that need to discover available capabilities without hardcoded configuration.
GET /v1/tools

{
  "provider": "NodeGhost",
  "description": "Privacy-preserving AI tools routed through POKT",
  "cost": "1 POKT relay per tool call = 400,000 CU",
  "tools": [
    {
      "name": "web-search",
      "description": "Search the web privately via Brave Search...",
      "endpoint": "https://nodeghost.ai/v1/tools/search",
      "method": "POST",
      "parameters": { ... },
      "privacy": "Queries are never logged or stored."
    }
  ]
}
No authentication required for tool discovery. Authentication is required to call individual tools.
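An agent can turn the discovery response into a lookup table instead of hardcoding endpoints. The sketch below assumes the response shape shown above (a tools list whose entries carry name, method, and endpoint); the sample dictionary is hand-written for illustration, and index_tools is a hypothetical helper.

```python
def index_tools(discovery):
    """Map tool name -> (HTTP method, endpoint URL) from a discovery response."""
    return {
        tool["name"]: (tool["method"], tool["endpoint"])
        for tool in discovery.get("tools", [])
    }


# Hand-written sample shaped like the documented response (parameters omitted).
sample = {
    "provider": "NodeGhost",
    "tools": [
        {
            "name": "web-search",
            "endpoint": "https://nodeghost.ai/v1/tools/search",
            "method": "POST",
        }
    ],
}

tools = index_tools(sample)
method, endpoint = tools["web-search"]
```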
Frequently asked questions
https://nodeghost.ai/v1 with your ng- API key, and set the model name to match your registered endpoint. Register your model endpoint first at POST /v1/endpoint/register. See the Home Assistant setup guide above for step-by-step instructions.
ai-inference service, delegate to the NodeGhost gateway address, and hit https://nodeghost.ai/pokt/v1/chat/completions with your own model provider API key (in Authorization) and your application address (in X-App-Address). See the Stake your app wallet section above for full instructions.
/pokt/v1/chat/completions). X-Endpoint tells NodeGhost which model provider to forward your request to, for example https://api.openai.com, https://api.deepseek.com, or https://api.groq.com/openai (defaults to OpenAI if omitted). X-App-Address is your staked POKT application's bech32 address (pokt1...); it tells the gateway which of your delegated applications to claim the relay under, and is required as of 2026-05-17. NodeGhost never holds your model API key; it is passed directly through to your provider.
Ghost Chat
Ghost is NodeGhost's built-in private AI chat interface available at nodeghost.ai/ghost. It combines all three NodeGhost POKT services in one page — inference, web search, and vector memory — routing every request through the decentralized network.
vector-memory), web search (web-search), and inference (ai-inference) — all earning relay rewards.
To use Ghost, open the page and enter your ng- key, model, and base URL. These are saved in your browser's localStorage and persist across sessions.
Memory & encryption
nodeghost.ai/ghost, and only when you opt in to encryption on first use. It does not apply to the hosted knowledge base, the ng-memory npm package, the browser extension, or self-hosted ng-memory-server instances. See the comparison below for each surface's storage model.
When you opt in on the Ghost chat page, Ghost automatically remembers facts from your conversations and recalls them in future sessions. Personal memories are encrypted in your browser before being sent to the memory server — NodeGhost stores ciphertext only, and your password never leaves your browser.
Encryption is optional. If you choose “Skip encryption” on first use, no personal memories are stored at all — the page operates as a stateless chat interface with no recall.
How it works (Ghost chat only)
Create a memory password
On first use, Ghost prompts you to create a password. This password never leaves your browser — it's used to derive an AES-256-GCM encryption key using PBKDF2 (310,000 iterations, SHA-256).
Facts extracted client-side
After each response, your browser calls the model to extract key facts from the conversation. The fact-extraction prompt is sent to the inference endpoint, but the resulting facts are processed locally before storage.
Encrypted before storage
Each extracted fact is encrypted in your browser using your password-derived key (AES-256-GCM with a random per-fact IV). Only the base64-packed ciphertext is sent to the memory server. If you inspect the database, you'll only see encrypted blobs.
Decrypted locally on recall
When you send a new message, Ghost queries the memory server, receives the ciphertext blobs, and decrypts them locally in your browser. The plaintext context is injected into the system prompt of your inference call — the server never sees the decrypted text.
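The key-derivation step can be sketched with Python's standard library. The iteration count, hash, and key size come from the description above; the salt handling, the 12-byte IV size, and the IV-then-ciphertext packing layout are assumptions, since the docs do not specify them. Ghost itself runs in the browser, and the AES-GCM step is left out because the standard library has no AES implementation.

```python
import base64
import hashlib
import os

PBKDF2_ITERATIONS = 310_000  # iteration count stated in the docs
KEY_BYTES = 32               # 256-bit key for AES-256-GCM
IV_BYTES = 12                # common GCM nonce size (assumption)


def derive_key(password, salt):
    """Derive a 256-bit key from the memory password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS, dklen=KEY_BYTES
    )


def pack(iv, ciphertext):
    """Base64-pack IV + ciphertext into one string (layout is an assumption)."""
    return base64.b64encode(iv + ciphertext).decode("ascii")


def unpack(blob):
    """Split a packed blob back into (iv, ciphertext)."""
    raw = base64.b64decode(blob)
    return raw[:IV_BYTES], raw[IV_BYTES:]


salt = os.urandom(16)      # hypothetical: a random salt stored alongside the data
key = derive_key("correct horse battery staple", salt)
iv = os.urandom(IV_BYTES)  # a fresh random IV for each fact
```

The same password and salt always yield the same key, which is what lets the browser decrypt earlier memories in a later session without the password ever being stored or transmitted.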
Memory feed indicators
| Indicator | Meaning |
|---|---|
| ◈ X results from knowledge base | Public business knowledge was recalled (plaintext, not encrypted) |
| ◈ X results from personal memory | Your encrypted personal memories were recalled and decrypted locally |
| ◈ X results from knowledge base + personal memory | Both sources contributed context |
| 🔒 Encrypted badge in header | Encryption is active for this session |
Memory surfaces compared
NodeGhost exposes several memory surfaces with different storage and trust models. Read this before choosing one.
| Surface | Storage | Encrypted? |
|---|---|---|
| Ghost chat: personal memory (nodeghost.ai/ghost) | Ciphertext on operator VPS, sent via the vector-memory POKT service. Decrypted only in your browser. | Yes (AES-256-GCM, opt-in) |
| Hosted knowledge base (POST /v1/memory/upload) | Plaintext SQLite on operator VPS, per-org isolated. Operator has technical access to uploaded documents. | No |
| ng-memory npm package | Local SQLite file in Node, or browser IndexedDB. Memory data never leaves your device by default. | No (storage is local-only) |
| NodeGhost browser extension | chrome.storage.local in your browser. Memory data never leaves your device, but conversation turns are sent to the inference endpoint for fact extraction. | No |
| Self-hosted ng-memory-server | Same ng-memory-server software as the hosted KB, run on your own infrastructure. Plaintext SQLite. You control access. | No |
If you need encrypted memory storage with a NodeGhost-hosted backend today, the Ghost chat page is the only path. Other surfaces store plaintext — pick the one whose threat model fits your use case (your own server, your own machine, or operator-trusted).
Knowledge base
In addition to personal encrypted memories, Ghost automatically queries a shared knowledge-base namespace on every message. This namespace is public and readable by anyone with a valid ng- key — it's designed for business content like product docs, FAQs, and support guides.
Upload documents to the knowledge base using the Memory Admin Panel. The knowledge base is queried through POKT's vector-memory service — the same decentralized routing as personal memories.
ng-memory-server
ng-memory-server is a self-hosted RAG (Retrieval Augmented Generation) server that stores and retrieves memories using vector embeddings. It's the backend that powers Ghost's memory system and the knowledge base.
NodeGhost does not host memory servers — you run your own. This is a privacy decision: your memories never touch NodeGhost infrastructure. The memory server is accessed through POKT's vector-memory service.
Features
| Feature | Details |
|---|---|
| Document ingestion | PDF, TXT, MD — auto-chunked and embedded |
| Vector search | Cosine similarity using local embeddings |
| Multi-tenant | Per-user namespace isolation via hashed ng- key |
| Zero-knowledge | Stores ciphertext — server never sees plaintext when using Ghost encryption |
| Public namespaces | knowledge-base namespace is publicly queryable |
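Cosine-similarity search ranks stored chunks by the angle between their embedding vectors and the query's. The sketch below illustrates the general technique with plain Python lists; it is not ng-memory-server's actual implementation, and the toy two-dimensional vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=3):
    """Return the k chunk texts most similar to the query.

    chunks is a list of (text, vector) pairs.
    """
    ranked = sorted(chunks, key=lambda c: cosine_similarity(query_vec, c[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]


chunks = [("a", [0.0, 1.0]), ("b", [1.0, 0.0]), ("c", [1.0, 1.0])]
best = top_k([1.0, 0.0], chunks, k=2)
```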
Deploy ng-memory-server
Option 1 — Docker (recommended)
Option 2 — Node.js directly
Environment variables
| Variable | Default | Description |
|---|---|---|
| PORT | 3100 | Port to listen on |
| DATA_DIR | ./data | Path to store SQLite databases |
| INFERENCE_URL | https://nodeghost.ai/v1 | Inference endpoint for fact extraction |
| PUBLIC_NAMESPACES | knowledge-base | Comma-separated namespaces accessible without auth |
Memory Admin Panel
The admin panel is a web-based interface for managing your memory server. No SSH required — upload documents, manage sources, browse memories, and view namespaces from any browser.
Access
The admin panel is served by the memory server itself at /admin. If you're using NodeGhost's hosted memory API proxy:
Log in with:
| Field | Value |
|---|---|
| Server URL | https://nodeghost.ai/memory-api (or your own server URL) |
| API Key | Your ng- key |
Uploading documents
Select namespace
Choose knowledge-base for public business content, or default for private personal content.
Drag and drop files
Supports PDF, TXT, and MD files. Multiple files can be uploaded at once. Files are automatically chunked and embedded.
Verify in Sources tab
After upload, switch to the Sources tab to confirm your documents are ingested. Each source shows the number of chunks stored.
Memory API reference
All authenticated endpoints accept your ng- key in the JSON body as owner_token. The body field is the canonical form because the POKT relay path strips Authorization headers — if you're calling through nodeghost.ai/v1/memory/*, put the key in owner_token. Authorization: Bearer still works for direct memory-server calls but not via the relay.
POST /v1/memory/public/recall is the only unauthenticated endpoint — it reads the global public knowledge base and accepts no token.
Every memory operation costs 25,000 CU — much cheaper than inference (400,000 CU).
Recall personal memories
Store a memory
Public knowledge-base recall (no auth)
Upload a document
Multipart form-data does not survive the POKT relay path — the relay miner strips the multipart Content-Type boundary, so multipart uploads fail. Use base64-encoded JSON instead:
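A rough sketch of preparing such a body in Python. Only owner_token is a documented field; the namespace, filename, and content_base64 names are hypothetical placeholders, so check the actual request schema before relying on them.

```python
import base64
import json


def build_upload_body(ng_key, filename, file_bytes, namespace="knowledge-base"):
    """Serialize a document as base64 inside a JSON body."""
    return json.dumps({
        "owner_token": ng_key,  # documented: the key travels in the body
        # The three field names below are hypothetical placeholders.
        "namespace": namespace,
        "filename": filename,
        "content_base64": base64.b64encode(file_bytes).decode("ascii"),
    })


body = build_upload_body("ng-your-key-here", "faq.md", b"# FAQ\nHello")
```

Because the payload is ordinary JSON with a Content-Type of application/json, there is no multipart boundary for the relay to strip.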
Supported types: PDF, TXT, MD. Files are auto-chunked and embedded.
Browse stored memories
Clear all memories
Deletes every chunk across all namespaces for your account. Cannot be undone.
Business Proxy
The NodeGhost business proxy lets you add AI to your website without exposing your ng- key to customers. Your customers chat naturally — no API keys, no setup, no friction. The proxy runs on your server and handles everything.
How it works
The proxy automatically queries your knowledge base before every customer message, injects the relevant context, and forwards to NodeGhost. Point any OpenAI-compatible chat widget at your proxy URL.
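The context-injection step can be pictured as a small function that folds recalled knowledge-base chunks into the system prompt before forwarding. This is an illustrative sketch in Python, not the proxy's actual code; the function name and the wording of the context header are assumptions.

```python
def build_proxy_messages(system_prompt, context_chunks, customer_message):
    """Combine the system prompt, recalled context, and the customer's message."""
    system = system_prompt
    if context_chunks:
        context = "\n".join(f"- {chunk}" for chunk in context_chunks)
        # Header text is a placeholder; the real proxy may phrase it differently.
        system = f"{system_prompt}\n\nRelevant knowledge base context:\n{context}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": customer_message},
    ]


messages = build_proxy_messages(
    "You are a helpful support assistant.",
    ["Returns are accepted within 30 days."],
    "What is your return policy?",
)
```

The resulting messages list is what gets forwarded as the body of an OpenAI-style chat completion request, so the customer's browser never sees the ng- key or the raw knowledge-base contents.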
Deploy the proxy
Option 1 — Docker (recommended)
Option 2 — Node.js directly
Point your chat widget at the proxy
Proxy configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| NG_KEY | ✅ Yes | (none) | Your NodeGhost API key |
| PORT | No | 3200 | Port to listen on |
| NG_URL | No | https://nodeghost.ai | NodeGhost base URL |
| MEMORY_URL | No | (none) | Your ng-memory-server URL |
| MODEL | No | deepseek-chat | Default model |
| NAMESPACE | No | knowledge-base | Memory namespace to query |
| SYSTEM_PROMPT | No | Generic prompt | Your custom system prompt |
| ALLOWED_ORIGINS | No | * | Comma-separated CORS origins |
| MAX_TOKENS | No | 800 | Max tokens per response |
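To make the defaults concrete, the sketch below resolves these variables from an environment mapping the way the table describes. It is written in Python purely for illustration (the proxy itself runs on Node.js), and load_proxy_config is a hypothetical helper.

```python
import os


def load_proxy_config(env=None):
    """Resolve proxy settings from an environment mapping, applying table defaults."""
    env = os.environ if env is None else env
    if not env.get("NG_KEY"):
        raise ValueError("NG_KEY is required")
    origins = env.get("ALLOWED_ORIGINS", "*")
    return {
        "NG_KEY": env["NG_KEY"],
        "PORT": int(env.get("PORT", "3200")),
        "NG_URL": env.get("NG_URL", "https://nodeghost.ai"),
        "MEMORY_URL": env.get("MEMORY_URL"),        # no default
        "MODEL": env.get("MODEL", "deepseek-chat"),
        "NAMESPACE": env.get("NAMESPACE", "knowledge-base"),
        "SYSTEM_PROMPT": env.get("SYSTEM_PROMPT"),  # None means the built-in prompt
        "ALLOWED_ORIGINS": [o.strip() for o in origins.split(",") if o.strip()],
        "MAX_TOKENS": int(env.get("MAX_TOKENS", "800")),
    }


config = load_proxy_config({"NG_KEY": "ng-your-key-here"})
```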
Health check
NodeGhost Memory Extension
The NodeGhost browser extension adds persistent memory to any webpage that uses the NodeGhost API. It intercepts requests to nodeghost.ai/v1/chat/completions, stores facts from conversations, and recalls relevant context automatically.
What it does
| Feature | Details |
|---|---|
| Auto-recall | Injects relevant memories into every chat request automatically |
| Auto-store | Extracts and stores key facts after each response |
| Keyword search | Fast keyword-based recall using chrome.storage |
| Works on nodeghost.ai | Active on the Ghost chat page and any page using the NodeGhost API |
| Privacy-first | Memories stored locally in your browser — never sent anywhere without your request |
Install the extension
The extension is currently available as a developer install (Chrome/Edge). A Chrome Web Store listing is coming soon.
Download the extension
Clone or download ng-kit from github.com/Monitee/ng-kit and find the ng-extension/ folder.
Open Chrome extensions
Navigate to chrome://extensions in your browser and enable Developer Mode (top right toggle).
Load unpacked
Click "Load unpacked" and select the ng-extension/ folder from the repo. The NodeGhost ghost icon will appear in your toolbar.
Enter your ng- key
Click the extension icon and enter your ng- key. The extension will start capturing and recalling memories automatically on nodeghost.ai.
nodeghost.ai only due to CORS restrictions. For use on your own domain, use the business proxy which handles memory server-side.