Admin Operations
Administrative endpoints require elevated privileges (ADMIN_API_TOKEN) for managing enrichment processing and embedding generation. These operations are intended for maintenance, debugging, and bulk data operations.
For standard memory operations (store, recall, update, delete), see Memory Operations. For consolidation scheduling, see Consolidation Operations.
Authentication Model
Section titled “Authentication Model”Admin operations require dual authentication:
- Standard API Token (
AUTOMEM_API_TOKEN) — Required for all endpoints except/health - Admin Token (
ADMIN_API_TOKEN) — Additional token for privileged operations
Authentication Methods
Section titled “Authentication Methods”| Token Type | Header Methods | Query Parameter | Environment Variable |
|---|---|---|---|
| API Token | Authorization: Bearer <token> / X-API-Key: <token> | ?api_key=<token> | AUTOMEM_API_TOKEN |
| Admin Token | X-Admin-Token: <token> / X-Admin-Api-Key: <token> | ?admin_token=<token> | ADMIN_API_TOKEN |
Endpoint Authentication Requirements
Section titled “Endpoint Authentication Requirements”graph TB
subgraph "Authentication Layers"
Layer1["Layer 1: API Token<br/>AUTOMEM_API_TOKEN<br/>Guards all endpoints except /health"]
Layer2["Layer 2: Admin Token<br/>ADMIN_API_TOKEN<br/>Guards privileged operations"]
end
subgraph "Protected Operations"
Reprocess["/enrichment/reprocess"]
Reembed["/admin/reembed"]
end
subgraph "Also Public"
Health["/health<br/>(no auth)"]
end
subgraph "API Token Required (when AUTOMEM_API_TOKEN configured)"
Status["/enrichment/status"]
end
Layer1-->Layer2
Layer1-->Status
Layer2-->Reprocess
Layer2-->Reembed
Error Responses
Section titled “Error Responses”| Status Code | Response | Meaning |
|---|---|---|
401 Unauthorized | {"error": "Unauthorized"} | Missing or invalid AUTOMEM_API_TOKEN |
401 Admin authorization required | {"error": "Admin authorization required"} | Missing or invalid ADMIN_API_TOKEN |
403 Admin token not configured | {"error": "Admin token not configured"} | Server has no ADMIN_API_TOKEN environment variable set |
GET /enrichment/status
Section titled “GET /enrichment/status”Authentication: API token (when AUTOMEM_API_TOKEN is configured)
Purpose: Monitor the enrichment pipeline’s health and processing statistics. Unlike /health, this endpoint is protected by the global API token guard whenever AUTOMEM_API_TOKEN is set.
Response Schema
Section titled “Response Schema”{ "status": "running", "queue_size": 3, "pending": 2, "inflight": 1, "max_attempts": 3, "stats": {}}Status Values
Section titled “Status Values”| Field | Type | Description |
|---|---|---|
status | string | "running" if enrichment worker is active, "stopped" if worker thread is dead |
queue_size | integer | Current number of jobs in the enrichment queue |
pending | integer | Count of memories waiting to be processed |
inflight | integer | Count of memories currently being processed |
max_attempts | integer | Maximum retry attempts per memory before marking failed (from ENRICHMENT_MAX_ATTEMPTS) |
stats | object | Lifetime enrichment statistics (nested object with processing counters) |
Example Usage
Section titled “Example Usage”curl "https://your-automem-instance/enrichment/status" \ -H "Authorization: Bearer YOUR_API_TOKEN"Troubleshooting with Status
Section titled “Troubleshooting with Status”| Observation | Likely Cause | Action |
|---|---|---|
status: "stopped" | Worker thread crashed | Check application logs for exceptions, restart service |
queue_size increasing | Worker processing slower than intake | Monitor inflight, check for spaCy or OpenAI issues |
High failure count in stats | Enrichment logic errors | Review application logs, check Qdrant connectivity |
inflight stuck | Worker deadlocked | Restart enrichment worker or service |
POST /enrichment/reprocess
Section titled “POST /enrichment/reprocess”Authentication: API token + Admin token
Purpose: Force re-enrichment of specific memories. Useful after:
- Updating enrichment logic or configuration
- Adding spaCy model capabilities
- Fixing corrupted enrichment metadata
- Recovering from systematic enrichment failures
Request Schema
Section titled “Request Schema”| Parameter | Type | Required | Description |
|---|---|---|---|
ids | array[string] | Yes | List of memory UUIDs to reprocess (non-empty) |
Reprocessing always forces re-queuing regardless of current pending/in-flight state.
Example Request
Section titled “Example Request”curl -X POST https://your-automem-instance/enrichment/reprocess \ -H "Authorization: Bearer YOUR_API_TOKEN" \ -H "X-Admin-Token: YOUR_ADMIN_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "ids": [ "a1b2c3d4-e5f6-7890-abcd-ef1234567890", "b2c3d4e5-f6a7-8901-bcde-f12345678901" ] }'Response Schema (HTTP 202 Accepted)
Section titled “Response Schema (HTTP 202 Accepted)”| Field | Type | Description |
|---|---|---|
status | string | Always "queued" |
count | integer | Number of memories successfully added to enrichment queue |
ids | array[string] | List of memory UUIDs that were queued |
{ "status": "queued", "count": 2, "ids": [ "a1b2c3d4-e5f6-7890-abcd-ef1234567890", "b2c3d4e5-f6a7-8901-bcde-f12345678901" ]}Processing Flow
Section titled “Processing Flow”graph TB
Request["POST /enrichment/reprocess<br/>{ids: [...]}"]
Auth["require_admin_token()"]
Validate["Validate Request"]
subgraph "Queueing Loop"
Enqueue["enqueue_enrichment()<br/>with forced=True"]
IncrQueued["queued_count++"]
end
Response["Return 202:<br/>status, count, ids"]
Request-->Auth
Auth-->Validate
Validate-->Enqueue
Enqueue-->IncrQueued
IncrQueued-->Response
Implementation Details
Section titled “Implementation Details”The reprocessing operation performs these steps:
- Validation Phase — Validates that the
idsarray is non-empty - Queueing — Calls
enqueue_enrichment(memory_id, forced=True)which:- Acquires
state.enrichment_lock - Adds memory ID to
state.enrichment_pending - Puts an
EnrichmentJobwithforced=Truein queue
- Acquires
- Background Processing — The
enrichment_worker()thread picks up jobs and callsenrich_memory()which:- Extracts entities via spaCy (if installed)
- Creates temporal
PRECEDED_BYedges - Finds semantic neighbors via Qdrant
- Creates
SIMILAR_TOrelationships - Detects patterns and creates
EXEMPLIFIESedges - Updates
metadata.enriched_attimestamp
POST /admin/reembed
Section titled “POST /admin/reembed”Authentication: API token + Admin token
Purpose: Regenerate embeddings for all memories in batches. Critical for:
- Migrating to a different embedding model
- Recovering from Qdrant data loss
- Fixing corrupted embeddings
- Bulk embedding generation after initial import
Request Schema
Section titled “Request Schema”| Parameter | Type | Required | Description |
|---|---|---|---|
batch_size | integer | No | Embeddings per OpenAI API call. Default: 32. Max recommended: 100 |
limit | integer | No | Max memories to process. If omitted, processes all memories in database |
force | boolean | No | Re-embed memories even if embeddings already exist. Default: false |
Example Request
Section titled “Example Request”curl -X POST https://your-automem-instance/admin/reembed \ -H "Authorization: Bearer YOUR_API_TOKEN" \ -H "X-Admin-Token: YOUR_ADMIN_TOKEN" \ -H "Content-Type: application/json" \ -d '{"batch_size": 32}'Processing Architecture
Section titled “Processing Architecture”graph TB
Request["POST /admin/reembed<br/>{batch_size: 32, limit: 1000}"]
Auth["require_admin_token()"]
Init["Retrieve pre-initialized OpenAI client<br/>get_openai_client()"]
FetchAll["Single FalkorDB query:<br/>MATCH (m:Memory)<br/>[WHERE m.content IS NOT NULL]<br/>RETURN id, content, tags, …<br/>ORDER BY timestamp DESC<br/>[LIMIT limit if set]"]
subgraph "Batch Processing Loop"
Slice["Slice next batch_size memories"]
CallOpenAI["OpenAI API:<br/>embeddings.create()<br/>model=embedding_model<br/>input=[contents]"]
UpdateQdrant["Qdrant: upsert()<br/>PointStruct with full payload<br/>(metadata_preserved=True)"]
IncrCount["processed_count += batch_size"]
end
Response["Return summary:<br/>status, processed, failed,<br/>total, batch_size,<br/>metadata_preserved"]
Request-->Auth
Auth-->Init
Init-->FetchAll
FetchAll-->Slice
Slice-->CallOpenAI
CallOpenAI-->UpdateQdrant
UpdateQdrant-->IncrCount
IncrCount-->|More batches?|Slice
IncrCount-->|Done|Response
Response Schema
Section titled “Response Schema”| Field | Type | Description |
|---|---|---|
status | string | Result status |
processed | integer | Number of memories successfully re-embedded |
failed | integer | Number of memories that failed re-embedding |
total | integer | Total memory count in database at operation start |
batch_size | integer | Batch size used (from request or default 32) |
metadata_preserved | boolean | Whether existing metadata was preserved during re-embedding |
{ "status": "complete", "processed": 1000, "failed": 0, "total": 1000, "batch_size": 32, "metadata_preserved": true}Implementation Details
Section titled “Implementation Details”Phase 1: Memory Enumeration and Content Fetch
Fetches all memory data (or up to limit) in a single FalkorDB query:
MATCH (m:Memory)WHERE m.content IS NOT NULLRETURN m.id AS id, m.content AS content, m.tags AS tags, m.importance AS importance, m.timestamp AS timestamp, m.type AS type, m.confidence AS confidence, m.metadata AS metadata, m.updated_at AS updated_at, m.last_accessed AS last_accessedORDER BY m.timestamp DESC-- When the `limit` request parameter is set, a LIMIT clause is appended after this lineWhen force=true, the WHERE m.content IS NOT NULL filter is omitted from the query. However, the Python collection loop still checks if content: before adding a row to the processing list, so memories with null or empty content are excluded regardless of force. There is no separate per-batch content retrieval step — all memory data is loaded upfront.
Phase 2: OpenAI Embedding Generation
Generates embeddings for the entire batch in a single API call using the configured embedding_model. Dimension is determined by the VECTOR_SIZE environment variable (default 1024).
Phase 3: Qdrant Update
Embeddings are written to Qdrant only. Qdrant failures are logged but don’t halt the operation (graceful degradation). FalkorDB graph data is not modified by this operation.
Performance Considerations
Section titled “Performance Considerations”| Batch Size | OpenAI API Calls (1000 memories) | Approx Time | Cost (1000 memories) |
|---|---|---|---|
| 10 | 100 | ~5 minutes | $0.06 |
| 32 | 32 | ~2 minutes | $0.06 |
| 50 | 20 | ~1 minute | $0.06 |
| 100 | 10 | ~30 seconds | $0.06 |
Recommendations:
- Default 32 balances API call overhead and failure blast radius
- Use 100 for large migrations (>10K memories) with stable OpenAI access
- Use 10 during testing or with rate-limited OpenAI keys
- Monitor
processedcount to detect stalls mid-operation
Error Handling
Section titled “Error Handling”The operation continues even if individual batches fail:
| Error | Cause | Behavior |
|---|---|---|
| OpenAI API rate limit | Exceeded quota | Retries with exponential backoff (handled by OpenAI SDK) |
| Missing memory content | Deleted between enumeration and fetch | Logged, skipped, processing continues |
| Qdrant connection failure | Network issue or Qdrant down | Logged, FalkorDB still updated (graceful degradation) |
| Invalid content format | Null or non-string content | Logged, skipped |
All errors are logged with structured context:
logger.exception("Failed to generate embeddings for batch", extra={"batch_ids": ids})POST /admin/sync
Section titled “POST /admin/sync”Authentication: API token + Admin token
Purpose: Perform non-destructive drift repair between FalkorDB and Qdrant. Detects and reconciles discrepancies without deleting data.
Request Schema
Section titled “Request Schema”| Parameter | Type | Required | Description |
|---|---|---|---|
batch_size | integer | No | Number of memories to process per batch. Default: 32 |
dry_run | boolean | No | If true, report drift without making changes. Default: false |
Example Request
Section titled “Example Request”curl -X POST https://your-automem-instance/admin/sync \ -H "Authorization: Bearer YOUR_API_TOKEN" \ -H "X-Admin-Token: YOUR_ADMIN_TOKEN" \ -H "Content-Type: application/json" \ -d '{"batch_size": 32, "dry_run": false}'Security Model
Section titled “Security Model”Threat Model
Section titled “Threat Model”Admin operations can:
- Force expensive OpenAI API calls (re-embedding entire database)
- Trigger resource-intensive enrichment reprocessing
- Access operational metrics (enrichment statistics)
Without admin token protection, a compromised API token could:
- Generate thousands of dollars in OpenAI costs via repeated re-embedding
- Overload enrichment workers with duplicate jobs
- Enumerate all memory IDs via reprocess endpoint
Best Practices
Section titled “Best Practices”| Practice | Rationale | Implementation |
|---|---|---|
| Separate tokens | Limits blast radius of API token compromise | Use different values for AUTOMEM_API_TOKEN and ADMIN_API_TOKEN |
| Rotate periodically | Reduces window of exposure | Regenerate tokens monthly, update all clients |
| Restrict admin access | Minimize privilege escalation risk | Share admin token only with operations team |
| Use headers, not query params | Prevents token leakage in logs | Prefer Authorization: Bearer and X-Admin-Token headers |
| Monitor admin operations | Detect anomalous usage | Alert on high-frequency /admin/reembed calls |
| Audit admin calls | Forensic capability | Log admin operations with IP, timestamp, token hash |
Operational Patterns
Section titled “Operational Patterns”Enrichment Pipeline Recovery
Section titled “Enrichment Pipeline Recovery”After spaCy model upgrades or enrichment logic changes:
# 1. Get all memory IDs that need reprocessingMEMORY_IDS=$(curl -s "https://your-instance/recall?limit=100&query=*" \ -H "Authorization: Bearer $TOKEN" | jq -r '[.results[].memory.memory_id]')
# 2. Reprocess themcurl -X POST https://your-instance/enrichment/reprocess \ -H "Authorization: Bearer $TOKEN" \ -H "X-Admin-Token: $ADMIN_TOKEN" \ -H "Content-Type: application/json" \ -d "{\"ids\": $MEMORY_IDS}"Embedding Model Migration
Section titled “Embedding Model Migration”When switching from text-embedding-3-small (1024-d, set by VECTOR_SIZE) to text-embedding-3-large (3072-d):
- Update
VECTOR_SIZEenvironment variable - Recreate the Qdrant collection with new dimensions
- Run
/admin/reembedwithbatch_size=50to regenerate all embeddings
Disaster Recovery from Qdrant Loss
Section titled “Disaster Recovery from Qdrant Loss”If Qdrant data is corrupted or lost but FalkorDB is intact:
curl -X POST https://your-automem-instance/admin/reembed \ -H "Authorization: Bearer YOUR_API_TOKEN" \ -H "X-Admin-Token: YOUR_ADMIN_TOKEN" \ -H "Content-Type: application/json" \ -d '{"batch_size": 50}'See Operations / Health for complete recovery procedures.