Token Intelligence (Atlas)
Token Intelligence, powered by Atlas, is a local-first code knowledge graph system that runs entirely on your machine. It extracts semantic information from your codebase once, then provides that structured data to agents instead of requiring raw file reads, reducing token usage and improving response quality.
What is Atlas?
Atlas is a semantic code intelligence engine that:
- Builds a local knowledge graph from your entire codebase
- Extracts code structure (functions, classes, variables, imports, calls, types)
- Models relationships between symbols (calls, references, extensions, implementations)
- Stores everything in an SQLite database at .atlas/atlas.db
- Runs 100% locally with no cloud processing or data transmission
- Exposes a Model Context Protocol (MCP) server so agents get surgical context
- Structured node/edge metadata (start/end line, visibility, type info)
- Semantic search results ranked by relevance
- Call graphs, type hierarchies, and data flow paths
- Per-symbol code snippets instead of whole files
How Indexing Works
Initial Indexing
When you enable Token Intelligence for a project, Tempest:
- Spawns node .../atlas/dist/mcp/server-entry.js --init --path <project>
- Initializes the .atlas/ directory structure
- Creates an empty SQLite database at .atlas/atlas.db
- Begins scanning the project root
- Discovers all source files matching supported extensions
- Applies .gitignore rules and default ignore patterns
- Respects custom exclusions in atlas.json (project root)
- Skips vendor directories, build output, caches, test resources, Android resource directories, and other generated content by default
Indexed by default:
- All recognized source code files (.ts, .js, .py, .go, .rs, .java, .cpp, .cs, .php, .rb, .swift, .kt, .dart, .lua, .vue, .svelte, .astro, and 20+ more)
- Configuration files (.json, .yaml, .toml, .xml) that define routes or dependencies
- Framework files (routes, modules, decorators, resolvers)
Skipped by default:
- node_modules/, .venv/, venv/, target/, vendor/, dist/, build/, and ~30 other dependency/build directories
- Android resource directories (res/layout, res/values, res/drawable, etc.)
- Test/spec files (unless explicitly included in the query)
- Generated code (detected by heuristics: @generated, autogenerated, do not edit markers)
- Files larger than 1 MB (minified bundles, compiled assets)
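The default filters above can be sketched as a single predicate. This is a minimal illustration, not Atlas's implementation: the function name should_index, the shortened extension and directory sets, and the idea of checking only the head of the file for generated-code markers are all assumptions. It also leaves out .gitignore and atlas.json handling.

```python
from pathlib import PurePosixPath

# Illustrative subsets only; the real lists are longer ("20+ more", "~30 other").
SOURCE_EXTENSIONS = {".ts", ".js", ".py", ".go", ".rs", ".java"}
SKIPPED_DIRS = {"node_modules", ".venv", "venv", "target", "vendor", "dist", "build"}
GENERATED_MARKERS = ("@generated", "autogenerated", "do not edit")
MAX_FILE_BYTES = 1024 * 1024  # files larger than 1 MB are skipped


def should_index(rel_path: str, size: int, head: str) -> bool:
    """Return True if a file passes the default filters sketched here.

    rel_path: path relative to the project root, using forward slashes
    size:     file size in bytes
    head:     the first few lines of the file (hypothetical input used to
              look for generated-code markers)
    """
    path = PurePosixPath(rel_path)
    if path.suffix not in SOURCE_EXTENSIONS:
        return False
    # Reject files that live under any skipped directory.
    if any(part in SKIPPED_DIRS for part in path.parts[:-1]):
        return False
    if size > MAX_FILE_BYTES:
        return False
    lowered = head.lower()
    return not any(marker in lowered for marker in GENERATED_MARKERS)
```

Ordering the cheap checks (extension, directory, size) before the content check means most rejected files never need to be read.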
Parsing & Extraction
Atlas uses tree-sitter (with WebAssembly grammars) to parse source files in parallel:
- A pool of worker threads parses multiple files concurrently
- Each parser extracts nodes (symbols) and edges (relationships)
Extracted data includes:
- Node kind (function, class, variable, import, route, component, etc.)
- Name and fully qualified name (e.g., src/utils.ts::MathHelper.calculateTotal)
- Location (file, start/end line, start/end column)
- Metadata (visibility, type parameters, return type, docstring, signature)
- Decorators and modifiers (async, static, abstract, exported)
Edge kinds tracked:
- contains (file contains class, class contains method)
- calls (function calls another)
- imports (file imports from another)
- exports (symbol exported from file)
- extends / implements (inheritance)
- references (generic reference)
- type_of (variable has type)
- returns (function return type)
- instantiates, overrides, decorates
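To make the node/edge model concrete, here is a small sketch of how the MathHelper.calculateTotal example might be represented. The Node and Edge classes, the "file" node kind, the line numbers, and the children helper are hypothetical and exist only for illustration; Atlas's internal types may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str            # e.g. "function", "class", "import"
    name: str
    qualified_name: str
    file_path: str
    start_line: int
    end_line: int


@dataclass(frozen=True)
class Edge:
    source: str          # qualified name of the source node (simplification)
    target: str          # qualified name of the target node
    kind: str            # e.g. "contains", "calls", "imports"


# Hypothetical nodes for a file src/utils.ts; line numbers are made up.
utils_file = Node("file", "utils.ts", "src/utils.ts", "src/utils.ts", 1, 40)
helper = Node("class", "MathHelper", "src/utils.ts::MathHelper", "src/utils.ts", 3, 38)
method = Node(
    "function",
    "calculateTotal",
    "src/utils.ts::MathHelper.calculateTotal",
    "src/utils.ts",
    10,
    20,
)

edges = [
    Edge(utils_file.qualified_name, helper.qualified_name, "contains"),
    Edge(helper.qualified_name, method.qualified_name, "contains"),
]


def children(parent: Node, all_edges: list[Edge]) -> list[str]:
    """Return qualified names of symbols directly contained in parent."""
    return [
        e.target
        for e in all_edges
        if e.source == parent.qualified_name and e.kind == "contains"
    ]
```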
- Typical scan: 100-1000 files/second (I/O bound)
- Typical parse: 1000-10000 files/second (CPU bound, parallelized)
- A 10k-file project typically indexes in 5-15 minutes on first run
- Database size: ~50-500 MB depending on complexity (heavily indexed repos on the high end)
Indexing progress is reported with these fields:
- phase: ‘scanning’ | ‘parsing’ | ‘storing’ | ‘resolving’
- current: Files processed so far
- total: Total files to process
- currentFile: (optional) Current file being parsed
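A consumer of these progress fields might render them as a status line. The sketch below assumes the four fields arrive together as one record; the IndexProgress class and format_progress function are hypothetical names, and current_file stands in for currentFile.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexProgress:
    phase: str                          # 'scanning' | 'parsing' | 'storing' | 'resolving'
    current: int                        # files processed so far
    total: int                          # total files to process
    current_file: Optional[str] = None  # file being parsed, if reported


def format_progress(p: IndexProgress) -> str:
    """Render a one-line status string, guarding against total == 0."""
    percent = 0 if p.total == 0 else round(100 * p.current / p.total)
    line = f"[{p.phase}] {p.current}/{p.total} ({percent}%)"
    if p.current_file:
        line += f" {p.current_file}"
    return line
```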
Incremental Sync
After indexing, Atlas can sync with file changes:
- atlas.sync() checks disk for added/modified/removed files
- Only re-parses changed files (fast path)
- Re-indexes only affected references
- Typical sync: < 1 second on small changes
atlas_explore tool.
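One way to detect added, modified, and removed files is to compare stored content hashes against what is on disk. The sketch below assumes that approach. The document only says the files table stores a SHA256 content_hash, so whether atlas.sync() actually uses it for change detection is an assumption, and diff_files is a hypothetical helper.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """SHA256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def diff_files(indexed: dict[str, str], on_disk: dict[str, bytes]):
    """Classify paths as added, modified, or removed.

    indexed: path -> hash stored at the last index run
    on_disk: path -> current file contents
    """
    added = sorted(p for p in on_disk if p not in indexed)
    removed = sorted(p for p in indexed if p not in on_disk)
    modified = sorted(
        p
        for p, data in on_disk.items()
        if p in indexed and indexed[p] != content_hash(data)
    )
    return added, modified, removed
```

Only the paths in added and modified would need re-parsing. Paths in removed would have their nodes and edges deleted.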
Reference Resolution
After extraction, Atlas performs multi-pass reference resolution:
- Import-based resolution: Follows import X from './file' to map names to definitions
- Framework-specific: React Routes, Express handlers, NestJS controllers, Laravel middleware, etc.
- Name-based matching: Falls back to symbol-name lookup in the same package
- Type hierarchy traversal: Finds inherited members through extends/implements chains
- Chained calls via conformance: Resolves method calls on protocol/interface implementations
- “Who calls this function?”
- “What does this class extend?”
- “Which routes are handled by this controller?”
- “What symbols are exported from this module?”
The AtlasIndexToast
The AtlasIndexToast is a React component shown in Tempest’s UI during initial indexing:
- Polls every 2 seconds for the existence of .atlas/atlas.db in the project
- Shows a spinner and “Indexing project” message while indexing runs
- Once the database file appears, displays “Index ready” and auto-dismisses after 2.5 seconds
- Users can manually dismiss at any time
server-entry.js --init) detached from the Tempest process. The toast doesn’t wait for the process to exit; it just watches for the database file to materialize, which happens early in the indexing run. The Tauri backend also streams stdout/stderr from the indexing process as atlas:log events so users see real-time progress in the logs panel.
How Atlas Reduces Token Usage
Raw file-based context sends entire files to the LLM:
- “Show me src/handlers.ts” → 500+ lines → 2000+ tokens per file
- “Show me 10 related files” → 20k tokens before any actual reasoning
- Agents must parse file structure themselves, extract only relevant pieces
- Duplicated context when multiple symbols from the same file are relevant
With Atlas, agents call MCP tools instead:
- atlas_explore "How does authentication work?" → Returns a focused subgraph with:
  - Only relevant files (5-8 instead of 20+)
  - Per-symbol code snippets (20-50 lines) instead of whole files
  - Symbol names, signatures, and locations
  - Relationships between symbols (who calls whom, what implements what)
  - ~300-800 tokens instead of 2000+
- atlas_node "Symbol/QualifiedName" → Returns just that symbol’s definition + immediate context
- atlas_graph "find_callers UserService.authenticate" → Returns call chain as a traversable graph
- atlas_search "authentication" → Returns FTS-ranked search results, top N only
- Token savings: 60-80% reduction in context tokens for typical agent queries
- Faster responses: Smaller context means faster inference
- Better accuracy: Agents reason about structure, not raw text parsing
- Cross-file awareness: Agents see relationships without reading every file
Data Storage
All Atlas data lives in the project at .atlas/.
Database Schema
The SQLite schema (in schema.sql) defines:
Nodes table:
- id: Primary key (hash of file path + qualified name)
- kind: Node type (function, class, import, route, etc.)
- name: Simple name (e.g., “calculateTotal”)
- qualified_name: Full path (e.g., “MathHelper.calculateTotal”)
- file_path: Relative to project root
- language: Detected language
- start_line, end_line, start_column, end_column: Location
- docstring, signature: Documentation and type info
- visibility, is_exported, is_async, is_static: Modifiers
- decorators, type_parameters, return_type: Extra metadata
- updated_at: Last modified timestamp
Edges table:
- source, target: Node IDs
- kind: Edge type (calls, imports, extends, etc.)
- metadata: JSON with context (line, column, parameter info)
- Unique index on (source, target, kind, line, col) to prevent duplicates
Files table:
- path: File path (primary key)
- content_hash: SHA256 of file contents
- language, size: File metadata
- modified_at, indexed_at: Timestamps
- node_count: Count of extracted symbols
- errors: JSON array of parse errors
Unresolved references:
- Tracks references waiting for resolution
- Clears after successful resolution pass
Full-text search:
- nodes_fts virtual table indexes name, qualified_name, docstring, signature
- Enables semantic search across the graph
Indexes:
- Indexes on kind, name, qualified_name, file_path, language
- Composite indexes on (file_path, start_line), (source, kind), (target, kind) for fast traversal
- UNIQUE index on edge identity to prevent duplicates
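As a rough illustration of how this schema supports a question like "who calls this function?", the sketch below builds a reduced version of the nodes and edges tables in an in-memory SQLite database. The column subset, the line and col columns on edges, the index names, the sample rows, and the find_callers helper are assumptions made for the example. The real schema.sql is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (
    id TEXT PRIMARY KEY,
    kind TEXT NOT NULL,
    name TEXT NOT NULL,
    qualified_name TEXT NOT NULL,
    file_path TEXT NOT NULL,
    start_line INTEGER,
    end_line INTEGER
);
CREATE TABLE edges (
    source TEXT NOT NULL REFERENCES nodes(id),
    target TEXT NOT NULL REFERENCES nodes(id),
    kind TEXT NOT NULL,
    line INTEGER,
    col INTEGER
);
-- Edge identity: the same call site is stored only once.
CREATE UNIQUE INDEX idx_edges_identity ON edges (source, target, kind, line, col);
-- Supports "find everything pointing at this node" traversals.
CREATE INDEX idx_edges_target_kind ON edges (target, kind);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("n1", "function", "authenticate", "UserService.authenticate", "src/user.ts", 10, 30),
    ("n2", "function", "login", "AuthController.login", "src/auth.ts", 5, 20),
    ("n3", "function", "refresh", "AuthController.refresh", "src/auth.ts", 22, 40),
])
conn.executemany("INSERT INTO edges VALUES (?, ?, ?, ?, ?)", [
    ("n2", "n1", "calls", 12, 8),
    ("n3", "n1", "calls", 30, 8),
])
# Re-inserting an identical edge is ignored thanks to the unique index.
conn.execute("INSERT OR IGNORE INTO edges VALUES ('n2', 'n1', 'calls', 12, 8)")


def find_callers(db: sqlite3.Connection, qualified_name: str) -> list[str]:
    """Return qualified names of functions with a 'calls' edge to the target."""
    rows = db.execute(
        """
        SELECT caller.qualified_name
        FROM edges
        JOIN nodes AS callee ON callee.id = edges.target
        JOIN nodes AS caller ON caller.id = edges.source
        WHERE callee.qualified_name = ? AND edges.kind = 'calls'
        ORDER BY caller.qualified_name
        """,
        (qualified_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

The (target, kind) composite index is what keeps this kind of reverse lookup cheap as the edge count grows.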
Database Configuration
Atlas uses SQLite in WAL (Write-Ahead Log) mode:
- Readers don’t block the writer
- Writers don’t block readers
- Multiple processes can connect simultaneously (MCP daemon + git hooks)
Pragma settings:
- journal_mode = WAL: Write-ahead logging
- synchronous = NORMAL: Safe with WAL
- busy_timeout = 5s: Wait up to 5 seconds if database is locked
- cache_size = 64 MB: Large page cache for fast queries
- mmap_size = 256 MB: Memory-mapped I/O for sequential scans
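For readers who want to reproduce these settings on a scratch database, the sketch below applies equivalent pragmas with Python's built-in sqlite3 module. SQLite expresses busy_timeout in milliseconds, a negative cache_size in KiB, and mmap_size in bytes, so the values are converted accordingly. The open_database helper and the temporary path are illustrative and are not how Atlas opens its own database.

```python
import os
import sqlite3
import tempfile


def open_database(path: str) -> sqlite3.Connection:
    """Open a SQLite database with WAL-oriented settings like those listed above."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode = WAL")
    conn.execute("PRAGMA synchronous = NORMAL")
    conn.execute("PRAGMA busy_timeout = 5000")    # 5 seconds, in milliseconds
    conn.execute("PRAGMA cache_size = -65536")    # negative value is KiB: 64 MiB
    conn.execute("PRAGMA mmap_size = 268435456")  # 256 MiB; the build may cap this
    return conn


# WAL needs a file-backed database; an in-memory one reports "memory" instead.
db_dir = tempfile.mkdtemp()
conn = open_database(os.path.join(db_dir, "atlas.db"))
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
```

Note that journal_mode persists in the database file, while the other pragmas apply per connection and must be set by every process that opens the database.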
- Small project (< 1k files): 10-50 MB
- Medium project (1k-10k files): 50-200 MB
- Large project (10k-100k files): 200-500 MB
- Very large (100k+ files): 500 MB-2+ GB
Per-Project Indexing
Each project has its own .atlas/ directory and database:
- Switching to a new project workspace automatically points Atlas to its .atlas/
- Multiple projects can be indexed simultaneously in separate processes
- The MCP daemon (when running) maintains one connection per project
- No shared index across projects
- Each workspace root needs its own index
- Use atlas.json at the workspace root to configure exclusions/extensions
Re-indexing
Indexing happens in three scenarios:
Automatic (on first enable)
When you enable Token Intelligence in Tempest:
- Tempest detects no .atlas/ directory
- Spawns atlas init --path <project>
- The AtlasIndexToast polls for .atlas/atlas.db
- Background indexing runs, user is notified via toast
Automatic (on file changes)
The file watcher installed by Atlas syncs automatically on detected changes:
- Debounced every 500ms
- Only re-parses changed files
- Happens in background without blocking the UI
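The debounce step can be pictured as collecting change events and releasing them as one batch once the file system has been quiet for the debounce window. The sketch below assumes a trailing debounce with an explicit clock so it stays deterministic. ChangeDebouncer and its methods are hypothetical names, and Atlas's watcher may behave differently.

```python
class ChangeDebouncer:
    """Collects changed paths and releases them after a quiet period."""

    def __init__(self, delay_seconds: float = 0.5):
        self.delay = delay_seconds
        self.pending: set[str] = set()
        self.last_event = 0.0

    def record(self, path: str, now: float) -> None:
        """Register a change event observed at time `now` (in seconds)."""
        self.pending.add(path)
        self.last_event = now

    def flush_if_due(self, now: float) -> list[str]:
        """Return the batch to sync if the quiet period has elapsed, else []."""
        if not self.pending or now - self.last_event < self.delay:
            return []
        batch = sorted(self.pending)
        self.pending.clear()
        return batch
```

Because pending is a set, a file saved several times within the window is re-parsed only once.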
Manual (via CLI or API)
When to Re-index
Re-index when:
- Enabling Token Intelligence for the first time
- After major framework/dependency updates (npm install, pip install)
- After local branch changes that rewrote history
- Atlas recommends it (run atlas status --json to check)
- You see persistent “database is locked” errors (may indicate corruption)
Re-indexing is not needed for:
- Normal code edits (auto-sync handles these)
- Switching branches with similar structure
- Temporary file changes
Supported Languages
Atlas extracts structure from 30+ languages:
Web & scripting:
- TypeScript, JavaScript, TSX, JSX
- Vue.js, Svelte, Astro
- Python, Ruby, PHP, Lua, Luau
- Go, Rust, C, C++, C#
- Java, Kotlin, Scala
- Swift, Objective-C
- Dart
- YAML, XML, JSON, Properties files
- Liquid (Jekyll templates)
- Twig (Symfony templates)
- Pascal
atlas.json to add custom mappings:
Configuration
Create atlas.json at your project root to configure indexing:
.gitignore but should be indexed anyway. Useful for vendored libraries you want in the graph.
exclude:
Gitignore-style patterns for files to skip, even if tracked in git. Useful for checked-in themes or SDKs that bloat the graph.
Both fields accept gitignore patterns: vendor/, **/*.min.js, src/generated/**, etc.
Graph Operations
Once indexed, agents can query the graph via MCP tools:
Search:
Performance Considerations
Indexing speed:
- Typical: 100-1000 files/second
- Parallelized across worker threads
- Bottleneck: tree-sitter parsing, not I/O or database writes
- Symbol lookup: < 1ms (indexed by name)
- Full-text search: 10-100ms depending on query selectivity
- Graph traversal: 10-500ms depending on depth
- File dependencies: < 100ms
- Parsing workers: ~100 MB each (WASM heap grows during parsing, shrinks after)
- Database: 50-200 MB resident (SQLite page cache)
- CLI tool: 200-500 MB during indexing
- Index database: 50 MB - 2 GB depending on project size
- WAL files: 0-500 MB (temporary during concurrent writes, cleaned after sync)
Troubleshooting
“Database is locked” errors:
- Check if another process is indexing (look for atlas processes)
- Ensure journal mode is WAL: sqlite3 .atlas/atlas.db "PRAGMA journal_mode;"
- If corrupted, remove .atlas/ and re-index
Slow indexing:
- Check for very large files (> 1 MB) that slow parsing
- Verify .gitignore is working (should exclude vendor dirs)
- Look for symbolic links to huge directories
- Use verbose: true to see per-file progress
Outdated extraction results:
- Run atlas index --path <project> to rebuild with latest extraction engine
Symbols missing from the index:
- Check file language detection (atlas status --json shows language distribution)
- Verify file is not in excluded patterns
- Ensure file is under project root
Daemon problems:
- Set ATLAS_NO_DAEMON=1 to run in direct mode
- Check .atlas/daemon.log for errors
- Ensure no stale daemon lock at .atlas/daemon.lock
What’s Next: Nexus
Tempest includes a Nexus page (a code graph visualization interface) that is currently a placeholder for future functionality. Planned features include:
- Interactive visualization of the code graph
- Node/edge filtering and search
- Call graph exploration UI
- Data flow visualization
- Type hierarchy browser