# deet - Complete Project Documentation
Generated: 2026-02-26T09:31:53.146Z

================================================================================
TABLE OF CONTENTS
================================================================================
 1. OVERVIEW
 2. CORE CONCEPTS
 3. LANGUAGE SPECIFICATION (.dt)
 4. TYPE SYSTEM
 5. OPERATORS AND EXPRESSIONS
 6. IR AND GRAPH MODEL
 7. BACKENDS AND SQL EMISSION
 8. STANDARD LIBRARY
 9. POLICY PROFILES AND GUARDRAILS
10. CLI COMMANDS
11. TESTING FRAMEWORK
12. LSP AND EDITOR INTEGRATION
13. CANONICAL ARTIFACTS
14. BUILD AND DEPLOYMENT
15. PROJECT CONTEXT (.deet)
16. ERROR HANDLING AND DIAGNOSTICS
17. FILE STRUCTURE
18. CODE EXAMPLES
19. API REFERENCE
20. WORKSPACE AND SHARING
21. GETTING HELP

================================================================================
1. OVERVIEW
================================================================================

deet (also known as RXG, the Relational eXpression Graph) is a typed,
versioned, testable relational programming system whose source of truth is a
dependency graph of relational expressions, rendered through multiple lenses
(grid, code, SQL) and compiled to existing execution engines.

CORE PHILOSOPHY:
- Type-safe transformations at compile time
- Deterministic, reproducible builds
- Multi-backend compilation from a single source
- Data contracts and quality guardrails
- Version control for data pipelines
- LLM-friendly structured outputs and context files

KEY FEATURES:
- Native TypeScript integration for type safety
- Support for multiple data backends (DuckDB, PostgreSQL, BigQuery, Snowflake)
- Declarative pipeline definitions
- Automatic type checking and inference
- Integration with cloud platforms (Vercel, AWS, GCP)
- Shareable project templates
- Workspace collaboration features

================================================================================
2.
CORE CONCEPTS
================================================================================

RELATION:
A table-like value with a schema (columns, types, nullability). The
fundamental data unit in deet.

NODE:
An operator application that produces a relation. Nodes form the vertices of
the dependency graph.

DAG (Directed Acyclic Graph):
The dependency graph connecting all nodes. Used for:
- Incremental compilation
- Change detection
- Optimization planning

MODEL:
A named node suitable for consumption by downstream transformations or as an
output/export. Models are the public interface of a deet program.

SOURCE:
A named node representing external data. Sources are the entry points for
data into a deet pipeline.

LENS:
A visualization or interaction mode for a node. Available lenses:
- Code: deet language representation
- Grid: Tabular data view
- SQL: Native SQL expression
- Metadata: Schema and statistics

MACRO:
A reusable transformation pattern. Macros reduce code duplication and improve
maintainability.

TEST:
Data-driven assertions that verify model outputs. Tests run during
compilation and in CI/CD pipelines.

================================================================================
3. LANGUAGE SPECIFICATION (.dt)
================================================================================

SYNTAX OVERVIEW:

  module <name>

  source <name> = table("<table_name>") : {
    <column_name>: <type>
    ...
  }

  model <name> = <expression>

  macro <name>(<params>) = <expression>

  test <name> {
    given:  { ... }
    expect: { ... }
  }

LANGUAGE CONSTRUCTS:

Comments:
  // Single-line comment
  /* Multi-line comment */

String Literals:
  "string with double quotes"
  'string with single quotes'

Number Literals:
  42     // Integer
  3.14   // Float
  -100   // Negative

Boolean Literals: true, false

Null: null   // NULL value

Identifiers:
- Must start with a letter or underscore
- Can contain letters, numbers, underscores
- Case-sensitive

Reserved Keywords:
  source, model, macro, test, module, null, true, false, if, then, else,
  match, case, select, filter, derive, aggregate, join, union, except,
  intersect, order, limit, offset, distinct, group, window

================================================================================
4. TYPE SYSTEM
================================================================================

PRIMITIVE TYPES:
  i8, i16, i32, i64    // Signed integers
  u8, u16, u32, u64    // Unsigned integers
  f32, f64             // Floating point
  bool                 // Boolean
  string               // UTF-8 string
  timestamp            // DateTime with timezone
  date                 // Date only
  time                 // Time only
  decimal(p, s)        // Fixed-point decimal (precision, scale)
  binary               // Raw bytes

CONTAINER TYPES:
  list<T>              // Ordered collection
  struct<...>          // Named fields
  map<K, V>            // Key-value pairs
  union<...>           // Discriminated union (one of multiple types)

NULLABILITY:
  T                    // Non-null type
  T?                   // Nullable type

Examples:
  string               // Non-null string
  string?              // Nullable string
  list<i32>            // Non-null list of integers
  list<i32>?           // Nullable list
  list<i32?>           // List of nullable integers

TYPE INFERENCE:
deet infers types based on:
- Source schema definitions
- Operator semantics
- Literal values
- Function signatures

Type Checking Modes:
  strict    // Reject all type errors
  partial   // Warn on ambiguous types
  minimal   // Allow implicit conversions

================================================================================
5. OPERATORS AND EXPRESSIONS
================================================================================

PIPELINE OPERATOR: |> (pipe)
Passes the result of the left expression as input to the right expression.
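Conceptually, `|>` is left-to-right function application over relations. A minimal Python sketch of the same idea, using a list-of-dicts as a stand-in for a relation (the `pipe` helper and the toy operators below are illustrative, not deet APIs):

```python
from functools import reduce

def pipe(value, *steps):
    """Apply each step to the running value, left to right, like |>."""
    return reduce(lambda acc, step: step(acc), steps, value)

# Toy relational operators over a list-of-dicts "relation".
def filter_rows(pred):
    return lambda rows: [r for r in rows if pred(r)]

def select(*cols):
    return lambda rows: [{c: r[c] for c in cols} for r in rows]

users = [
    {"id": 1, "name": "Alice", "status": "active"},
    {"id": 2, "name": "Bob", "status": "inactive"},
]

result = pipe(
    users,
    filter_rows(lambda r: r["status"] == "active"),
    select("id", "name"),
)
print(result)  # [{'id': 1, 'name': 'Alice'}]
```

Each stage consumes the previous stage's output, which is why a chain of `|>` steps maps naturally onto nodes and edges of the dependency graph.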
Example:
  users
  |> filter(status = 'active')
  |> select(id, name, email)
  |> order(created_at desc)

FILTERING:

filter(<condition>)
Keeps rows where the condition is true.
Example:
  users |> filter(age > 18 AND status != 'inactive')

TRANSFORMATION:

select(<columns>)
Projects specific columns.
Example:
  users |> select(id, name)

derive(<name>: <expression>, ...)
Adds computed columns.
Example:
  users |> derive(
    full_name: concat(first_name, ' ', last_name),
    is_adult: age >= 18
  )

AGGREGATION:

aggregate(<group_keys>, <aggregations>)
Computes aggregate functions.
Example:
  orders |> aggregate(
    customer_id,
    {
      total_spent: sum(amount),
      order_count: count(*),
      avg_order_value: avg(amount)
    }
  )

JOINS:

join(<type>, <relation>, <condition>)
Combines rows from multiple relations.
Example:
  users |> join(
    inner,
    orders,
    users.id = orders.user_id
  )

Join Types: inner, left, right, full, cross

SET OPERATIONS:

union, intersect, except
Combine or compare result sets.
Example:
  active_users |> union(newly_activated_users)

================================================================================
6. IR AND GRAPH MODEL
================================================================================

INTERMEDIATE REPRESENTATION (IR):
The internal representation of a deet program:
- Normalized AST
- Type-annotated nodes
- Optimized expression tree
- Metadata for each node

DEPENDENCY GRAPH:
Nodes represent operations; edges represent data flow. The acyclic structure
ensures:
- No circular dependencies
- Deterministic execution order
- Efficient incremental compilation

IR OPTIMIZATION:
- Constant folding
- Dead code elimination
- Predicate pushdown
- Column pruning
- Common subexpression elimination

SERIALIZATION:
IR can be serialized to JSON for:
- Transport to backend compilers
- Caching and memoization
- Version control and diffing
- Integration with external tools

================================================================================
7.
BACKENDS AND SQL EMISSION
================================================================================

SUPPORTED BACKENDS:

DuckDB (duckdb)
- In-process OLAP database
- Zero external dependencies
- Well suited to local development
- Excellent for analytical queries
- Deployment: single binary executable

PostgreSQL (postgres)
- Production-grade relational database
- Complex joins and aggregations
- Native JSON support
- Deployment: Docker, RDS, self-hosted

BigQuery (bigquery)
- Cloud-native data warehouse
- Petabyte-scale analytical queries
- Cost-effective for large datasets
- Deployment: Google Cloud Platform

Snowflake (snowflake)
- Cloud data platform
- Optimized for concurrent queries
- Cross-cloud deployment
- Deployment: multi-cloud

SQL EMISSION:
deet compiles to backend-specific SQL:
1. Parse the .dt program
2. Build the dependency graph
3. Optimize for the target backend
4. Emit backend-specific SQL
5. Execute on the target system

SQL Features Supported:
- Complex CTEs (WITH clauses)
- Window functions
- Lateral joins
- Recursive queries (backend permitting)
- UDF invocation

BACKEND-SPECIFIC FEATURES:

DuckDB:
- LIST and STRUCT types
- JSON operators (JSON_EXTRACT, JSON_TRANSFORM)
- Native Parquet/CSV reading

PostgreSQL:
- JSON/JSONB operators
- Full-text search
- Window functions
- Custom functions (PL/pgSQL)

BigQuery:
- ARRAY and STRUCT types
- UNNEST and CROSS JOIN for arrays
- Partitioned tables
- Clustering

Snowflake:
- Dynamic tables
- Iceberg table format
- Streams and tasks
- Native object features

================================================================================
8. STANDARD LIBRARY
================================================================================

STRING FUNCTIONS:
concat(s1, s2, ...)
-> string                   Concatenate multiple strings
length(s) -> i64                      Return string length
substring(s, start, length) -> string Extract substring
upper(s) -> string                    Convert to uppercase
lower(s) -> string                    Convert to lowercase
trim(s) -> string                     Remove leading/trailing whitespace
replace(s, find, replace) -> string   Replace substring occurrences
split(s, delimiter) -> list<string>   Split string by delimiter

NUMERIC FUNCTIONS:
abs(n) -> numeric                     Absolute value
round(n, decimals) -> numeric         Round to N decimal places
ceil(n) -> i64                        Ceiling function
floor(n) -> i64                       Floor function
sqrt(n) -> f64                        Square root
pow(base, exponent) -> numeric        Power function
mod(n, divisor) -> numeric            Modulo operation

DATE/TIME FUNCTIONS:
now() -> timestamp                    Current timestamp
current_date() -> date                Current date
year(ts) -> i64                       Extract year
month(ts) -> i64                      Extract month
day(ts) -> i64                        Extract day
datediff(unit, from, to) -> i64       Difference between dates
date_add(date, interval) -> date      Add interval to a date

AGGREGATE FUNCTIONS:
count(*) -> i64                       Count rows
count(col) -> i64                     Count non-null values
sum(col) -> numeric                   Sum of values
avg(col) -> f64                       Average of values
min(col) -> same as col               Minimum value
max(col) -> same as col               Maximum value
stddev(col) -> f64                    Standard deviation
variance(col) -> f64                  Variance
percentile(col, p) -> numeric         Percentile calculation
array_agg(col) -> list<T>             Collect values into a list
string_agg(col, sep) -> string        Concatenate strings with a separator

WINDOW FUNCTIONS:
row_number() OVER (...)               Sequential row numbers
rank() OVER (...)                     Rank with gaps
dense_rank() OVER (...)               Rank without gaps
lag(col) OVER (...)                   Previous row value
lead(col) OVER (...)                  Next row value
first_value(col) OVER (...)           First value in window
last_value(col) OVER (...)            Last value in window

================================================================================
9.
POLICY PROFILES AND GUARDRAILS
================================================================================

SECURITY POLICIES:

Access Control:
- Row-level security (RLS)
- Column-level encryption
- Data masking for sensitive columns
- Audit logging of all operations

Data Governance:
- Lineage tracking
- Metadata management
- Retention policies
- Compliance certifications

QUALITY GUARANTEES:

Schema Contracts:
- Explicit column definitions
- Type contracts between components
- Null-safety enforcement
- Breaking-change detection

Testing:
- Unit tests for transformations
- Integration tests for pipelines
- Data quality assertions
- Regression test suites

OPERATIONAL SAFEGUARDS:

Determinism:
- Reproducible builds
- Consistent results across runs
- Time-based ordering for determinism
- Version pinning for dependencies

Monitoring:
- Performance metrics
- Transformation latency
- Data freshness tracking
- Error rate monitoring

================================================================================
10. CLI COMMANDS
================================================================================

PROJECT INITIALIZATION:

deet init
  Create a new project structure
deet init --template <name>
  Create a project from a template

VALIDATION AND COMPILATION:

deet check
  Typecheck and lint without compiling
  Output:
  - Type errors
  - Unused variables
  - Performance warnings
  - Lint violations

deet compile [--target <backend>]
  Compile to the target backend(s)
  Generates:
  - dist/compiled.sql
  - dist/manifest.json
  - dist/<model>.sql per model

EXECUTION:

deet run <model> [--limit N]
  Execute a model and display results

deet test
  Run the test suite
  Output:
  - Pass/fail status
  - Coverage metrics
  - Performance benchmarks

DEPLOYMENT:

deet deploy [--target <backend>] [--env <environment>]
  Deploy to production
  Steps:
  1. Validate configuration
  2. Compile for the target
  3. Run smoke tests
  4. Deploy to the backend
  5. Verify correctness

deet deploy --dry-run
  Preview a deployment without applying it

DEBUGGING:

deet debug <model>
  Interactive debugger
  Features:
  - Step through execution
  - Inspect intermediate values
  - View query plans

deet explain <model>
  Show the execution plan
  Output:
  - Optimized SQL
  - Index usage
  - Join order
  - Estimated costs

deet profile <model>
  Measure execution performance
  Metrics:
  - Total time
  - Per-step timing
  - Memory usage
  - I/O statistics

================================================================================
11. TESTING FRAMEWORK
================================================================================

TEST STRUCTURE:

  test <name> {
    given: {
      <source>: [
        { col1: val1, col2: val2 },
        ...
      ]
    },
    expect: {
      <model>: [
        { col1: val1, col2: val2 },
        ...
      ]
    }
  }

ASSERTION TYPES:
- Exact Match: expect rows to exactly match the specification
- Contains: expect rows containing a subset of the results
- Not Null: verify column values are non-null
- Count: test aggregate counts
- Types: verify column types

RUNNING TESTS:

  deet test               # Run all tests
  deet test <name>        # Run a specific test
  deet test --watch       # Re-run on changes
  deet test --coverage    # Coverage report

TEST COVERAGE:
- Statement coverage
- Branch coverage
- Data path coverage
- Integration coverage

================================================================================
12.
LSP AND EDITOR INTEGRATION
================================================================================

LANGUAGE SERVER PROTOCOL (LSP):
Features:
- Code completion
- Hover documentation
- Go to definition
- Find references
- Rename symbols
- Quick fixes for errors
- Diagnostic messages

VS CODE EXTENSION:
Features:
- Syntax highlighting
- IntelliSense
- Code snippets
- Integrated terminal
- Preview pane
- Theme integration

SNIPPETS:

source snippet:
  source ${1:name} = table("${2:table_name}") : {
    ${3:column}: ${4:type}
  }

model snippet:
  model ${1:name} = ${2:source}
  |> filter(${3:condition})

join snippet:
  |> join(
    ${1:inner},
    ${2:right_table},
    ${3:left.id} = ${4:right.id}
  )

KEYBOARD SHORTCUTS:
  Ctrl+Space        Code completion
  Ctrl+Shift+Space  Function signature
  F12               Go to definition
  Ctrl+H            Find and replace
  Alt+F12           Peek definition
  Ctrl+K Ctrl+X     Format code
  Ctrl+Shift+B      Build

================================================================================
13. CANONICAL ARTIFACTS
================================================================================

PROJECT MANIFEST (manifest.json):

  {
    "name": "string",
    "version": "semver",
    "generated": "ISO8601",
    "targets": {
      "<backend>": {
        "sql": "string",
        "parameters": {},
        "dependencies": [],
        "models": []
      }
    },
    "sources": [
      { "name": "string", "table": "string", "schema": { "columns": [] } }
    ],
    "models": [
      { "name": "string", "query": "string", "schema": { "columns": [] } }
    ],
    "tests": [
      { "name": "string", "status": "pass|fail", "assertions": [] }
    ],
    "metadata": {
      "build_time_ms": number,
      "source_hash": "string",
      "compiler_version": "string"
    }
  }

DEPENDENCY GRAPH (graph.json):

  {
    "nodes": [
      {
        "id": "string",
        "type": "source|model|macro",
        "name": "string",
        "dependencies": ["id"],
        "metadata": {}
      }
    ],
    "edges": [
      { "from": "string", "to": "string", "label": "string" }
    ]
  }

SCHEMA ARTIFACTS: dist/<model>.schema.json

  {
    "model": "string",
    "columns": [
      { "name": "string", "type": "string", "nullable": boolean }
    ]
  }
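Because graph.json is acyclic, any consumer can recover a deterministic build order with a topological sort. A minimal Python sketch using the standard library (the node ids and names below are invented for illustration):

```python
import json
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# A small graph.json fragment in the documented shape.
graph = json.loads("""
{
  "nodes": [
    {"id": "src_users", "type": "source", "name": "users",
     "dependencies": []},
    {"id": "m_active", "type": "model", "name": "active_users",
     "dependencies": ["src_users"]},
    {"id": "m_metrics", "type": "model", "name": "user_metrics",
     "dependencies": ["m_active"]}
  ]
}
""")

# Map each node id to its dependency ids and sort dependencies-first.
# TopologicalSorter raises CycleError if the graph is not a DAG,
# which is the property deet relies on for deterministic builds.
deps = {n["id"]: set(n["dependencies"]) for n in graph["nodes"]}
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['src_users', 'm_active', 'm_metrics']
```

Sources come out first, then models in dependency order; this is also the order in which incremental compilation can skip unchanged subtrees.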
================================================================================
14. BUILD AND DEPLOYMENT
================================================================================

BUILD PROCESS:

1. Parsing
   - Tokenize the .dt source
   - Build the AST
   - Syntax validation

2. Type Checking
   - Infer column types
   - Verify type safety
   - Check nullability

3. Optimization
   - Constant folding
   - Dead code elimination
   - Predicate pushdown
   - Column pruning

4. Code Generation
   - Backend-specific SQL
   - Parameter binding
   - Index hints
   - Explain output

5. Validation
   - Semantic checks
   - Policy enforcement
   - Performance hints
   - Security analysis

DEPLOYMENT PROCESS:

1. Pre-deployment Checks
   - Configuration validation
   - Credentials verification
   - Backend connectivity
   - Schema compatibility

2. Staging
   - Build for the target backend
   - Create a deployment package
   - Run smoke tests
   - Capture a metrics baseline

3. Execution
   - Execute migration scripts
   - Deploy models
   - Verify results
   - Update metadata

4. Post-deployment
   - Health checks
   - Monitor performance
   - Alert on anomalies
   - Enable rollback capability

ROLLBACK:

deet deploy --rollback --version <version>
- Restore the previous version
- Verify correctness
- Monitor metrics

================================================================================
15. PROJECT CONTEXT (.deet)
================================================================================

The .deet file is a JSON context snapshot for LLMs and tooling.
It captures:
- Project metadata and versions
- Directory layout and models
- Data sources and required env vars (names only)
- Build targets and policy profile
- Optional goals and KPIs

GENERATE:
  deet context
  deet context path/to/.deet
  deet init my-project --context

CONTEXT TEMPLATE:

  {
    "project": {
      "name": "analytics-platform",
      "description": "Core transformations for product analytics",
      "version": "0.1.0",
      "rxg_version": "0.2.0",
      "created_at": "2026-01-04T00:00:00.000Z",
      "last_updated": "2026-01-04T00:00:00.000Z"
    },
    "directory_structure": {
      "root": "/path/to/analytics-platform",
      "source_dir": "src",
      "test_dir": "tests",
      "output_dir": "dist"
    },
    "data_sources": [],
    "models": [],
    "dependencies": { "std": "^0.2.0" },
    "environment": { "variables": [] },
    "build_config": {
      "target": "duckdb",
      "enabled_targets": ["duckdb"],
      "policy_profile": "dev"
    },
    "metrics": {},
    "development": {
      "framework": "deet/RXG",
      "package_manager": "npm",
      "test_framework": "DuckDB"
    }
  }

NOTES:
- No secret values are stored; sensitive env vars are marked.
- Update the snapshot after adding sources, models, or build targets.

================================================================================
16.
ERROR HANDLING AND DIAGNOSTICS
================================================================================

STRUCTURED ERROR FORMAT:
All API errors follow this structure:

  {
    "code": "ERROR_CODE",
    "message": "Human-readable error message",
    "suggestions": [
      "Actionable suggestion 1",
      "Actionable suggestion 2"
    ],
    "context": {
      "field": "value",
      "line": 42,
      "column": 10
    },
    "timestamp": "2024-01-04T12:00:00Z"
  }

COMMON ERROR CODES:

UNAUTHORIZED (401)
  User not authenticated or token expired
  Suggestions:
  - Log in with your account
  - Refresh the page and try again
  - Check that the session is valid

DATABASE_ERROR (500)
  Database operation failed
  Suggestions:
  - Check the database connection string
  - Verify the database server is running
  - Review database logs

VALIDATION_ERROR (400)
  Invalid input provided
  Suggestions:
  - Review the API documentation
  - Check field formats
  - Ensure all required fields are present

NOT_FOUND (404)
  Requested resource does not exist
  Suggestions:
  - Verify the resource ID
  - Check that the resource was created
  - Confirm access permissions

COMPILER_ERROR (400)
  Source code compilation failed
  Suggestions:
  - Check .dt syntax
  - Run `deet check` for diagnostics
  - Review type annotations
  - Verify all sources are defined

TYPE_ERROR (400)
  Type mismatch in a transformation
  Suggestions:
  - Check that column types match
  - Review null-safety constraints
  - Use type casts if needed

MISSING_CONFIG (500)
  Required configuration missing
  Suggestions:
  - Create rxg.toml in the project root
  - Define the backend connection in rxg.toml
  - Run deet init to generate defaults

INVALID_SYNTAX (400)
  Language syntax error
  Suggestions:
  - Review the language documentation
  - Check bracket matching
  - Look at example code

================================================================================
17.
FILE STRUCTURE
================================================================================

Typical Project Layout:

  project-root/
  ├── .deet                      # LLM context snapshot
  ├── rxg.toml                   # Package manifest
  ├── src/
  │   ├── sources/
  │   │   └── users.dt           # User data source
  │   ├── models/
  │   │   ├── active_users.dt    # Active users model
  │   │   └── user_metrics.dt    # Metrics model
  │   └── macros/
  │       └── date_utils.dt      # Reusable macros
  ├── tests/
  │   ├── active_users_test.dt
  │   └── metrics_test.dt
  ├── dist/
  │   ├── compiled.sql           # Final compiled SQL
  │   ├── manifest.json          # Build manifest
  │   ├── graph.json             # Dependency graph
  │   └── models/
  │       ├── active_users.sql
  │       └── user_metrics.sql
  ├── docs/
  │   └── README.md
  └── .gitignore

FILE PURPOSES:

.deet
  JSON context snapshot for LLMs and tooling
  Captures: metadata, layout, models, build targets

rxg.toml
  Package manifest with dependencies and metadata
  Defines: name, version, dependencies, exports

src/
  Source code directory containing .dt files

dist/
  Build output directory
  Generated during compilation
  Should not be committed to version control

tests/
  Test definitions and fixtures

docs/
  Documentation files (optional)

================================================================================
18.
CODE EXAMPLES
================================================================================

BASIC PIPELINE:

  module ecommerce

  source orders = table("orders") : {
    id: i64
    customer_id: i64
    amount: f64
    status: string
    created_at: timestamp
  }

  model completed_orders = orders
  |> filter(status = 'completed')

  model order_summary = orders
  |> aggregate(
    customer_id,
    {
      total_spent: sum(amount),
      order_count: count(*),
      last_order: max(created_at)
    }
  )

TYPE-SAFE TRANSFORMATIONS:

  model customer_lifetime_value = orders
  |> filter(status = 'completed')
  |> aggregate(
    customer_id,
    {
      clv: sum(amount),
      order_count: count(*),
      first_order: min(created_at),
      last_order: max(created_at)
    }
  )
  |> derive(
    days_active: datediff('day', first_order, last_order),
    avg_order_value: clv / order_count,
    status: if(clv > 10000, 'vip', 'standard')
  )

COMPLEX JOINS:

  source customers = table("customers") : {
    id: i64
    name: string
    email: string?
    created_at: timestamp
  }

  source orders = table("orders") : {
    id: i64
    customer_id: i64
    amount: f64
  }

  model customer_orders = customers
  |> join(
    left,
    orders,
    customers.id = orders.customer_id
  )
  |> select(
    customers.id,
    customers.name,
    orders.amount
  )

TESTING EXAMPLE:

  test active_users_test {
    given: {
      users: [
        { id: 1, name: 'Alice', status: 'active', email: 'alice@example.com' },
        { id: 2, name: 'Bob', status: 'inactive', email: null }
      ]
    },
    expect: {
      active_users: [
        { id: 1, name: 'Alice', email: 'alice@example.com' }
      ]
    }
  }

================================================================================
19. API REFERENCE
================================================================================

Project Endpoints:

GET /api/projects
  List user projects
  Response: { projects: Project[] }

POST /api/projects
  Create a new project
  Body: { name, description?, github_repo?, github_branch?
}
  Response: { project: Project } (201)

GET /api/projects/[id]
  Get project details
  Response: { project: Project }

Template Endpoints:

GET /api/templates
  List available templates
  Response: { templates: Template[] }

GET /api/templates/[id]
  Get template details
  Response: { template: Template }

POST /api/templates/[id]/clone
  Clone a template into a workspace
  Body: { project_name, target_workspace_id? }
  Response: { project: Project } (201)

Pipeline Endpoints:

POST /api/projects/[id]/pipeline-runs
  Start a pipeline run
  Body: { model?, dry_run? }
  Response: { run: PipelineRun } (201)

GET /api/pipeline-runs/[id]
  Get run details
  Response: { run: PipelineRun, logs: string[] }

Preview/Development:

POST /api/preview-runs/submit
  Submit a preview run
  Body: { code, backend, model? }
  Response: { run: PreviewRun } (201)

GET /api/preview-runs/[id]
  Get preview run results
  Response: { run: PreviewRun, result: QueryResult }

================================================================================
20. WORKSPACE AND SHARING
================================================================================

WORKSPACE FEATURES:

Project Sharing:
- Share projects with team members
- Set granular permissions (view, edit, execute)
- Track who accessed what and when
- Revoke access at any time

Templates:
- Create reusable project templates
- Share templates with your organization
- Clone templates to create new projects
- Customize them for specific use cases

Collaboration:
- Real-time code editing (coming soon)
- Comments and suggestions on models
- Activity feed and notifications
- Version history and rollback

PERMISSIONS:

View
- Read the project definition
- View results
- Cannot modify

Execute
- Run pipelines
- Preview models
- Cannot modify definitions

Edit
- Modify the project definition
- Commit changes
- Run deployments

Admin
- Manage project settings
- Control sharing and permissions
- Delete the project

================================================================================
21.
GETTING HELP
================================================================================

DOCUMENTATION:

Online Docs: https://deet.sh/docs
- Tutorials
- Language reference
- API documentation
- Best practices

Examples: https://deet.sh/templates
- Pre-built templates
- Code examples
- Integration examples

COMMUNITY:

GitHub Issues: https://github.com/Hmbown/deet/issues
- Bug reports
- Feature requests
- Community discussions

Discord: https://discord.gg/deet
- Live chat with the team
- Peer support
- Office hours

CONTACT:

Email: support@deet.sh
  For technical support and questions
Sales: sales@deet.sh
  For enterprise licenses and deployments

STATUS & ROADMAP:

Current Version: 0.8.0
Status: Feature complete
Next: 1.0.0 production release

Recent Features:
- Template sharing and cloning
- Workspace collaboration
- Cloud deployments
- Error diagnostics

Upcoming Features:
- Real-time collaboration
- Advanced monitoring
- Custom functions
- Machine learning integration

================================================================================
END OF DOCUMENTATION
================================================================================
Last Updated: 2026-02-26T09:31:53.146Z
For the latest version, visit: https://deet.sh/llms-full.txt