Technical Architecture — For Investors & Engineers

Inside the Cognitive Engine

Luca AI personas don't just respond — they perceive, reason, plan, act, and reflect. They dream at night, remember past interactions, and manage their own attention. This is how it works.

1 The Cognitive Loop

Every persona runs a continuous 5-phase cycle (PERCEIVE, REASON, PLAN, ACT, REFLECT), nominally every 10 seconds; the scheduling tiers in Section 6 vary this interval from 5 seconds to 5 minutes. This is not a chatbot. This is a thinking agent.


flowchart TD
    START([Cycle Start — every 10s]) --> P["PERCEIVE\n─────────\nCheck token budget\nQuery attention queue\nPrioritize by urgency"]
    P -->|No items| IDLE([Presence: Available\nWait for next cycle])
    P -->|Item found| R["REASON\n─────────\nEvaluate autonomy mode\nCheck confidence threshold\nApply channel policy"]
    R -->|Drop / Block| DROP([Item filtered\nAudit logged])
    R -->|Process| PL["PLAN\n─────────\nAcquire attention focus\nSet 30-min timeout\nPresence → Busy"]
    PL --> A["ACT\n─────────\nDispatch to workflow:\n• Email  • Chat  • Task\n• Tool   • Workspace"]
    A --> RF["REFLECT\n─────────\nIncrement counters\nWrite audit log\nRelease focus\nPresence → Resting"]
    RF --> COOLDOWN([3s context switch\n10s rest period])
    COOLDOWN --> START

    style P fill:#1e3a5f,stroke:#3b82f6,color:#e5e7eb
    style R fill:#2e1065,stroke:#a855f7,color:#e5e7eb
    style PL fill:#422006,stroke:#f59e0b,color:#e5e7eb
    style A fill:#052e16,stroke:#10b981,color:#e5e7eb
    style RF fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style START fill:#0f0f1a,stroke:#6366f1,color:#a5b4fc
    style IDLE fill:#0f0f1a,stroke:#374151,color:#6b7280
    style DROP fill:#0f0f1a,stroke:#374151,color:#6b7280
    style COOLDOWN fill:#0f0f1a,stroke:#374151,color:#6b7280

PERCEIVE

Checks the persona's attention queue for incoming work — emails, chat messages, tasks, tool requests. Token budget is checked first to prevent overspend.

REASON

Evaluates the item against the persona's autonomy level and confidence threshold (default 0.75). Decides: process, queue for human approval, or drop.

PLAN

Acquires attention focus (exclusive lock), sets a 30-minute timeout to prevent runaway processing, and changes presence to "busy".

ACT

Routes to the appropriate workflow handler — email, chat, task, tool execution, or workspace management. Each handler calls the LLM via the model bridge.

REFLECT

Logs what happened, updates counters, releases attention focus, and transitions through a rest period before the next cycle begins.
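The five phases above can be sketched as a single function call per cycle. This is a minimal illustration, not the production loop: the `Item` and `Persona` types, the integer priority scale, the per-item `confidence` field, and the rule that an item without a handler is dropped are all assumptions made for the example. Only the 0.75 threshold and the 30-minute focus timeout come from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

CONFIDENCE_THRESHOLD = 0.75   # default threshold quoted in the text
FOCUS_TIMEOUT_S = 30 * 60     # 30-minute focus timeout quoted in the text


@dataclass
class Item:                   # hypothetical queue entry
    kind: str                 # "email", "chat", "task", "tool", "workspace"
    priority: int             # assumed scale: higher means more urgent
    confidence: float         # assumed confidence score for the planned action


@dataclass
class Persona:                # hypothetical persona state
    queue: List[Item] = field(default_factory=list)
    tokens_left: int = 100_000
    presence: str = "available"
    focus_deadline: Optional[float] = None
    processed: int = 0
    audit: List[str] = field(default_factory=list)


def run_cycle(p: Persona, handlers: Dict[str, Callable[[Item], str]],
              now: float = 0.0) -> str:
    # PERCEIVE: check the token budget first, then pick the most urgent item.
    if p.tokens_left <= 0 or not p.queue:
        p.presence = "available"
        return "idle"
    item = max(p.queue, key=lambda i: i.priority)
    p.queue.remove(item)

    # REASON: drop, queue for approval, or process.
    if item.kind not in handlers:
        p.audit.append(f"dropped {item.kind}")
        return "dropped"
    if item.confidence <= CONFIDENCE_THRESHOLD:
        p.audit.append(f"queued {item.kind} for approval")
        return "needs_approval"

    # PLAN: take focus, set the timeout, mark the persona busy.
    p.presence = "busy"
    p.focus_deadline = now + FOCUS_TIMEOUT_S

    # ACT: dispatch to the matching workflow handler.
    result = handlers[item.kind](item)

    # REFLECT: update counters, log, release focus, rest.
    p.processed += 1
    p.audit.append(f"processed {item.kind}: {result}")
    p.focus_deadline = None
    p.presence = "resting"
    return "processed"
```

A caller would invoke `run_cycle` once per interval; the context-switch and rest delays are left out to keep the sketch short.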

Cycle Interval: 10s

Max Concurrent Personas: 150

Focus Timeout: 30min

Workflow Types

2 Dream & Sleep Engine

Like the human brain, Luca personas consolidate memories during off-hours. The dreaming engine processes the day's activities into structured summaries that persist as long-term memory.

flowchart LR
    subgraph DAYTIME["Working Hours"]
        ACT1["Emails answered"] --> LOG["Day Activities Log\n(30-day TTL)"]
        ACT2["Chats handled"] --> LOG
        ACT3["Tasks completed"] --> LOG
        ACT4["Tools executed"] --> LOG
        ACT5["Calls taken"] --> LOG
    end

    subgraph NIGHT["Off-Hours — Dreaming"]
        LOG --> DREAM["Dream Engine\n─────────\nFetch last 24h activities\nSend to LLM\nGenerate structured summary"]
        DREAM --> DIARY["Brain Diary\n─────────\n• Highlights\n• Decisions made\n• Risks identified\n• Follow-ups needed"]
    end

    subgraph LONGTERM["Long-Term Memory"]
        DIARY --> ROLLUP["30-Day Rollup\n─────────\nMonthly trend archive\nJSONL format"]
        DIARY --> INJECT["Chat Injection\n─────────\nLast 14 days\nMax 2,400 chars\nEvery conversation"]
    end

    style ACT1 fill:#052e16,stroke:#10b981,color:#e5e7eb
    style ACT2 fill:#052e16,stroke:#10b981,color:#e5e7eb
    style ACT3 fill:#052e16,stroke:#10b981,color:#e5e7eb
    style ACT4 fill:#052e16,stroke:#10b981,color:#e5e7eb
    style ACT5 fill:#052e16,stroke:#10b981,color:#e5e7eb
    style LOG fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style DREAM fill:#1e1b4b,stroke:#a855f7,color:#e5e7eb
    style DIARY fill:#422006,stroke:#f59e0b,color:#e5e7eb
    style ROLLUP fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style INJECT fill:#1e3a5f,stroke:#3b82f6,color:#e5e7eb

🧠

Why this matters for investors: The dreaming engine creates genuine persistent memory — not just conversation history. A persona that helped a client three weeks ago remembers the key decisions, risks, and follow-ups. This is a moat: the longer a persona runs, the more valuable it becomes. Switching costs increase with every dream cycle.

Three-Tier Memory

Raw activities (30-day TTL) → Daily dream summaries (permanent) → Monthly rollups (trend archive). Each tier is progressively compressed.

Structured Dreams

Not freeform text. Each dream produces JSON: highlights, decisions, risks, followUps. Machine-readable, searchable, embeddable for RAG retrieval.
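To make the structured-dream idea concrete, here is a small sketch of what validating one dream record could look like. The four keys (`highlights`, `decisions`, `risks`, `followUps`) come from the text; the `date` field, the `DreamSummary` class, and the `parse_dream` helper are hypothetical names introduced for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

REQUIRED_KEYS = ("highlights", "decisions", "risks", "followUps")


@dataclass
class DreamSummary:
    date: str                                   # assumed ISO date of the dream
    highlights: List[str] = field(default_factory=list)
    decisions: List[str] = field(default_factory=list)
    risks: List[str] = field(default_factory=list)
    followUps: List[str] = field(default_factory=list)


def parse_dream(raw: str, date: str) -> DreamSummary:
    """Parse LLM output and reject it if any required key is missing."""
    data = json.loads(raw)
    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        raise ValueError(f"dream JSON missing keys: {missing}")
    return DreamSummary(date=date, **{k: list(data[k]) for k in REQUIRED_KEYS})


raw = ('{"highlights": ["Closed renewal"], "decisions": [], '
       '"risks": ["Late invoice"], "followUps": ["Call Friday"]}')
dream = parse_dream(raw, "2025-01-15")
stored = json.dumps(asdict(dream))              # machine-readable form for storage
```

Rejecting incomplete records at parse time is one way to keep the diary searchable; the real engine may handle malformed output differently.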

Context Injection

The last 14 days of dream summaries are injected into every conversation (max 2,400 chars). The persona always has context without expensive full-history retrieval.
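The injection rule (14-day window, 2,400-character cap) can be sketched as below. The function name `build_context`, the newest-first ordering, and the choice to drop whole entries rather than cut one mid-sentence are assumptions; the source only states the window and the cap.

```python
from datetime import date, timedelta
from typing import List, Tuple

WINDOW_DAYS = 14    # from the text
MAX_CHARS = 2400    # from the text


def build_context(summaries: List[Tuple[date, str]], today: date) -> str:
    """Join the most recent dream summaries without exceeding MAX_CHARS."""
    cutoff = today - timedelta(days=WINDOW_DAYS)
    recent = sorted(
        (s for s in summaries if cutoff < s[0] <= today),
        key=lambda s: s[0],
        reverse=True,                       # assumed: newest entries win
    )
    parts: List[str] = []
    used = 0
    for day, text in recent:
        line = f"[{day.isoformat()}] {text}"
        extra = len(line) + (1 if parts else 0)   # +1 for the joining newline
        if used + extra > MAX_CHARS:
            break                           # assumed: stop at the first entry that does not fit
        parts.append(line)
        used += extra
    return "\n".join(parts)
```

Because the cap is enforced on the joined string, the prompt overhead stays bounded regardless of how long a persona has been running.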

3 Attention & Focus Management

Personas can't do everything at once. The attention engine manages focus, handles interrupts, and enforces context-switch costs — just like a human brain.

stateDiagram-v2
    [*] --> Idle
    Idle --> Focused : acquireFocus()
    Focused --> Switching : releaseFocus()
    Focused --> Interrupted : urgent interrupt\n(priority > threshold)
    Interrupted --> Focused : resume original task
    Interrupted --> Switching : handle interrupt first
    Switching --> Resting : 3s context switch cost
    Resting --> Idle : 10s recovery period
    Idle --> [*] : deactivate

    note right of Focused
      30-min timeout prevents
      infinite processing loops.
      Presence = "busy"
    end note

    note right of Interrupted
      Only items with priority
      higher than current task
      can interrupt. Humans
      always interrupt.
    end note

    note right of Resting
      Simulates cognitive
      recovery. Prevents
      thrashing between tasks.
    end note
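The state diagram above can be approximated with a small state machine. This is a sketch under assumptions: the class and method names are hypothetical Python renderings of `acquireFocus()` and `releaseFocus()`, priorities are plain integers, and time is passed in explicitly so the behavior is deterministic.

```python
from enum import Enum

FOCUS_TIMEOUT_S = 30 * 60   # 30-minute focus timeout
SWITCH_COST_S = 3           # context switch cost
REST_S = 10                 # recovery period


class State(Enum):
    IDLE = "idle"
    FOCUSED = "focused"
    INTERRUPTED = "interrupted"
    SWITCHING = "switching"
    RESTING = "resting"


class Attention:
    def __init__(self) -> None:
        self.state = State.IDLE
        self.current_priority = 0
        self.focus_started = 0.0
        self.released_at = 0.0

    def acquire_focus(self, priority: int, now: float) -> bool:
        if self.state is not State.IDLE:
            return False                    # focus is an exclusive lock
        self.state = State.FOCUSED
        self.current_priority = priority
        self.focus_started = now
        return True

    def interrupt(self, priority: int, from_human: bool = False) -> bool:
        # Only higher-priority items interrupt; humans always do.
        if self.state is State.FOCUSED and (from_human or priority > self.current_priority):
            self.state = State.INTERRUPTED
            return True
        return False

    def resume(self) -> None:
        if self.state is State.INTERRUPTED:
            self.state = State.FOCUSED

    def timed_out(self, now: float) -> bool:
        return self.state is State.FOCUSED and now - self.focus_started >= FOCUS_TIMEOUT_S

    def release_focus(self, now: float) -> None:
        if self.state not in (State.FOCUSED, State.INTERRUPTED):
            raise RuntimeError("no focus held")
        self.state = State.SWITCHING
        self.released_at = now

    def tick(self, now: float) -> State:
        # Advance Switching -> Resting -> Idle as the delays elapse.
        elapsed = now - self.released_at
        if self.state is State.SWITCHING and elapsed >= SWITCH_COST_S:
            self.state = State.RESTING
        if self.state is State.RESTING and elapsed >= SWITCH_COST_S + REST_S:
            self.state = State.IDLE
        return self.state
```
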

flowchart LR
    subgraph PRIORITY["Interrupt Priority Ladder"]
        direction TB
        P1["CRITICAL ━━━━━━ Always interrupts"]
        P2["URGENT ━━━━━━━ Interrupts normal+"]
        P3["HIGH ━━━━━━━━━ Interrupts low+"]
        P4["NORMAL ━━━━━━━ Default level"]
        P5["LOW ━━━━━━━━━━ Waits for idle"]
        P6["DEFERRED ━━━━━ Background only"]
    end

    subgraph BUDGET["Token Budget Guardrails"]
        direction TB
        B1["Hourly: 100K tokens"]
        B2["Daily: 1M tokens"]
        B3["Auto-reset on hour/day boundary"]
        B4["Urgent items bypass hourly limit"]
    end

    style P1 fill:#450a0a,stroke:#ef4444,color:#fca5a5
    style P2 fill:#431407,stroke:#f97316,color:#fdba74
    style P3 fill:#422006,stroke:#f59e0b,color:#fbbf24
    style P4 fill:#1a1a2e,stroke:#6366f1,color:#a5b4fc
    style P5 fill:#0f0f1a,stroke:#374151,color:#9ca3af
    style P6 fill:#0f0f1a,stroke:#374151,color:#6b7280
    style B1 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style B2 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style B3 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style B4 fill:#422006,stroke:#f59e0b,color:#fbbf24
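The token budget guardrails in the diagram can be sketched as a counter that resets on hour and day boundaries. The `TokenBudget` class is hypothetical, and the reading that urgent items skip only the hourly check while the daily cap still applies is an assumption drawn from the wording "bypass hourly limit".

```python
from dataclasses import dataclass

HOURLY_LIMIT = 100_000     # from the diagram
DAILY_LIMIT = 1_000_000    # from the diagram


@dataclass
class TokenBudget:
    hour_key: int = -1
    day_key: int = -1
    hour_used: int = 0
    day_used: int = 0

    def _roll(self, now_s: float) -> None:
        # Auto-reset when the hour or day boundary is crossed.
        hour, day = int(now_s // 3600), int(now_s // 86400)
        if hour != self.hour_key:
            self.hour_key, self.hour_used = hour, 0
        if day != self.day_key:
            self.day_key, self.day_used = day, 0

    def try_spend(self, tokens: int, now_s: float, urgent: bool = False) -> bool:
        self._roll(now_s)
        if self.day_used + tokens > DAILY_LIMIT:
            return False                       # assumed: daily cap always applies
        if not urgent and self.hour_used + tokens > HOURLY_LIMIT:
            return False                       # urgent items skip this check
        self.hour_used += tokens
        self.day_used += tokens
        return True
```
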

⏱

Why this matters: Without attention management, AI agents either waste resources processing everything or miss critical items. The attention engine keeps personas responsive to urgent matters (a VIP client calling) without abandoning routine work (processing an email backlog). The 3-second switch cost discourages thrashing, the same problem behind the often-cited estimate that people need about 23 minutes to refocus after an interruption.

4 Autonomy & Decision Engine

Every outbound action passes through an autonomy check. Admins control how independent each persona is — from fully supervised to fully autonomous.

flowchart TD
    MSG["Persona wants to send\nmessage / execute action"] --> CH{"Channel\nallowed?"}
    CH -->|No| BLOCK["BLOCKED\nChannel disabled\nfor this persona"]
    CH -->|Yes| RL{"Rate limit\nOK?"}
    RL -->|Exceeded| THROTTLE["THROTTLED\n50/hour limit\nQueued for later"]
    RL -->|OK| FILTER{"Content\nfilter?"}
    FILTER -->|PII detected| SCRUB["EgressGuard\nScrubs sensitive data\nbefore sending"]
    FILTER -->|Clean| AUTO{"Autonomy\nlevel?"}
    SCRUB --> AUTO

    AUTO -->|Non-autonomous| QUEUE["APPROVAL QUEUE\nHuman must approve\n72h TTL"]
    AUTO -->|Semi-autonomous| CONF{"Confidence\n> 0.75?"}
    CONF -->|No| QUEUE
    CONF -->|Yes| SEND
    AUTO -->|Full autonomous| SEND["SENT\nAction executed\nAudit logged"]

    QUEUE --> HUMAN{"Human\nreview"}
    HUMAN -->|Approve| SEND
    HUMAN -->|Reject| REJECT["REJECTED\nPersona notified"]

    style MSG fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style BLOCK fill:#450a0a,stroke:#ef4444,color:#fca5a5
    style THROTTLE fill:#431407,stroke:#f97316,color:#fdba74
    style SCRUB fill:#422006,stroke:#f59e0b,color:#fbbf24
    style QUEUE fill:#1e1b4b,stroke:#a855f7,color:#e5e7eb
    style SEND fill:#052e16,stroke:#10b981,color:#e5e7eb
    style REJECT fill:#450a0a,stroke:#ef4444,color:#fca5a5
    style CH fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style RL fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style FILTER fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style AUTO fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style CONF fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style HUMAN fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb

Non-Autonomous

Every outbound message requires human approval. Best for new deployments or sensitive domains (legal, finance, healthcare).

Semi-Autonomous

Persona auto-sends when confidence exceeds the 0.75 threshold. Below that, the message is queued for human review. Best for most business use cases.

Full Autonomous

Persona operates independently with a full audit trail. Rate limits and EgressGuard DLP still apply. Best for high-volume, well-trained personas.
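The three autonomy levels, together with the channel and rate-limit gates from the flowchart, reduce to a short decision function. The sketch below assumes hypothetical names (`Autonomy`, `decide`) and leaves out the content-filter step; the 0.75 threshold and 50/hour limit are taken from the diagram.

```python
from enum import Enum

CONFIDENCE_THRESHOLD = 0.75   # from the flowchart
RATE_LIMIT_PER_HOUR = 50      # from the flowchart


class Autonomy(Enum):
    NON = "non-autonomous"
    SEMI = "semi-autonomous"
    FULL = "full-autonomous"


def decide(level: Autonomy, confidence: float,
           channel_allowed: bool, sent_this_hour: int) -> str:
    """Return one of: blocked, throttled, send, approval_queue."""
    if not channel_allowed:
        return "blocked"
    if sent_this_hour >= RATE_LIMIT_PER_HOUR:
        return "throttled"
    if level is Autonomy.FULL:
        return "send"
    if level is Autonomy.SEMI and confidence > CONFIDENCE_THRESHOLD:
        return "send"
    return "approval_queue"   # non-autonomous, or semi with low confidence
```

Note that the comparison is strict, matching the "> 0.75" label: a confidence of exactly 0.75 goes to the approval queue in this sketch.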

EgressGuard DLP

All outbound content is scanned for SSNs, credit card numbers, API keys, and other sensitive data, which are scrubbed automatically before the content reaches any channel or LLM.
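As a rough illustration of this kind of scrubbing, the sketch below redacts three patterns with regular expressions. It is not EgressGuard's actual rule set: the patterns, the `sk-` API key format, the redaction labels, and the use of a Luhn checksum to reduce false positives on card numbers are all assumptions chosen for the example.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")        # 13 to 19 digits
API_KEY_RE = re.compile(r"\bsk-[A-Za-z0-9]{20,}\b")      # illustrative key format


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def _redact_card(match: "re.Match[str]") -> str:
    digits = re.sub(r"\D", "", match.group())
    return "[REDACTED-CARD]" if luhn_ok(digits) else match.group()


def scrub(text: str) -> str:
    text = SSN_RE.sub("[REDACTED-SSN]", text)
    text = CARD_RE.sub(_redact_card, text)
    return API_KEY_RE.sub("[REDACTED-KEY]", text)
```

A production DLP layer would need many more detectors and locale-specific formats; the point here is only the scan-then-replace shape of the step.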

🔒

Enterprise trust layer: This graduated autonomy model is critical for enterprise adoption. CIOs need to know AI agents can't go rogue. The approval queue, rate limits, and EgressGuard provide the safety guarantees that make autonomous AI deployable in regulated industries.

5 Workflow Dispatch & Self-Service

Each persona runs multiple parallel workflows on independent schedules — checking email, responding to chats, executing tasks, processing RAG documents, and managing their own calendar.

flowchart TD
    SCHEDULER["Workflow Scheduler\n(per-persona intervals)"] --> E["Email Workflow\nevery 15 min"]
    SCHEDULER --> C["Chat Workflow\nevery 5 min"]
    SCHEDULER --> T["Task Workflow\nevery 10 min"]
    SCHEDULER --> RAG["RAG Workflow\nevery 30 min"]
    SCHEDULER --> CAL["Calendar Check\nevery 15 min"]
    SCHEDULER --> AVAIL["Availability\nevery 5 min"]

    E --> |"Check inbox"| E1["Fetch unread emails"]
    E1 --> E2["Classify: reply / forward / archive"]
    E2 --> E3["Generate AI response"]
    E3 --> E4["Autonomy check → Send"]

    C --> |"Check channels"| C1["Fetch unread messages"]
    C1 --> C2["Load conversation context"]
    C2 --> C3["Generate AI reply"]
    C3 --> C4["Autonomy check → Send"]

    T --> |"Check queue"| T1["Fetch pending tasks"]
    T1 --> T2["Evaluate tool requirements"]
    T2 --> TOOL["Tool Executor\n─────────\nRBAC check\nCategory ACL\nRate limit\nExecute via Gatekeeper"]

    RAG --> R1["Scan for new documents"]
    R1 --> R2["Embed via RAG store"]
    R2 --> R3["Update knowledge base"]

    style SCHEDULER fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style E fill:#052e16,stroke:#10b981,color:#e5e7eb
    style C fill:#052e16,stroke:#10b981,color:#e5e7eb
    style T fill:#052e16,stroke:#10b981,color:#e5e7eb
    style RAG fill:#052e16,stroke:#10b981,color:#e5e7eb
    style CAL fill:#052e16,stroke:#10b981,color:#e5e7eb
    style AVAIL fill:#052e16,stroke:#10b981,color:#e5e7eb
    style TOOL fill:#422006,stroke:#f59e0b,color:#e5e7eb
    style E1 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style E2 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style E3 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style E4 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style C1 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style C2 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style C3 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style C4 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style T1 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style T2 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style R1 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style R2 fill:#1a1a2e,stroke:#374151,color:#9ca3af
    style R3 fill:#1a1a2e,stroke:#374151,color:#9ca3af
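The per-workflow intervals in the diagram above suggest a simple "next due time" scheduler. The sketch below is one way to express that; `WorkflowScheduler` and its `due` method are hypothetical names, time is measured in whole minutes for simplicity, and all workflows are assumed to be due at start.

```python
from typing import Dict, List

# Intervals in minutes, as shown in the diagram.
INTERVALS_MIN: Dict[str, int] = {
    "email": 15,
    "chat": 5,
    "task": 10,
    "rag": 30,
    "calendar": 15,
    "availability": 5,
}


class WorkflowScheduler:
    def __init__(self, intervals: Dict[str, int] = INTERVALS_MIN, start_min: int = 0) -> None:
        self.intervals = dict(intervals)
        # Assumed: every workflow is due immediately at start.
        self.next_run = {name: start_min for name in self.intervals}

    def due(self, now_min: int) -> List[str]:
        """Return workflows that are due and schedule their next run."""
        ready = sorted(n for n, t in self.next_run.items() if t <= now_min)
        for name in ready:
            self.next_run[name] = now_min + self.intervals[name]
        return ready
```
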

flowchart LR
    subgraph TOOLS["Tool Access Control (per-persona)"]
        direction TB
        T1["Email ✓"]
        T2["Chat ✓"]
        T3["Voice ✓"]
        T4["Tasks ✓"]
        T5["Notes ✓"]
        T6["RAG ✓"]
        T7["Brain ✓"]
        T8["Analytics ✓"]
        T9["Research ✓"]
        T10["Infra ✗"]
        T11["DevOps ✗"]
        T12["Code ✗"]
    end

    subgraph GATE["Gatekeeper Pipeline"]
        direction TB
        G1["1. RBAC permission check"]
        G2["2. Category ACL (persona-level)"]
        G3["3. Rate limit (60/hr per category)"]
        G4["4. Token budget check"]
        G5["5. Execute tool"]
        G6["6. Audit log"]
    end

    TOOLS --> GATE

    style T1 fill:#052e16,stroke:#10b981,color:#34d399
    style T2 fill:#052e16,stroke:#10b981,color:#34d399
    style T3 fill:#052e16,stroke:#10b981,color:#34d399
    style T4 fill:#052e16,stroke:#10b981,color:#34d399
    style T5 fill:#052e16,stroke:#10b981,color:#34d399
    style T6 fill:#052e16,stroke:#10b981,color:#34d399
    style T7 fill:#052e16,stroke:#10b981,color:#34d399
    style T8 fill:#052e16,stroke:#10b981,color:#34d399
    style T9 fill:#052e16,stroke:#10b981,color:#34d399
    style T10 fill:#450a0a,stroke:#ef4444,color:#fca5a5
    style T11 fill:#450a0a,stroke:#ef4444,color:#fca5a5
    style T12 fill:#450a0a,stroke:#ef4444,color:#fca5a5
    style G1 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style G2 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style G3 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style G4 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style G5 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style G6 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
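The six Gatekeeper steps can be read as a chain of checks that short-circuits on the first failure. The following sketch assumes hypothetical names and data shapes (`Gatekeeper`, sets of allowed categories, a per-category call counter) and omits the hourly counter reset; only the step order and the 60/hr limit come from the diagram.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

RATE_LIMIT_PER_HOUR = 60   # per category, from the diagram


@dataclass
class Gatekeeper:
    role_permissions: Set[str]          # step 1: categories the role may use (RBAC)
    persona_acl: Set[str]               # step 2: categories enabled for this persona
    tokens_left: int                    # step 4: remaining token budget
    calls_this_hour: Dict[str, int] = field(default_factory=dict)
    audit: List[str] = field(default_factory=list)

    def _deny(self, category: str, reason: str) -> str:
        self.audit.append(f"{category}: denied ({reason})")
        return f"denied:{reason}"

    def execute(self, category: str, tool: Callable[[], str], token_cost: int) -> str:
        if category not in self.role_permissions:
            return self._deny(category, "rbac")
        if category not in self.persona_acl:
            return self._deny(category, "acl")
        if self.calls_this_hour.get(category, 0) >= RATE_LIMIT_PER_HOUR:
            return self._deny(category, "rate_limit")
        if token_cost > self.tokens_left:
            return self._deny(category, "token_budget")
        result = tool()                                   # step 5: execute
        self.calls_this_hour[category] = self.calls_this_hour.get(category, 0) + 1
        self.tokens_left -= token_cost
        self.audit.append(f"{category}: ok")              # step 6: audit log
        return result
```

In this sketch denials are audited as well as successes, which is an assumption; the diagram only shows the audit step after execution.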

6 Persona Lifecycle & Scheduling

Personas follow business hours, transition between activity tiers based on workload, and scale dynamically across the cluster.

stateDiagram-v2
    [*] --> Dormant : persona created
    Dormant --> Booting : activate()
    Booting --> Active : initialization complete
    Active --> Sleeping : outside business hours
    Sleeping --> Active : business hours begin
    Active --> Paused : admin pause()
    Paused --> Active : admin resume()
    Active --> Dormant : deactivate()
    Sleeping --> Dormant : deactivate()

    state Active {
        [*] --> Available
        Available --> Busy : processing item
        Busy --> Available : item complete
        Busy --> InACall : voice call
        InACall --> Available : call ends
        Available --> Away : idle > 30min
        Away --> Available : new item arrives
    }

    state Sleeping {
        [*] --> OffHours
        OffHours --> Dreaming : dream cycle trigger
        Dreaming --> OffHours : dream complete
    }
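The top-level lifecycle in the diagram above maps naturally onto a transition table. In this sketch the state names follow the diagram, while the event labels other than `activate`, `deactivate`, `pause`, and `resume` are hypothetical names for the unlabeled triggers; the nested Active and Sleeping sub-states are left out.

```python
from typing import Dict, Tuple

# (current state, event) -> next state, following the lifecycle diagram.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("dormant", "activate"): "booting",
    ("booting", "initialized"): "active",
    ("active", "business_hours_end"): "sleeping",
    ("sleeping", "business_hours_begin"): "active",
    ("active", "pause"): "paused",
    ("paused", "resume"): "active",
    ("active", "deactivate"): "dormant",
    ("sleeping", "deactivate"): "dormant",
}


def step(state: str, event: str) -> str:
    """Apply one event; reject transitions the diagram does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state} + {event}") from None
```

Keeping the table explicit makes illegal moves, such as deactivating a paused persona directly, fail loudly instead of silently changing state.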

flowchart LR
    subgraph TIERS["Dynamic Cognitive Scheduling"]
        direction TB
        HOT["HOT Tier\n─────────\n5-second cycles\n20% of personas\nActive conversations"]
        WARM["WARM Tier\n─────────\n15-second cycles\n40% of personas\nQueued work"]
        COOL["COOL Tier\n─────────\n60-second cycles\n30% of personas\nRoutine checks"]
        COLD["COLD Tier\n─────────\n5-minute cycles\n10% of personas\nHeartbeat only"]
    end

    subgraph SWEEP["Tier Sweep (every 30s)"]
        direction TB
        S1["Re-evaluate goal urgency"]
        S2["Check queue depth"]
        S3["Apply business hours"]
        S4["Reassign tiers"]
    end

    TIERS --> SWEEP
    SWEEP --> TIERS

    style HOT fill:#450a0a,stroke:#ef4444,color:#fca5a5
    style WARM fill:#422006,stroke:#f59e0b,color:#fbbf24
    style COOL fill:#1e3a5f,stroke:#3b82f6,color:#60a5fa
    style COLD fill:#0f0f1a,stroke:#374151,color:#6b7280
    style S1 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style S2 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style S3 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style S4 fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb

⚡

Cost efficiency at scale: The 4-tier scheduling system means 150 personas don't all consume equal resources. A persona actively in a sales call runs at 5-second intervals. A persona with no pending work drops to 5-minute heartbeats. This reduces compute costs by 60-80% compared to flat-rate polling — critical for unit economics at scale.
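A back-of-the-envelope check of the scheduling savings can be done with the tier shares and intervals from the diagram. The sketch assumes cost is proportional to the number of cycles run and that every cycle costs the same, which the source does not state; the function names are hypothetical.

```python
from typing import Dict, Tuple

# tier -> (cycle interval in seconds, share of personas), from the tier diagram
TIERS: Dict[str, Tuple[int, float]] = {
    "hot": (5, 0.20),
    "warm": (15, 0.40),
    "cool": (60, 0.30),
    "cold": (300, 0.10),
}


def cycles_per_second(personas: int) -> float:
    """Total cognitive cycles per second across all tiers."""
    return sum(personas * share / interval for interval, share in TIERS.values())


def reduction_vs_flat(personas: int, flat_interval_s: float) -> float:
    """Fraction of cycles saved compared with polling every persona at one interval."""
    flat = personas / flat_interval_s
    return 1 - cycles_per_second(personas) / flat


tiered = cycles_per_second(150)            # about 10.8 cycles per second
vs_5s = reduction_vs_flat(150, 5)          # about 0.64
vs_10s = reduction_vs_flat(150, 10)        # about 0.28
```

Under these assumptions, 150 personas run roughly 10.8 cycles per second, about 64% fewer than polling everyone at the 5-second HOT rate and about 28% fewer than a flat 10-second interval. The exact saving therefore depends on the baseline and on per-cycle costs that this model ignores.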

7 Full System Architecture

How all the pieces connect — from user input to autonomous persona action.

flowchart TD
    subgraph INPUT["Inbound Channels"]
        PHONE["Phone Call\n(OpenAI Realtime)"]
        EMAIL["Email\n(Resend)"]
        WA["WhatsApp\n(OpenClaw)"]
        TG["Telegram\n(OpenClaw)"]
        CHAT["WebChat\n(Browser)"]
        SMS["SMS\n(OpenClaw)"]
    end

    subgraph CORE["AI Core"]
        ROUTER["Channel Router"] --> QUEUE["Attention Queue"]
        QUEUE --> LOOP["Cognitive Loop\nPERCEIVE → REASON\n→ PLAN → ACT → REFLECT"]
        LOOP --> LLM["Model Bridge\n(token-engine)\n─────────\nGPT-5.4 | Claude\nAzure | Local vLLM"]
        LOOP --> TOOLS["Tool Executor\n+ Gatekeeper"]
    end

    subgraph SAFETY["Safety Layer"]
        DLP["EgressGuard DLP"]
        AUTH["Autonomy Engine"]
        GUARD["Guardrails"]
    end

    subgraph MEMORY["Memory & Brain"]
        BRAIN["Brain Engine\nDream → Diary → Rollup"]
        RAG["RAG Store\n(per-persona knowledge)"]
        CTX["Context Injection\n(14-day summary)"]
    end

    subgraph DATA["Data Layer"]
        PG["PostgreSQL 17\n(CloudNativePG, 3 replicas)\n─────────\nSchema-per-domain\nai_.persona_brain_diary\nai_.persona_brain_state"]
    end

    INPUT --> ROUTER
    LLM --> SAFETY
    SAFETY --> OUTPUT["Outbound\n─────────\nReply to caller\nSend email\nWhatsApp response\nTask completion"]
    BRAIN --> CTX
    RAG --> CTX
    CTX --> LLM
    LOOP --> BRAIN
    TOOLS --> PG
    BRAIN --> PG

    style PHONE fill:#052e16,stroke:#10b981,color:#e5e7eb
    style EMAIL fill:#052e16,stroke:#10b981,color:#e5e7eb
    style WA fill:#052e16,stroke:#10b981,color:#e5e7eb
    style TG fill:#052e16,stroke:#10b981,color:#e5e7eb
    style CHAT fill:#052e16,stroke:#10b981,color:#e5e7eb
    style SMS fill:#052e16,stroke:#10b981,color:#e5e7eb
    style LOOP fill:#1e1b4b,stroke:#6366f1,color:#e5e7eb
    style LLM fill:#422006,stroke:#f59e0b,color:#e5e7eb
    style BRAIN fill:#2e1065,stroke:#a855f7,color:#e5e7eb
    style DLP fill:#450a0a,stroke:#ef4444,color:#fca5a5
    style OUTPUT fill:#052e16,stroke:#10b981,color:#e5e7eb
    style PG fill:#1e3a5f,stroke:#3b82f6,color:#e5e7eb
    style ROUTER fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style QUEUE fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style TOOLS fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style AUTH fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style GUARD fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style RAG fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb
    style CTX fill:#1a1a2e,stroke:#6366f1,color:#e5e7eb

💡

The bottom line for investors: This is not a chatbot with a logo. It's a full cognitive runtime — with perception, reasoning, planning, action, reflection, memory consolidation, attention management, autonomy controls, and dynamic resource scheduling. Every component is implemented, deployed, and running in production. The architecture is comparable to what Salesforce spent 2 years and hundreds of engineers building for Agentforce — except ours was built AI-first, not retrofitted onto a 25-year-old CRM.

See It Running Live

Call our AI sales persona. Experience the cognitive loop firsthand.

📞 +1 (888) 996-8530
AI Sales Persona — 24/7


Investor contact: gus@gusit.de | +1 (786) 442-4789