Architecture

This page describes the high-level architecture of the OpenMIA platform.

System Overview

graph TB
    subgraph Client
        A[Web Browser / Mobile App]
    end

    subgraph API Gateway
        B[FastAPI Server]
    end

    subgraph Core Services
        C[Auth Service]
        D[AI Inference Engine]
        E[Memory Manager]
    end

    subgraph Data Layer
        F[(PostgreSQL)]
        G[(Redis Cache)]
        H[(Object Storage)]
    end

    A -->|HTTPS| B
    B --> C
    B --> D
    B --> E
    D --> F
    D --> G
    E --> F
    E --> H

Component Descriptions

API Gateway

The FastAPI-based gateway handles all incoming HTTP requests, authenticates them using JSON Web Tokens (JWTs), and routes each request to the appropriate internal service.
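
The authenticate-then-route flow described above can be sketched with the standard library alone. This is a minimal illustration, not the platform's actual gateway code: the HS256 algorithm, the `SECRET` value, the `ROUTES` prefixes, and all function names are assumptions made for the example. A production gateway would normally rely on a maintained JWT library rather than hand-rolled verification.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"change-me"  # hypothetical signing key; load it from a vault in practice

# Hypothetical path prefixes mapped to the internal services in the diagram.
ROUTES = {
    "/auth": "auth-service",
    "/infer": "inference-engine",
    "/memory": "memory-manager",
}


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes = SECRET) -> str:
    """Build an HS256 JWT (used here only to produce example tokens)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes = SECRET) -> dict:
    """Return the claims if the signature and expiry are valid, else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except (ValueError, TypeError) as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp < time.time():
        raise ValueError("token expired or missing exp claim")
    return claims


def route(path: str, token: str) -> str:
    """Authenticate first, then pick the internal service by path prefix."""
    verify_hs256(token)
    for prefix, service in ROUTES.items():
        if path.startswith(prefix):
            return service
    raise LookupError(f"no route for {path}")
```

Rejecting the request before any routing decision keeps unauthenticated traffic away from the internal services entirely.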

AI Inference Engine

The inference engine is the core intelligence layer. It loads and serves transformer-based models for:

Natural Language Understanding: intent classification, entity extraction
Multimodal Processing: vision and language fusion
Memory-Augmented Generation: context-aware response generation

Memory Manager

Handles long-term and short-term memory storage:

Short-term: Redis-backed session context
Long-term: PostgreSQL with vector similarity search
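
The two memory tiers can be illustrated with in-memory stand-ins. In this sketch a dictionary with expiry timestamps plays the role of Redis, and a brute-force scan plays the role of the PostgreSQL similarity query. The class names, the 30-minute TTL, and the choice of cosine similarity are assumptions; the page does not state which metric or expiry policy the platform uses.

```python
import math
import time


class ShortTermMemory:
    """Session context with a TTL; a dict stands in for Redis here."""

    def __init__(self, ttl_seconds: float = 1800.0):
        self.ttl = ttl_seconds
        self._store = {}  # session_id -> (expires_at, list of messages)

    def get(self, session_id, now=None):
        now = time.time() if now is None else now
        expires_at, messages = self._store.get(session_id, (0.0, []))
        if expires_at <= now:
            # Expired or unknown session: drop any stale entry.
            self._store.pop(session_id, None)
            return []
        return list(messages)

    def append(self, session_id, message, now=None):
        now = time.time() if now is None else now
        messages = self.get(session_id, now)
        messages.append(message)
        # Each write refreshes the session's expiry.
        self._store[session_id] = (now + self.ttl, messages)


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class LongTermMemory:
    """Embedding store with top-k search; a list stands in for PostgreSQL."""

    def __init__(self):
        self._rows = []  # (text, embedding)

    def add(self, text, embedding):
        self._rows.append((text, list(embedding)))

    def search(self, query_embedding, k=3):
        scored = [
            (cosine_similarity(query_embedding, embedding), text)
            for text, embedding in self._rows
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]
```

A database-backed version would push the ranking into the query itself instead of scanning every row in application code.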

Deployment Topology

graph LR
    subgraph Production
        LB[Load Balancer] --> W1[Worker 1]
        LB --> W2[Worker 2]
        LB --> W3[Worker 3]
        W1 --> DB[(Database)]
        W2 --> DB
        W3 --> DB
    end

Component	Scaling Strategy
API Workers	Horizontal (Gunicorn)
Inference	Vertical (GPU-bound)
Database	Primary + Read Replicas
Cache	Redis Cluster
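
For the API workers row, Gunicorn reads its settings from a Python configuration file. The sketch below shows what such a file might look like. The file name, the `BIND_ADDRESS` variable, the timeout values, and the worker class are assumptions rather than the project's actual configuration. The worker class is included because a FastAPI application is an ASGI app and needs an ASGI-capable worker when run under Gunicorn.

```python
# gunicorn.conf.py (hypothetical file name and values)
import multiprocessing
import os

# Address the workers listen on; BIND_ADDRESS is a hypothetical variable name.
bind = os.environ.get("BIND_ADDRESS", "0.0.0.0:8000")

# Gunicorn's documentation suggests (2 x cores) + 1 as a starting point.
# WEB_CONCURRENCY, if set, overrides the computed value.
workers = int(os.environ.get("WEB_CONCURRENCY", multiprocessing.cpu_count() * 2 + 1))

# Assumed ASGI worker class, provided by the uvicorn package.
worker_class = "uvicorn.workers.UvicornWorker"

# Seconds before an unresponsive worker is killed and restarted (assumed value).
timeout = 60

# Seconds a worker gets to finish in-flight requests on restart (assumed value).
graceful_timeout = 30
```

Assuming the application object lives at a hypothetical `app.main:app`, the server would be started with `gunicorn -c gunicorn.conf.py app.main:app`. Adding more machines behind the load balancer, each running this configuration, gives the horizontal scaling listed in the table.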

Security

All external communication uses TLS 1.3
API keys are rotated every 90 days
Rate limiting: 100 requests per minute per API key
Input is sanitized on all endpoints
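
The per-key rate limit can be sketched as a sliding-window counter. This is one possible approach, not necessarily the one the platform uses; a fixed window or a token bucket would also satisfy the stated limit. The class name is hypothetical, and a multi-worker deployment would need shared state (for example in the cache layer) instead of the per-process dictionary used here.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` for each API key."""

    def __init__(self, limit: int = 100, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # api_key -> timestamps of accepted requests

    def allow(self, api_key: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[api_key]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A caller would typically translate a `False` result into an HTTP 429 response.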

Production Checklist

Before deploying to production, ensure that all required environment variables are set and that secrets are stored in a secure vault (e.g., GitHub Secrets, HashiCorp Vault).
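
The environment variable check can be automated at startup. The sketch below is one way to do it; the variable names in `REQUIRED_VARS` are hypothetical placeholders, since this page does not list the variables the platform actually requires.

```python
import os

# Hypothetical variable names; replace with the ones the deployment needs.
REQUIRED_VARS = ("DATABASE_URL", "REDIS_URL", "JWT_SECRET")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the required variables that are unset or empty, in order."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]
```

Calling `missing_env_vars()` before the server starts, and exiting with an error if the returned list is non-empty, makes a misconfigured deployment fail fast rather than at the first request.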