Skip to content

Architecture

The Data-Continuum architecture is built on a Polyglot Persistence pattern with an Event-Driven Orchestration layer.

Architecture Diagram

graph TD
    subgraph Observability
        Prometheus[Prometheus]
        Grafana[Grafana]
    end

    subgraph Data_Layer [Data Layer]
        DB[(PostgreSQL)]
        Mongo[(MongoDB)]
        Redis[(Redis)]
    end

    subgraph Workflow_Orchestration [Workflow & Orchestration]
        AirflowWeb[Airflow Webserver]
        AirflowSched[Airflow Scheduler]
    end

    subgraph Service_ML_Layer [Service & ML Layer]
        API[API Service]
        ML[ML Service]
        MLFlow[MLFlow Tracking]
        Seeder[Seeder]
        UI[React UI Dashboard]
    end

    UI --> API

    API --> DB
    API --> Mongo
    Seeder --> DB
    Seeder --> Mongo

    AirflowWeb --> DB
    AirflowSched --> DB
    AirflowSched --> Redis

    ML --> MLFlow

    Grafana --> Prometheus

System Components

1. Data Ingestion Layer (The Source)

  • Relational Storage (SQL): PostgreSQL 15+ storing transactional shipment data (IDs, Customer info, Status, timestamps).
  • Non-Relational Storage (NoSQL): MongoDB storing high-frequency vehicle telemetry (GPS coordinates, engine sensors, fuel levels).

2. Orchestration & Engineering Layer (The Brain)

  • Workflow Engine: Apache Airflow.
  • Core Responsibilities:
    • Monitor database health sensors.
    • Trigger periodic PySpark jobs for data cleaning.
    • Handle retry logic for failed extraction tasks.

3. Service & ML Layer (The Interface)

  • Unified Extraction API: FastAPI service to retrieve "Unified Shipment State".
  • ML Training Service: FastAPI endpoint to trigger training of a Predictive ETA model.
  • Experiment Tracking: MLflow integration to log model versions and accuracy metrics.

4. Observability & UI Layer (The Monitor)

  • Metrics: Prometheus scraping API and DB performance.
  • Visualization: Grafana dashboards for system health.

Infrastructure

The entire stack is containerized using Docker and Docker Compose, connected via an isolated internal bridge network for data services, with exposed gateways for the API/UI.