Pramiti Docs

Source Registry

Class-to-source-to-table mapping with industry starter kits

The source registry maps knowledge model concepts (Object Types) to their physical locations in databases. It also provides industry-specific starter kits with pre-built ontologies to accelerate onboarding.

How It Works

Source Mapping

Every Object Type in the knowledge model must map to a physical source:

Object Type: Customer
  └── Source: production_postgres
       └── Table: public.customers
            └── Columns: id, name, email, created_at, ...

This three-level mapping (class -> source -> table) allows the same business concept to exist in multiple data sources without naming conflicts. The source registry decouples business semantics from database vendor specifics.
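The lookup this mapping implies can be sketched as follows. This is a minimal illustration, not the registry's actual schema: `SourceMapping`, `REGISTRY`, `resolve`, `sources_for`, and the `analytics_snowflake` source with its table are hypothetical names introduced for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SourceMapping:
    """One class -> source -> table entry (hypothetical shape)."""
    object_type: str
    source: str
    table: str
    columns: tuple[str, ...]


# Entries are keyed by (object_type, source), so the same Object Type can
# live in several sources without its table names clashing.
REGISTRY = {
    ("Customer", "production_postgres"): SourceMapping(
        "Customer", "production_postgres", "public.customers",
        ("id", "name", "email", "created_at"),
    ),
    # Hypothetical second source, added only to show the multi-source case.
    ("Customer", "analytics_snowflake"): SourceMapping(
        "Customer", "analytics_snowflake", "ANALYTICS.DIM_CUSTOMER",
        ("CUSTOMER_ID", "FULL_NAME", "EMAIL"),
    ),
}


def resolve(object_type: str, source: str) -> SourceMapping:
    """Return the physical table for an Object Type in a given source."""
    try:
        return REGISTRY[(object_type, source)]
    except KeyError:
        raise LookupError(f"no mapping for {object_type!r} in {source!r}") from None


def sources_for(object_type: str) -> list[str]:
    """List every source that holds the given Object Type."""
    return sorted(src for (ot, src) in REGISTRY if ot == object_type)
```

Keying on the (Object Type, source) pair rather than on the table name is what keeps vendor-specific naming out of the business vocabulary in this sketch.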

Starter Kit Loader (starter_kit_loader.py)

The StarterKitLoader class provides pre-built industry ontologies:

workspace_id = "workspace-1"  # assumed to match the graph URI below; the value is illustrative
loader = StarterKitLoader(vertical="saas", include_base=True)
loader.load_into_graph(graph_uri="workspace-1", oxigraph_endpoint="http://localhost:7878")
loader.load_verified_queries_to_ks(workspace_id=workspace_id)

Available verticals:

  • SaaS — 150 classes covering customers, subscriptions, revenue, billing, support
  • Retail — 150 classes covering products, orders, inventory, customers, promotions
  • Healthcare — 150 classes covering patients, encounters, claims, providers

Each starter kit includes:

  • OWL ontology (classes, properties, domain definitions)
  • Verified queries (curated SQL patterns)
  • Source mapping templates (customizable to your schema)

Starter Kit Matcher (starter_kit_matcher.py)

The matcher automatically compares your database schema against starter kit patterns:

  1. Scans your database tables and columns
  2. Compares against starter kit patterns using name similarity and structure matching
  3. Proposes Object Type mappings with confidence scores
  4. A human data steward reviews and approves or rejects each proposed mapping
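The steps above can be sketched with the standard library. This is an illustration under stated assumptions, not the logic in starter_kit_matcher.py: the `PATTERNS` table, the equal weighting of name similarity and column overlap, the 0.5 threshold, and the `pending_review` status are all hypothetical.

```python
from __future__ import annotations

from difflib import SequenceMatcher

# Hypothetical starter kit patterns: Object Type -> (canonical table, columns).
PATTERNS = {
    "Customer": ("customers", {"id", "name", "email", "created_at"}),
    "Subscription": ("subscriptions", {"id", "customer_id", "plan", "status"}),
}


def score(table: str, columns: set[str], pattern_table: str, pattern_cols: set[str]) -> float:
    """Blend table-name similarity with column overlap (equal weights assumed)."""
    name_sim = SequenceMatcher(None, table.lower(), pattern_table).ratio()
    union = columns | pattern_cols
    overlap = len(columns & pattern_cols) / len(union) if union else 0.0
    return round(0.5 * name_sim + 0.5 * overlap, 3)


def propose(schema: dict[str, set[str]], threshold: float = 0.5) -> list[dict]:
    """Propose one Object Type per scanned table, leaving approval to a steward."""
    proposals = []
    for table, columns in schema.items():
        best_type, best = max(
            ((ot, score(table, columns, pt, pc)) for ot, (pt, pc) in PATTERNS.items()),
            key=lambda pair: pair[1],
        )
        if best >= threshold:
            proposals.append({
                "table": table,
                "object_type": best_type,
                "confidence": best,
                "status": "pending_review",  # a steward approves or rejects later
            })
    return proposals


scanned = {
    "customers": {"id", "name", "email", "created_at"},
    "audit_log": {"ts", "payload"},
}
proposals = propose(scanned)
```

In this sketch the `audit_log` table falls below the threshold and yields no proposal, while `customers` is proposed as `Customer` and waits for steward review.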

Source Models (source_models.py)

SQLAlchemy models for the source registry in PostgreSQL:

  • DataSource — Registered data sources (connection details, type, status)
  • SourceMapping — Class-to-table mappings
  • ColumnMapping — Property-to-column mappings with type annotations

Architecture

The source registry sits between the knowledge model and the database connectors:

Knowledge Model (OWL)
        ↓
Source Registry (PostgreSQL)
        ↓
Database Connectors (PostgreSQL, Snowflake, BigQuery, Oracle, Athena)

The SQL validator uses the source registry to verify that generated SQL references valid tables and columns. The NLQ engine uses it to determine which data source to query.
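Both lookups can be illustrated with a small in-memory stand-in for the registry. The sketch assumes the table and column references have already been extracted from the generated SQL; `REGISTRY`, `OBJECT_TYPE_SOURCE`, `pick_source`, and `validate_references` are hypothetical names, not the validator's or the NLQ engine's real interface.

```python
from __future__ import annotations

# Hypothetical registry snapshot: source -> table -> known columns.
REGISTRY = {
    "production_postgres": {
        "public.customers": {"id", "name", "email", "created_at"},
    },
}

# Hypothetical routing table: which source holds each Object Type.
OBJECT_TYPE_SOURCE = {"Customer": "production_postgres"}


def pick_source(object_type: str) -> str:
    """Choose the data source to query for an Object Type."""
    return OBJECT_TYPE_SOURCE[object_type]


def validate_references(source: str, refs: list[tuple[str, str]]) -> list[str]:
    """Check (table, column) pairs against the registry.

    Returns a list of problems; an empty list means every reference is known.
    """
    tables = REGISTRY.get(source, {})
    problems = []
    for table, column in refs:
        if table not in tables:
            problems.append(f"unknown table: {table}")
        elif column not in tables[table]:
            problems.append(f"unknown column: {table}.{column}")
    return problems
```

For example, `validate_references("production_postgres", [("public.customers", "phone")])` reports the `phone` column as unknown, which is the kind of check that would stop a query from reaching the database.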

Configuration

Data sources are registered via the REST API:

POST /api/v1/sources
{
  "name": "production_postgres",
  "type": "postgresql",
  "connection_string": "postgresql://user:pass@host:5432/db",
  "workspace_id": "ws-1"
}
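The same request can be built from Python with the standard library. This is a sketch: the `BASE_URL` host and the `build_register_request` helper are assumptions for illustration, and any authentication headers your deployment requires are not covered here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: replace with your deployment's host


def build_register_request(name, source_type, connection_string, workspace_id):
    """Build the POST /api/v1/sources request without sending it."""
    payload = {
        "name": name,
        "type": source_type,
        "connection_string": connection_string,
        "workspace_id": workspace_id,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/sources",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_register_request(
    "production_postgres",
    "postgresql",
    "postgresql://user:pass@host:5432/db",
    "ws-1",
)

# Sending is left commented out so the sketch runs without a live server:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode())
```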

Technical Details

  • Source mappings are cached in memory with a TTL for performance
  • The starter kit patterns in starter_kit_patterns.py define canonical column patterns for each industry vertical
  • Trust tiers (trust_tier_loader.py) assign confidence levels to source mappings: steward_confirmed (highest), auto_resolved (lower), unverified (lowest)
  • Auto-resolved mappings are read-only (SI-1 safety invariant); writes require steward confirmation
  • Join paths (join_path_loader.py) store validated foreign key relationships between tables
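The trust tier ordering and the SI-1 rule can be sketched as an ordered enum with a write guard. This is one possible reading of the bullets above, not the contents of trust_tier_loader.py: the numeric values, `can_write`, and `most_trusted` are hypothetical.

```python
from __future__ import annotations

from enum import IntEnum


class TrustTier(IntEnum):
    # Ordering follows the list above; the numeric values are an assumption.
    UNVERIFIED = 0
    AUTO_RESOLVED = 1
    STEWARD_CONFIRMED = 2


def can_write(tier: TrustTier) -> bool:
    """One reading of SI-1: only steward-confirmed mappings permit writes."""
    return tier == TrustTier.STEWARD_CONFIRMED


def most_trusted(mappings: list[dict]) -> dict:
    """Pick the candidate mapping with the highest trust tier."""
    return max(mappings, key=lambda m: m["tier"])


candidates = [
    {"table": "public.customers_old", "tier": TrustTier.AUTO_RESOLVED},
    {"table": "public.customers", "tier": TrustTier.STEWARD_CONFIRMED},
]
chosen = most_trusted(candidates)
```

Using an `IntEnum` keeps the tiers comparable, so choosing between competing mappings reduces to a `max` over the tier.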
