Merger Reconciler

pramiti-merger-reconciler matches and reconciles data models from different organizations during M&A integration. It identifies overlapping concepts, classifies conflicts, and suggests resolutions — with optional LLM-powered semantic similarity.

Install

pip install pramiti-merger-reconciler

Quick Start

from pramiti_merger_reconciler import ClassSpec, match_classes, classify, suggest
from pramiti_merger_reconciler.models import ClassConflict
 
# Define classes from two organizations
classes_a = [
    ClassSpec(class_name="Revenue", definition="Net revenue after discounts"),
    ClassSpec(class_name="Customer", definition="Active paying customer"),
]
classes_b = [
    ClassSpec(class_name="Revenue", definition="Gross revenue before discounts"),
    ClassSpec(class_name="Client", definition="Any customer with an account"),
]
 
# Match classes across ontologies
matches = match_classes(classes_a, classes_b)
 
# Classify and suggest resolutions
for m in matches:
    conflict_type = classify(m)
    proposal = suggest(ClassConflict(m, conflict_type))
    print(f"{m.class_a.class_name} ↔ {m.class_b.class_name}: {conflict_type}")
    print(f"  Suggestion: {proposal}")

The 4-Step Matching Algorithm

The match_classes() function uses a 4-step algorithm:

Exact name match (score=1.0) — Classes with identical names
Normalized name match (score=0.9) — Case-insensitive with suffix stripping (e.g., "CustomerData" matches "customer")
Semantic LLM similarity (score=0.5-0.85, threshold=0.7) — Optional LLM-powered semantic comparison, capped at max_similarity_pairs
Unmatched residuals — Classes that have no match in the other ontology

# Without LLM (fast, deterministic)
matches = match_classes(classes_a, classes_b)
 
# With LLM (more accurate for semantic similarity)
from my_adapter import MyLLMAdapter  # placeholder: your own ILLMAdapter implementation
matches = match_classes(classes_a, classes_b, llm=MyLLMAdapter(), max_similarity_pairs=200)
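To make the order of the four steps concrete, here is a minimal sketch of such a cascade operating on plain class names. It is an illustration under stated assumptions, not the library's implementation: sketch_match, SketchMatch, and normalize are hypothetical names, the suffix list is the one given under Technical Details, and because this README does not say how raw similarity maps into the 0.5-0.85 score band, the sketch simply records the raw value.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Suffix list as documented under Technical Details.
SUFFIXES = ("data", "record", "table", "entity", "type", "class", "model", "info", "detail")


def normalize(name: str) -> str:
    """Lowercase the name and strip one trailing suffix, if present."""
    lowered = name.lower()
    for suffix in SUFFIXES:
        # Keep at least one character so a name like "Data" is not emptied.
        if lowered.endswith(suffix) and len(lowered) > len(suffix):
            return lowered[: -len(suffix)]
    return lowered


@dataclass
class SketchMatch:  # hypothetical stand-in, not the library's MatchResult
    name_a: Optional[str]
    name_b: Optional[str]
    match_type: str
    score: float


def sketch_match(
    names_a: list[str],
    names_b: list[str],
    similarity: Optional[Callable[[str, str], float]] = None,
    threshold: float = 0.7,
    max_pairs: int = 200,
) -> list[SketchMatch]:
    results: list[SketchMatch] = []
    left_a, left_b = list(names_a), list(names_b)

    def pair_off(key: Callable[[str], str], match_type: str, score: float) -> None:
        # Pair remaining names whose keys agree and remove them from both pools.
        index_b: dict[str, str] = {}
        for b in left_b:
            index_b.setdefault(key(b), b)
        for a in list(left_a):
            b = index_b.pop(key(a), None)
            if b is not None:
                results.append(SketchMatch(a, b, match_type, score))
                left_a.remove(a)
                left_b.remove(b)

    pair_off(lambda n: n, "exact", 1.0)       # step 1: identical names
    pair_off(normalize, "normalized", 0.9)    # step 2: normalized names

    if similarity is not None:                # step 3: optional, capped
        calls = 0
        for a in list(left_a):
            best, best_score = None, threshold
            for b in left_b:
                if calls >= max_pairs:
                    break
                calls += 1
                s = similarity(a, b)
                if s >= best_score:
                    best, best_score = b, s
            if best is not None:
                results.append(SketchMatch(a, best, "semantic", best_score))
                left_a.remove(a)
                left_b.remove(best)

    # Step 4: whatever is left on either side is an unmatched residual.
    results += [SketchMatch(a, None, "unmatched", 0.0) for a in left_a]
    results += [SketchMatch(None, b, "unmatched", 0.0) for b in left_b]
    return results
```

Running the cheap, deterministic steps first means the similarity callable only sees names that the earlier steps could not pair, which keeps the number of calls within max_pairs.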

API Reference

ClassSpec

from dataclasses import dataclass, field

@dataclass
class ClassSpec:
    class_name: str        # Class name
    definition: str        # Business definition
    properties: list = field(default_factory=list)  # Optional list of properties

MatchResult

@dataclass
class MatchResult:
    class_a: ClassSpec
    class_b: ClassSpec
    match_type: str        # "exact", "normalized", "semantic", "unmatched"
    score: float           # 0.0 to 1.0

Conflict Classification (`classify()`)

Classifies the nature of a match conflict:

identical — Same name, same definition
name_match_definition_conflict — Same name but different definitions (e.g., "Revenue" means different things)
semantic_overlap — Different names but overlapping definitions
complementary — Related but distinct concepts that can coexist
unrelated — No meaningful relationship
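The five categories can be read as a decision ladder. The sketch below is one possible reading, not the rules classify() actually applies: sketch_classify and token_overlap are hypothetical names, word overlap stands in for real semantic comparison, and the two thresholds are arbitrary values chosen for illustration.

```python
def token_overlap(text_a: str, text_b: str) -> float:
    """Jaccard overlap of lowercase word sets; a crude stand-in for semantics."""
    words_a, words_b = set(text_a.lower().split()), set(text_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def sketch_classify(
    name_a: str, def_a: str, name_b: str, def_b: str,
    high: float = 0.5,  # assumed cut-off for "semantic_overlap"
    low: float = 0.2,   # assumed cut-off for "complementary"
) -> str:
    same_name = name_a.lower() == name_b.lower()
    same_def = def_a.strip().lower() == def_b.strip().lower()
    if same_name and same_def:
        return "identical"
    if same_name:
        return "name_match_definition_conflict"
    overlap = token_overlap(def_a, def_b)
    if overlap >= high:
        return "semantic_overlap"
    if overlap >= low:
        return "complementary"
    return "unrelated"


print(sketch_classify(
    "Revenue", "Net revenue after discounts",
    "Revenue", "Gross revenue before discounts",
))  # name_match_definition_conflict
```

Word overlap is deliberately naive: on the Quick Start pair "Customer" and "Client" it finds only one shared word out of seven and reports unrelated, which is the kind of case the optional LLM comparison is meant to catch.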

Resolution Suggestions (`suggest()`)

Generates a resolution proposal for each conflict:

merge — Combine into a single definition
rename — Keep both with disambiguated names
choose_a / choose_b — Adopt one organization's definition
create_new — Create a new unified definition

ILLMAdapter (Interface)

class ILLMAdapter(ABC):
    def compute_similarity(self, text_a: str, text_b: str) -> float: ...

The NoOpLLMAdapter returns 0.0 for all comparisons, effectively disabling semantic matching.
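Any object with a compute_similarity method of this shape can be plugged in. As a sketch, the adapter below uses difflib from the standard library in place of a real LLM, which is handy for offline tests. DifflibAdapter is a hypothetical name, and ILLMAdapter is redefined locally (with the method marked abstract) so the example is self-contained; in real code you would import the interface from the package instead.

```python
from abc import ABC, abstractmethod
from difflib import SequenceMatcher


class ILLMAdapter(ABC):  # local stand-in for the interface shown above
    @abstractmethod
    def compute_similarity(self, text_a: str, text_b: str) -> float: ...


class DifflibAdapter(ILLMAdapter):  # hypothetical example adapter
    """Character-level similarity in [0.0, 1.0]; not an LLM."""

    def compute_similarity(self, text_a: str, text_b: str) -> float:
        # ratio() returns 1.0 for identical strings and 0.0 when nothing matches.
        return SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()


adapter = DifflibAdapter()
print(adapter.compute_similarity("Active paying customer", "Any customer with an account"))
```

Character-level similarity rewards shared spelling rather than shared meaning, so treat it as a test double rather than a substitute for a semantic model.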

Technical Details

Name normalization strips common suffixes: data, record, table, entity, type, class, model, info, detail
LLM call count is bounded by max_similarity_pairs to control costs
The algorithm is deterministic when no LLM is used: the same inputs always produce the same outputs
The ReportSummary model provides aggregate statistics for the reconciliation: total matches, conflicts by type, and resolution suggestions
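The fields of ReportSummary are not listed here, so the following is only a sketch of the kind of aggregation described, using collections.Counter. The summarize function and its dictionary keys are hypothetical, and it takes plain tuples rather than the library's own objects.

```python
from collections import Counter


def summarize(rows):
    """Aggregate (match_type, conflict_type, resolution) tuples into counts."""
    rows = list(rows)
    return {
        # Count every row that paired two classes.
        "total_matches": sum(1 for match_type, _, _ in rows if match_type != "unmatched"),
        "conflicts_by_type": dict(Counter(conflict for _, conflict, _ in rows)),
        "resolutions": dict(Counter(resolution for _, _, resolution in rows)),
    }


rows = [
    ("exact", "name_match_definition_conflict", "rename"),
    ("normalized", "semantic_overlap", "merge"),
    ("exact", "name_match_definition_conflict", "create_new"),
]
print(summarize(rows))
```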
