Published on November 6, 2025

by Osledy Bazo

AI Agents for Mortgage Credit Evaluation.

From static rules to auditable decisions with RAG and tool-use. How AI agents can transform mortgage credit evaluation using RAG and tool-use for auditable and transparent decisions.

Project Summary

We designed and implemented an AI agent for mortgage credit evaluation that unifies scattered documents, deterministic calculations, and risk policy into an explainable and auditable flow.

The system normalizes payslips, statements, earnest money, and appraisals, calculates PTI/DTI/LTV and residual income with a deterministic engine, and applies policy as versioned data (policy-as-data).

Through RAG and tool-use, the agent justifies each ratio with citations to documents and rules, reducing reprocessing, improving consistency among analysts, and facilitating the defense of decisions before Risk and Audit.

The current problem in mortgage evaluation

Mortgage credit evaluation doesn't fail because of the PTI/DTI/LTV formula, but rather because of how each team arrives at those figures: inconsistent documents, criteria that change between segments, calculations with different assumptions, and little traceability to explain a decision to auditors or clients. The following analysis summarizes what hurts and what a modern solution needs based on AI agents + RAG + tool-use.

Documents and data: more useful context

A typical file mixes PDFs (payslips, statements), scanned images (rent receipts), spreadsheets (RETA, income books), earnest money contracts, appraisals, and internal forms. Without an organizational layer, the analyst:

Wastes time figuring out where the data is (e.g., is the net amount on page 2 or 3 of the payslip?).

Makes errors when copying/pasting (net vs. gross, euros vs. thousands).

Doesn't know if the document is recent enough to be valid (e.g., a payslip from 5 months ago).

What we need in practice

Is an early normalization that does three simple but decisive things:

Unified currency and units (e.g., everything in EUR, "month" as the standard period).

Clear periodization (e.g.: payroll_2025_08.pdf → month=2025-08, net=2510), so that recency and trends can be automatically validated.

Provenance metadata: each number carries its source (file, page, field), so that any calculation points to the origin in one click.

{
  "month": "2025-08",
  "net": 2510,
  "currency": "EUR",
  "source": "payroll_2025_08.pdf#page=1"
}

{
  "month": "2025-08",
  "incomes": 2580,
  "overdrafts": 0,
  "source": "extracto_bancoX_2025_08.pdf#page=2"
}

With this, a searchable index (RAG-type) no longer guesses where the data is; it cites it.

Key metrics but well-defined from the start

We talk about PTI, DTI, LTV and residual, but the differences appear in the details:

PTI = stressed mortgage payment / monthly net income.

The most common error is using the payment without stress (current rate) or dividing by gross income.

DTI (post) = (other debts + stressed payment) / income.

If the policy allows the mortgage to replace rent, rent is not added to the post. Otherwise, it is.

LTV = loan amount / min(price, appraisal).

Using only the appraisal or only the price skews the risk measurement.

Residual = income − (mortgage payment + other debts).

It is compared against a minimum living standard (e.g., 900 € + 300 €/dependent).

def annuity_payment(principal, annual_rate, years):
    r = annual_rate / 12
    n = years * 12
    if r <= 0:
        return principal / n
    return principal * (r * (1 + r)**n) / ((1 + r)**n - 1)

If one team calculates PTI with stress +3 pp and another without stress, the same case can go from APPROVED to CONDITIONAL. The solution must include a deterministic (unique) engine that fixes formulas and inputs, and leaves the agent the layer of "explaining and deciding" based on policy

Policy is not a PDF: it's policy-as-data (versionable and citable)

Rules change by segment (resident/non-resident), activity type (employee/self-employed), product (fixed/variable). In a PDF, nuances ("conservative threshold", "variable income") are interpreted differently by each analyst, and it's hard to know which version was applied.

Converting it to JSON solves three things:

Clarity: pti_max=0.35, ltv.primary_residence_max=0.80… without ambiguity.

Governance: versioning (v1.2, v1.3) and traceability of when it changed.

Automation: the agent reads that policy (doesn't remember it) and applies thresholds consistently.

{
  "affordability": {
    "pti_max": 0.35,
    "dti_total_max": 0.45,
    "residual_income_min": {"base": 900, "per_dependent": 300},
    "rent_replacement_allowed": true
  },
  "ltv": {
    "primary_residence_max": 0.80,
    "take_lower_of_price_or_appraisal": true
  },
  "rate_stress": {
    "apply": true,
    "buffer_pp": 3.0,
    "min_rate_after_stress": 0.05
  }
}

With policy-as-data, changing the LTV limit or stress buffer is a matter of editing a JSON (and it's audited).

Traceability: explaining the "why" with citations, not opinions

Evaluators and internal control don't settle for "CONDITIONAL": they ask for proof. The output must state:

Which rule was applied (e.g., affordability.pti_max, rate_stress.buffer_pp).

Which documents support the numbers (payslips from June–August, earnest money, appraisal).

What changes would make the case APPROVED (e.g., "reduce principal to ≤ €149,700" or "add co-applicant").

{
  "decision": "CONDICIONADO",
  "issues": ["PTI 42.1% > 35%", "DTI 46.9% > 45%", "LTV 83.7% > 80%"],
  "reason": "La cuota estresada es alta respecto al ingreso; el endeudamiento total y el LTV superan umbrales.",
  "conditions": ["Reducir principal a ≤ €149,700", "Entrada +€8,000 o co-solicitante"],
  "citations": {
    "policy": [
      "affordability.pti_max",
      "ltv.primary_residence_max",
      "rate_stress.buffer_pp"
    ],
    "case": [
      "payroll_2025_06.json",
      "payroll_2025_07.json",
      "payroll_2025_08.json",
      "property_appraisal.json",
      "property_purchase.json"
    ]
  }
}

This changes the conversation: it's no longer about opinions, it's about demonstrating.

Quality and recency: checklist that avoids reprocessing

Operational disagreements usually come from simple things:

A payslip outside recency (policy requires 3 months and one from 5 months ago arrives).

A net amount that doesn't match the deposit in the statement (tolerance, for example, ±5%).

A declared loan that doesn't appear in movements (or vice versa).

With a programmatic checklist, the system flags before deciding: "missing August payslip", "incomplete statement", "DTI doesn't consider revolving card".

Impact and what the solution requires

Pain	Impact	What do we need?
Inconsistent PDFs/images	Reprocessing, errors	Normalization (currency, period, source) + searchable index
Disparate calculations	Inconsistent decisions	Deterministic engine for metrics (single source)
Ambiguous policy	Variable criteria	Policy-as-data (versioned and citable JSON)
No explainability	Costly audit	JSON output with `reason`, `issues`, `conditions` and citations
No checklist	NIGO, long cycle	Automatic validators for recency and coherence
Limited scalability	Bottlenecks	Agents with tool-use and RAG (automate repetitive tasks)

Agents with tools (tool-use) + RAG

Tool 1: get_case_metrics: calculates raw metrics (inputs for PTI/DTI/LTV/residual) from normalized JSONs.

Tool 2: policy_qe: reads the JSON policy and returns thresholds with citations (which section applied).

Tool 3: case_qe: retrieves passages from documents (e.g., net in payslip or appraisal value) so the agent can cite.

Agent: orchestrates these tools and delivers a structured response: decision, issues, reason, conditions, citations.

Deterministic engine (in parallel): ensures numerical consistency and serves for A/B tests with policy changes.

This distribution prevents the LLM from inventing numbers: calculates with tools, reads thresholds from policy and cites documents.

Mini-case in 3 lines

"Laura" requests €180k. With stress +3 bp, the stressed payment gives PTI 42%, post DTI 47% and LTV 84% — all above policy. The system issues CONDITIONAL with concrete actions: reduce principal to ~€149.7k or have co-applicant; and cites both the rules (PTI/DTI/LTV, stress) and the payslips and appraisal used.

Why use AI agents now?

Mortgage evaluation requires three things at once: numerical consistency, ability to cite sources, and speed to adapt to policy changes. Current agents, language models with RAG and tool use, fit because they don't replace calculations or rules: they orchestrate them. They consult policy in real time, call functions to calculate, and retrieve exact passages from documents. The result is an explainable and repeatable decision.

What has changed

Solid tool use. The model invokes functions by name and returns structured outputs. When it reports a PTI, that number comes from your calculation function.

Reliable RAG. Indexing payslips, statements, earnest money or appraisal allows retrieving and citing the fragment that supports each figure.

Policy as data. Thresholds live in versioned JSON. Changing maximum PTI, LTV or stress buffer is editing data; the agent reads them on the fly.

The agent as orchestrator

The agent doesn't guess figures. It requests thresholds from policy, calculates with functions, and retrieves evidence. Typical flow:

Extracts PTI, DTI, LTV, stress and applicable definitions from policy, with citation.

Calculates income, debts, stressed payment and LTV base with the deterministic engine.

Retrieves passages that support figures in the case documents.

Composes a JSON with decision, issues, explanation, conditions and citations.

This distribution separates responsibilities: deterministic calculation for numbers, policy as source of limits, and the model to integrate and explain.

{
  "affordability": {
    "pti_max": 0.35,
    "dti_total_max": 0.45,
    "rent_replacement_allowed": true
  },
  "ltv": {
    "primary_residence_max": 0.80,
    "take_lower_of_price_or_appraisal": true
  },
  "rate_stress": {
    "apply": true,
    "buffer_pp": 3.0,
    "min_rate_after_stress": 0.05
  }
}

Tangible benefits

Direct advantages: clarity, version-based traceability, and agility to reflect threshold changes without touching the model.

Consistency. Same input, same output. Discrepancies between spreadsheets are eliminated.

Explainability. Each ratio includes citation to policy and document, useful for audit and to communicate conditions to the client.

Fewer reprocesses. Automatic validators for recency and coherence reduce NIGO and cycle times.

Simple evolution. Adding a segment or adjusting thresholds equals extending the JSON and, if needed, adjusting the calculation with localized changes.

Orientation mini-snippet

# 1) Cálculo determinista (fuente de verdad de métricas)
def get_case_metrics(): ...

# 2) Lectura de política (RAG sobre JSON versionado)
policy_qe = VectorStoreIndex.from_documents([Document(text=open("policy.json").read())]).as_query_engine()

# 3) Consulta de documentos del caso (para citas)
case_qe = VectorStoreIndex.from_documents(load_case_docs()).as_query_engine()

# 4) El agente orquesta: política → métricas → evidencias → JSON explicable

Reference architecture

The idea is to clearly separate four responsibilities: data, rules, calculation, and decision. With this design, you can change policy without touching code, track each figure to its document, and explain why a request comes out approved, conditional, or not approved.

Overview

                         ┌─────────────────────────────┐
                         │   Policy Registry           │
                         │     (versioned JSON)        │
                         └──────────────┬──────────────┘
                                        │ read/query
                                        ▼
┌─────────────────────────────┐   ┌───────────────────────────-──┐
│   Ingestion and             │   │     Semantic Indices        │
│   Normalization             │   │  • Case (documents)          │
│   (PDF/IMG/CSV → JSON)      │   │  • Policy (thresholds)      │
│   • currency, period, source │   │                             │
└──────────────┬──────────────┘   └───────────────┬──────────────┘
               │                                  │
               ▼                                  ▼
       ┌───────────────-─┐                ┌───────────────────────┐
       │ get_case_metrics│                │ policy_qe / case_qe   │
       │ Deterministic   │                │ RAG: retrieves        │
       │ Calculation     │                │ passages and          │
       │                 │                │ thresholds with cites │
       └────────┬────────┘                └────────────┬──────────┘
                │                                      │
                └──────────────┬───────────────────────┘
                               ▼
                      ┌───────────────────┐
                      │  AI Agent         │
                      │  Orchestrates     │
                      │  tools and        │
                      │  composes output  │
                      └────────┬──────────┘
                               ▼
                 ┌─────────────────────────────┐
                 │ { decision, issues, reason, │
                 │   conditions, citations }   │
                 └─────────────┬───────────────┘
                               ▼
         Analyst Panel • CRM/Core • Audit Log

Policy lives as versioned data.

Case documents are normalized into JSON with currency, period, and source reference.

Semantic indices allow querying and retrieving passages that can be cited.

The calculation engine applies exact formulas.

The agent uses these pieces to decide and explain, not to invent numbers.

Data layer: just enough to work with confidence

Before thinking about models, we ensure that each number has context. This is achieved by adding minimal metadata in the transformation to JSON:

Period or date (for example, 2025-08).

Amount and currency.

doc_type and applicant_id.

Source reference (file, page, selector).

// payroll_2025_08.json
{
  "doc_type": "payroll",
  "month": "2025-08",
  "net": 2510,
  "currency": "EUR",
  "source_file": "payroll_2025_08.pdf",
  "page": 1
}

// property_appraisal.json
{
  "doc_type": "property_appraisal",
  "appraised_value": 225000,
  "date": "2025-08-20",
  "source_file": "tasacion_0820.pdf",
  "page": 3
}

With that foundation, any figure the system uses can be cited in detail.

Indices and retrieval

Case index: contains all normalized documents. Used to retrieve fragments that justify income, debts, price or appraisal.

Policy index: contains the current rules JSON (and previous versions, if desired). Used to retrieve thresholds and definitions.

from llama_index.core import Document, VectorStoreIndex

policy_doc = Document(text=open("policy.json").read(), metadata={"doc_type": "policy", "version": "v1"})

policy_qe = VectorStoreIndex.from_documents([policy_doc]).as_query_engine(similarity_top_k=3)

Deterministic calculation

This layer is simple and critical. It calculates the payment with annuity, PTI, DTI, LTV and residual. Applies the rate stress that the policy marks. In this way, the same case always produces the same metrics.

def annuity_payment(principal, annual_rate, years):
    r = annual_rate / 12
    n = years * 12
    if r <= 0:
        return principal / n
    return principal * (r * (1 + r)**n) / ((1 + r)**n - 1)

def get_case_metrics(case_dir) -> dict:
    # 1) Lee JSON normalizados (nóminas, préstamos, arras, tasación, solicitud)
    # 2) Deriva ingreso mensual, deudas, importe/years/rate y min(precio, tasación)
    # 3) Calcula cuota estresada según política actual
    # 4) Devuelve un diccionario de métricas
    return metrics

case_docs = [Document(text=json_to_markdown(p), metadata=meta_of(p)) for p in load_case_jsons()]

case_qe   = VectorStoreIndex.from_documents(case_docs).as_query_engine(similarity_top_k=5)

Agent tools

get_case_metrics: delivers raw metrics.

policy_qe: brings thresholds and definitions with citable fragments from the policy.

case_qe: brings passages that justify the case numbers.

from llama_index.core.tools import FunctionTool, QueryEngineTool

metrics_tool = FunctionTool.from_defaults(fn=lambda: get_case_metrics(CASE_DIR),name="get_case_metrics", description="Métricas del caso en JSON.")

policy_tool  = QueryEngineTool.from_defaults(query_engine=policy_qe, name="policy_qe", description="Umbrales y reglas desde la política.")

case_tool = QueryEngineTool.from_defaults(query_engine=case_qe, name="case_qe", description="Pasajes citables de los documentos del caso.")

Decision agent

The agent always follows the same mental sequence:

Reads thresholds in the policy.

Gets the calculated metrics.

Retrieves evidence in documents.

Delivers a structured response with decision, issues, explanation, conditions and citations.

Practical suggestion: include in the prompt the verdict mapping criterion to be consistent with Risk. For example, approved without violations; conditional with one or two violations or with clear solutions; not approved when there are several critical violations or magnitudes well above the limits.

from llama_index.agent.openai import OpenAIAgent

SYSTEM = """

Eres analista hipotecario.

Usa policy_qe para umbrales, get_case_metrics para cifras y case_qe para evidencia.

Devuelve solo JSON con:

{ "decision": "APTO|CONDICIONADO|NO_APTO",

  "issues": ["..."],

  "reason": "...",

  "conditions": ["..."],

  "citations": { "policy": ["..."], "case": ["..."] } }

"""

agent = OpenAIAgent.from_tools(

    tools=[metrics_tool, policy_tool, case_tool],

    system_prompt=SYSTEM,

    verbose=True

)

verdict = agent.chat("Evalúa el caso y devuelve el JSON solicitado.")

Output and audit: explanation comes standard

The agent's output is ready to integrate into the analyst's panel. It includes what Risk and Audit ask most frequently: which threshold was applied, which document supports the figure, and what changes would make the case approved.

To ensure format and avoid surprises, validate the JSON with a schema:

from pydantic import BaseModel

class Verdict(BaseModel):

    decision: str

    issues: list[str]

    reason: str

    conditions: list[str]

    citations: dict

validated = Verdict.model_validate_json(verdict.response)

Additionally, it records the applied policy version and the logs of invoked tools. It is the Ariadne's thread of each decision.

An end-to-end example in 20 seconds

The analyst clicks evaluate.

The agent queries the current policy: PTI 35, DTI 45, LTV 80, stress plus three points with minimum 5.

The engine calculates income, debts and stressed payment and returns PTI, DTI and LTV.

The agent retrieves passages from payrolls and appraisal that justify those figures.

The response arrives with decision, reasons, concrete conditions and precise citations.

With this scheme, the solution is predictable for Risk, comfortable for the analyst and defensible before Audit. And if tomorrow the LTV limit or the stress buffer changes, just update the policy JSON: everything else adjusts automatically.

Key metrics and rules

In practice, mortgage eligibility is decided with four pieces: the stressed payment, PTI, DTI, LTV and residual income. What matters is not just the formula, but when to apply it, with what inputs and how to justify each number. Below we explain them in an orderly way and with a mini-example that puts them in context.

Stressed payment: the starting point

Before calculating ratios, a prudent payment must be set. For this, the rate stress defined by the policy is applied —for example, adding three percentage points to the nominal rate, with a minimum floor of five percent— and the annuity formula is used.

Rate stress:

stressed_rate = max(nominal_rate + buffer_pp/100, min_rate)

Monthly payment (annuity):

payment = P * [ r * (1+r)^n ] / [ (1+r)^n − 1 ],

where P is the principal, r is the monthly rate and n is the number of payments.

This payment is what conditions the rest: if it is high relative to income, PTI and DTI rise; if the principal decreases or income increases, both ratios improve immediately.

def stressed_rate(nominal, buffer_pp, min_rate):
    return max(nominal + buffer_pp/100, min_rate)

def annuity_payment(P, annual_rate, years):
    r, n = annual_rate/12, years*12
    return P/n if r <= 0 else P * (r*(1+r)**n)/((1+r)**n - 1)

PTI: housing effort

PTI measures how much of monthly income is allocated to the stressed payment of the mortgage.

PTI = stressed_payment / net_monthly_income

A typical limit is 35%. A PTI above this indicates that the housing burden is high for the declared income level.

If PTI exceeds the threshold, there are three levers: reduce the loan amount, extend the term (with caution) or accredit more stable income, for example with a co-applicant.

DTI: total debt

DTI integrates the entire financial burden. It is calculated at two moments:

Current DTI: (other_debts + rent) / income

Post DTI: (other_debts + stressed_payment) / income

If the policy allows rent substitution (first home purchases), rent is not added in post DTI. The usual limit is 45%. This ratio captures cases where PTI seems acceptable but the client already carries consumer loans or credit cards.

LTV: prudence on property value

LTV compares the loan amount with the reference value of the property, taking the lower between purchase price and appraisal.

LTV = principal / min(price, appraisal)

For first homes the limit is usually 80%. A high LTV indicates little down payment cushion; it is typical to resolve it with more contribution or adjusting the amount.

Residual income: safety margin

What remains after paying mortgage and debts each month:

residual = income − (stressed_payment + other_debts)

It is compared against a minimum living standard set in policy —for example, 900 euros base plus a supplement per dependent—. It provides a complementary view of payment capacity to PTI and DTI.

Conditions that turn a no into a yes

Beyond the verdict, the system must propose clear paths to achieve approval. For this it is useful to calculate:

Maximum principal by PTI: allowed payment = PTI_max * income; the annuity is inverted to recover the compatible principal.

Maximum principal by DTI: allowed payment = DTI_max * income − other_debts.

Maximum principal by LTV: LTV_max * min(price, appraisal).

Necessary income to meet PTI or DTI with the current amount.

The practical recommendation is to take the minimum of those three maximum principals and express it in simple conditions: reduce amount to X, contribute Y down payment, or accredit Z more euros per month.

Simplified example

Typical employee case with rounded data:

Net monthly income: 2,510 €

Requested amount: 180,000 € for 30 years

Nominal rate: 2.8%; stress: +3 pp with minimum 5%

Other debts: 120 €/month; current rent: 900 €/month

Price: 215,000 €; appraisal: 225,000 €

Policy thresholds: PTI 35%, DTI 45%, LTV 80%

Key calculations:

Stressed rate: 5.0%

Stressed payment: 1,056 €/month

PTI: 1,056 / 2,510 = 42.1% → above 35%

Post DTI (with rent substitution): (120 + 1,056) / 2,510 = 46.9% → above 45%

LTV: 180,000 / 215,000 = 83.7% → above 80%

Advice:

Reducing principal to ≈ 149,700 € meets PTI and pulls DTI into the green zone. If the amount is maintained, an additional down payment ~8,000 € corrects LTV, but PTI would still be high; it would be necessary to add income (≈ 508 €/month) or incorporate a co-applicant.

Implementation (minimal)

The goal is to build an MVP that converts scattered documents into an explainable decision. Below you have the essential steps, with code snippets ready to paste and adapt.

Indices for policy and case

def build_indexes(case_dir: Path, policy_path: Path):
    # índice de política
    policy_text = Path(policy_path).read_text()
    policy_doc  = Document(text=policy_text, metadata={"doc_type": "policy", "version": "v1"})
    policy_index = VectorStoreIndex.from_documents([policy_doc])
    policy_qe = policy_index.as_query_engine(similarity_top_k=3)
    
    # índice del caso
    case_docs = load_case_docs(case_dir)
    case_index = VectorStoreIndex.from_documents(case_docs)
    case_qe = case_index.as_query_engine(similarity_top_k=5)
    
    return policy_qe, case_qe

Deterministic metrics engine

Centralize the formulas here. It is the single source of numerical truth.

def stressed_rate(nominal: float, buffer_pp: float, min_rate: float) -> float:
    return max(nominal + buffer_pp/100.0, min_rate)

def annuity_payment(P: float, annual_rate: float, years: int) -> float:
    r, n = annual_rate/12.0, years*12
    if n <= 0: return 0.0
    if r <= 0: return P / n
    return P * (r*(1+r)**n) / ((1+r)**n - 1)

def load_policy(policy_path: Path) -> dict:
    return json.loads(Path(policy_path).read_text())

def get_case_metrics(case_dir: Path, policy: dict) -> dict:
    # carga entradas esenciales
    req   = json.loads((case_dir/"mortgage_request.json").read_text())
    price = json.loads((case_dir/"property_purchase.json").read_text())["price"]
    appr  = json.loads((case_dir/"property_appraisal.json").read_text())["appraised_value"]
    debts = json.loads((case_dir/"prior_loans.json").read_text())["other_debt_monthly"]
    rent  = json.loads((case_dir/"rent_receipts.json").read_text())["current_rent"]
    
    # ingreso mensual: ejemplo con nóminas de 3 meses
    p6 = json.loads((case_dir/"payroll_2025_06.json").read_text())["net"]
    p7 = json.loads((case_dir/"payroll_2025_07.json").read_text())["net"]
    p8 = json.loads((case_dir/"payroll_2025_08.json").read_text())["net"]
    income = (p6 + p7 + p8) / 3
    
    # stress y cuota
    stress = policy["rate_stress"]
    rate_stressed = stressed_rate(req["apr_hint"], stress["buffer_pp"], stress["min_rate_after_stress"])
    pay_stressed  = annuity_payment(req["amount"], rate_stressed, req["years"])
    
    # base de LTV
    base_value = min(price, appr)
    
    # ratios principales
    pti = pay_stressed / income
    
    # si la política permite sustitución de alquiler no lo sumes en post
    rent_replace = policy["affordability"].get("rent_replacement_allowed", True)
    dti_post = (debts + pay_stressed) / income
    dti_current = (debts + rent) / income
    ltv = req["amount"] / base_value
    
    residual_min = policy["affordability"].get("residual_income_min", {}).get("base", 900)
    residual = income - (debts + pay_stressed)
    
    return {
        "income": income,
        "other_debt": debts,
        "rent": rent,
        "loan": req,
        "price": price,
        "appraised": appr,
        "base_value": base_value,
        "rate_stressed": rate_stressed,
        "pay_stressed": pay_stressed,
        "pti": pti,
        "dti_current": dti_current,
        "dti_post": dti_post,
        "ltv": ltv,
        "residual": residual,
        "residual_min": residual_min,
        "rent_replacement_allowed": rent_replace
    }

Agent tools

Three tools are enough: metrics, policy and case documents.

def make_metrics_tool(case_dir: Path, policy_path: Path):
    def _fn():
        pol = load_policy(policy_path)
        return get_case_metrics(case_dir, pol)
    return FunctionTool.from_defaults(
        fn=_fn, name="get_case_metrics",
        description="Métricas crudas calculadas del caso en JSON."
    )

def make_policy_tool(policy_qe):
    return QueryEngineTool.from_defaults(
        query_engine=policy_qe, name="policy_qe",
        description="Consulta de umbrales y definiciones desde la política vigente."
    )

def make_case_tool(case_qe):
    return QueryEngineTool.from_defaults(
        query_engine=case_qe, name="case_qe",
        description="Recupera pasajes de documentos del caso para citarlos."
    )

Decision agent with structured output

Include in the system the output format and verdict criterion. Avoid free responses that are difficult to integrate.

SYSTEM = """

Eres analista hipotecario.

1) Usa policy_qe para obtener umbrales (PTI, DTI, LTV, stress, residual) y cita la sección usada.

2) Usa get_case_metrics para cifras calculadas.

3) Usa case_qe para traer pasajes que justifiquen números de ingreso, precio y tasación.

4) Devuelve JSON con:

{ "decision": "APTO|CONDICIONADO|NO_APTO",

  "issues": ["..."],

  "reason": "...",

  "conditions": ["..."],

  "citations": { "policy": ["..."], "case": ["..."] } }

Reglas de dictamen:

- APTO: ninguna violación

- CONDICIONADO: 1–2 violaciones o ajustes viables claros

- NO_APTO: 3 o más violaciones críticas o magnitudes muy por encima del límite

"""

def build_agent(case_dir: Path, policy_path: Path):
    policy_qe, case_qe = build_indexes(case_dir, policy_path)
    tools = [
        make_metrics_tool(case_dir, policy_path),
        make_policy_tool(policy_qe),
        make_case_tool(case_qe),
    ]
    return OpenAIAgent.from_tools(tools=tools, system_prompt=SYSTEM, verbose=True)

if __name__ == "__main__":
    case_dir = Path("data/case_EMP_001")
    policy_path = Path("data/policy.json")
    agent = build_agent(case_dir, policy_path)
    verdict = agent.chat("Evalúa el caso y devuelve el JSON solicitado.")
    print(getattr(verdict, "response", verdict))

Output validation and error control

Validate the agent's JSON so that backend and panel consume it without surprises.

from pydantic import BaseModel, Field, ValidationError

class Verdict(BaseModel):
    decision: str
    issues: list[str]
    reason: str
    conditions: list[str]
    citations: dict

try:
    parsed = Verdict.model_validate_json(getattr(verdict, "response", "{}"))
except ValidationError as e:
    # registra el error y aplica fallback si procede
    print("Salida del agente inválida:", e)

Additional best practices:

Limit output tokens and require the JSON block as the only response.

Include applied policy version and document references in logs.

Retry once if validation fails, maintaining the same context.

Security, compliance and quality

This section addresses how to protect data, govern changes and ensure consistent decisions. The idea is simple: treat policy as versioned data, minimize PII exposure and audit every step the agent takes with tools.

Personal data: just enough, well protected

Operational principles

Minimization. Only load what is necessary to calculate income, debts, price, appraisal and recency.

Separation. Store PII and risk features in different stores; work with identifiers and references.

Recency and retention. Validate validity and purge information outside the window agreed with Risk and Legal.

Access. Role-based control and access logging; encryption in transit and at rest.

Data	Use in decision	Reasonable retention	Observations
ID and profile	KYC, segmentation	According to KYC policy	Mask in logs
Payrolls or RETA	Income	12–24 months	Store monthly net and source
Bank statements	Behavior and debts	3–12 months	Derive signals; do not store full text
Deposit, purchase, appraisal	LTV and value	Until closing	Reference version and date
Previous loans	Current and post DTI	Loan validity	Normalize to monthly

Policy governance and changes

Policy in JSON with clear semantics and versioning. Header example:

{
  "policy_id": "mortgage_es_v1.3",
  "effective_date": "2025-08-01",
  "affordability": { "pti_max": 0.35, "dti_total_max": 0.45 },
  "ltv": { "primary_residence_max": 0.80 }
}

Change control. Proposal, Risk review, approval, deployment with version tag.

Traceability. Each evaluation stores the applied policy version.

Technical guardrails for the agent

Whitelist of tools. The agent can only invoke get_case_metrics, policy_qe and case_qe.

Mandatory output schema. Validate that the response meets the expected JSON; reject if it does not fit.

Context limit. Index and retrieve only what is necessary, avoiding the LLM seeing PII without reason.

Defense against document injection. Filter and sanitize input text; never execute instructions embedded in PDFs or notes.

Controlled retries. A single retry if schema validation fails; log both attempts.

Audit and traceability

What to register per case

Applied policy version and policy file hash.

Tools invoked by the agent, input parameters and cited fragments.

Source document identifiers and their pages or selectors.

Deterministic calculated numerical values and final generated decision.

Reproducibility

Preserve minimal artifacts: normalized inputs, policy version and deterministic engine code version.

Possibility to re-execute a historical case with the policy from that date.

Quality: tests and metrics

Automated tests

Numerical sanity. Payment, PTI, DTI, LTV and residual with known values.

Edge cases. Stress by minimum floor, rent substitution activated or not, LTV with appraisal lower than price.

Verdict rules. Mapping of violations to approved, conditional and not approved.

Continuous evaluation

Regression case bank by segment. Execute after each policy or model change.

Operational metrics. Decision time, NIGO rate, percentage of conditional decisions that become approved after adjustments.

Deterministic–agent alignment. Alerts if the agent's decision differs from the deterministic one outside an agreed tolerance.

Equity

Avoid sensitive attributes in input data and derived features.

Sensitivity audits with synthetic data to detect unintended biases in explanation or proposed conditions.

Functional evolution

Scenario simulator. Allows varying amount, term or down payment and see how PTI, DTI and LTV change. Useful for client conversation and analyst training.

Multi-applicant. Combine income, debts and specific policies for couples or co-applicants. Adjust minimum tenure rules and history per applicant.

Advanced extraction. Integrate OCR and parsers for bank PDFs and payrolls, with coherence validators between documents. Reduce manual normalization time.

Banking behavior. Derive signals from statements: overdrafts, revolving card usage, percentage of cash income. Add thresholds and alerts in policy.

Credit bureau and external sources. Connect with delinquency files and identity verification. Register evidence as another citable document.

Extended explainability. Add scores per metric and a richer client-oriented explanation, without exposing unnecessary PII.

Conclusion: architecture and benefits

Mortgage evaluation is not just calculating ratios. It is coordinating heterogeneous documents, applying changing policies and explaining decisions rigorously. The proposed architecture separates those responsibilities so the system is predictable for Risk, comfortable for the analyst and defensible before Audit.

Solves:

Policy as data: clear and versioned thresholds, ready to cite and adjust without touching code.

Deterministic calculation: a single source of truth for stressed payment, PTI, DTI, LTV and residual.

RAG over documents: recoverable evidence in detail with references to origin.

Agent with tools: orchestrates the pieces, contrasts against policy and delivers a structured output with decision, issues, explanation, conditions and citations.

Expected results

Consistency: same input, same output, regardless of analyst.

Less NIGO and reprocessing: recency and coherence checklist before deciding.

Shorter cycles: less manual search and less back and forth with the client.

Immediate explainability: each number points to its rule and its document.

Agility: PTI, DTI or LTV changes are applied by editing the policy JSON.

SOPs and daily operations

If you already have SOPs for the mortgage process, this solution reinforces them and makes them executable. The idea is that each SOP step has a direct translation into tools, validators and records.

How we map SOPs to the architecture

SOP for intake and completeness → programmatic checklist of documents and recency; case status visible before calculating anything.

SOP for income verification → retrieval of payrolls or RETA with RAG, coherence rules between payroll and statement, tolerances defined in policy.

SOP for policy application → policy-as-data; the agent reads current thresholds and cites them in the decision.

SOP for exceptions → controlled override flow: who requests the exception, for which metric, what evidence supports it and who approves it.

SOP for policy changes → JSON versioning, Risk and audit review, regression tests and canary deployment.

SOP for audit → logging of invoked tools, cited fragments, policy version and deterministic values used.

SOP for incidents and rollback → possibility to fix policy and model by version, revert changes and re-execute historical cases.

SOP for training → bank of labeled cases, scenario simulator and agent-generated explanations ready for coaching.

Example of operational SOP expressed as data

This allows the panel and the agent to follow the same script consistently:

sop_id: mortgage_underwriting_es_v1
effective_date: 2025-09-01
steps:
  - name: Validar completitud y recencia
    checks:
      - require: payroll_last_3m
      - require: bank_statements_last_3m
      - require: property_appraisal <= 6m
    on_fail: request_missing_docs

  - name: Calcular métricas deterministas
    tool: get_case_metrics
    outputs: [pti, dti_post, ltv, residual]

  - name: Aplicar política
    tool: policy_qe
    compare:
      - pti <= affordability.pti_max
      - dti_post <= affordability.dti_total_max
      - ltv <= ltv.primary_residence_max

  - name: Evidencia
    tool: case_qe
    cite:
      - payroll_months: [2025-06, 2025-07, 2025-08]
      - appraisal: latest
      - purchase_price: current

  - name: Dictamen y condiciones
    decision_rules:
      - if: violations == 0
        decision: APTO
      - if: violations in [1,2]
        decision: CONDICIONADO
      - if: violations >= 3
        decision: NO_APTO
    include: issues, reason, conditions, citations

Let’s build together

We combine experience and innovation to take your project to the next level.