Unified RFQ Intelligence Platform
Real-time visibility across all entities. Submit RFQs, detect conflicts, and route requests to the right team — powered by live JDE data through a Bronze / Silver / Gold medallion architecture.
Built by JP Castro · Senior Data Architect · johnpaulcastro@gmail.com
What is this?
A live, working demo of a modern data platform — the kind of system I design and build for companies that have critical data locked in legacy ERP systems.
This one pulls data from a simulated JD Edwards ERP, transforms it through a three-layer architecture (raw → cleaned → business-ready), and serves it to live dashboards, a customer self-service portal, and an e-commerce storefront. Everything you see is running in production on open-source tools.
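As a rough illustration of the raw → cleaned → business-ready flow, here is a minimal Python sketch. It is not the platform's actual code, which runs these layers in dbt. The column names, sample rows, and the `to_silver` and `to_gold` helpers are all hypothetical.

```python
from collections import defaultdict
from decimal import Decimal, InvalidOperation

# Bronze: raw rows copied as-is from the source (hypothetical columns and values).
bronze_orders = [
    {"order_no": "1001", "customer": " acme aero ", "amount": "1250.00"},
    {"order_no": "1002", "customer": "ACME AERO", "amount": "300.50"},
    {"order_no": "1003", "customer": "Skyline Parts", "amount": "not-a-number"},
]


def to_silver(rows):
    """Silver: trim and uppercase names, cast types, drop unparseable amounts."""
    cleaned = []
    for row in rows:
        try:
            amount = Decimal(row["amount"])
        except InvalidOperation:
            continue  # a real pipeline would likely quarantine the row instead
        cleaned.append({
            "order_no": int(row["order_no"]),
            "customer": " ".join(row["customer"].split()).upper(),
            "amount": amount,
        })
    return cleaned


def to_gold(rows):
    """Gold: business-ready aggregate, here total revenue per customer."""
    totals = defaultdict(Decimal)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


silver_orders = to_silver(bronze_orders)
gold_revenue = to_gold(silver_orders)
print(gold_revenue)
```

Keeping Bronze untouched means the Silver and Gold steps can be rerun from the raw copy whenever a cleaning rule changes.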
The platform also includes a Master Data Management (MDM) layer that demonstrates what happens when multiple ERP systems coexist across a portfolio of companies. Five separate ERPs, each with its own schemas, naming conventions, and customer records, are extracted, normalized, and then matched using Splink, a probabilistic record linkage engine. The result is a set of golden customer records that unify duplicates like "Boeing Co.", "THE BOEING COMPANY", and "Boeing Defence UK Ltd" into a single entity, with consolidated sales visibility across all systems. This is the foundation any RFQ or pricing system would need to sit on top of.
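To make the normalize-then-match idea concrete, here is a small Python sketch that uses only the standard library. It is an illustration, not the platform's Splink configuration: the `NOISE_TOKENS` list and the `normalize_name` and `name_similarity` helpers are hypothetical, and `difflib.SequenceMatcher` stands in for the string comparators a real linkage model would use.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of tokens that carry little identifying information.
NOISE_TOKENS = {"THE", "CO", "COMPANY", "LTD", "INC", "CORP", "CORPORATION", "LLC"}


def normalize_name(raw):
    """Uppercase, replace punctuation with spaces, and drop noise tokens."""
    text = re.sub(r"[^A-Z0-9 ]", " ", raw.upper())
    tokens = [token for token in text.split() if token not in NOISE_TOKENS]
    return " ".join(tokens)


def name_similarity(left, right):
    """Similarity in [0, 1] between two normalized customer names."""
    return SequenceMatcher(None, normalize_name(left), normalize_name(right)).ratio()


names = ["Boeing Co.", "THE BOEING COMPANY", "Boeing Defence UK Ltd"]
normalized = [normalize_name(name) for name in names]
print(normalized)
print(name_similarity(names[0], names[1]))
print(name_similarity(names[0], names[2]))
```

The first two names collapse to the same string after normalization, while the third only partially overlaps. Cases like that are why name similarity alone is not enough: a probabilistic model weighs evidence from several fields before deciding whether records belong to the same entity.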
📖 Quick terminology (if you're new to modern data stacks)
ERP (Enterprise Resource Planning): Enterprise software that companies use to run operations, including sales orders, inventory, purchasing, and AR/AP. JD Edwards (JDE) is Oracle's ERP for mid-to-large manufacturers and distributors.
Medallion architecture: A three-layer pattern for cleaning and organizing data: Bronze (raw copy from source), Silver (cleaned and typed), Gold (business-ready aggregations).
dbt: Open-source tool that transforms raw data into analytics-ready tables using SQL and version control. Runs the Silver and Gold layers.
Airflow: Open-source workflow scheduler. Runs the full data pipeline on a nightly schedule and handles retries and alerting.
RFQ: Request For Quote, a customer asking for pricing on parts. Distributors deal with these constantly.
Hosting platform: Cloud platform that hosts everything you see: the Postgres database, the API, the dashboard, the portal, and the shop.
MDM (Master Data Management): The practice of creating a single, trusted view of key business entities (customers, vendors, items) across multiple source systems that may store the same data differently.
Splink: Open-source Python library for probabilistic record linkage. Uses techniques like Jaro-Winkler similarity and Fellegi-Sunter models to match records that refer to the same entity but have different names, formats, or typos.
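For readers curious about the Fellegi-Sunter idea mentioned above, here is a minimal Python sketch of how per-field evidence combines into a match probability. It does not use Splink's API. The field names and the m and u probabilities in `PARAMS` are made-up assumptions. In this model, m is the chance two records agree on a field when they are a true match, and u is the chance they agree when they are not.

```python
import math

# Hypothetical (m, u) probabilities per compared field.
PARAMS = {
    "name": (0.9, 0.01),
    "city": (0.8, 0.1),
    "postal_code": (0.95, 0.05),
}


def match_weight(agreements):
    """Sum log2 Bayes factors over the compared fields.

    `agreements` maps a field name to True (values agree) or False (disagree).
    """
    weight = 0.0
    for field, agrees in agreements.items():
        m, u = PARAMS[field]
        if agrees:
            weight += math.log2(m / u)
        else:
            weight += math.log2((1 - m) / (1 - u))
    return weight


def match_probability(weight, prior):
    """Turn a total match weight and a prior match probability into a posterior."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * 2 ** weight
    return posterior_odds / (1 + posterior_odds)


all_agree = match_weight({"name": True, "city": True, "postal_code": True})
name_only = match_weight({"name": True, "city": False, "postal_code": False})
print(all_agree, match_probability(all_agree, prior=0.001))
print(name_only, match_probability(name_only, prior=0.001))
```

Agreement on a field where chance agreement is rare (small u) adds a large positive weight, while disagreement on a normally reliable field (large m) subtracts weight. In practice a library such as Splink estimates these parameters from the data rather than taking hand-picked values like the ones assumed here.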
Medallion Architecture
Bronze → Silver → Gold data platform built on open source
New: Databricks Lakehouse
Lakeflow DLT pipeline, Unity Catalog RBAC, AI/BI dashboards on JDE aerospace data
My Resume
Senior Data Architect — JP Castro
Open Source Stack
PostgreSQL · dbt Core · Airflow · Node.js · Next.js · Splink
Operations Overview
Live KPIs across revenue, AR, inventory, and purchasing
Customer Portal
Self-service portal for customers
E-Commerce
Online ordering and catalog
MDM Entity Resolution
Probabilistic customer matching across 5 ERPs using Splink
MDM Consolidated Sales
Unified sales view linked to golden customer records
MDM Architecture
Extract → Bronze → Silver → Splink → Golden Record pipeline
Powered by PostgreSQL · dbt Core · Apache Airflow · Node.js · Next.js · Splink