● Live JDE Data — Gold Layer

Unified RFQ Intelligence Platform

Real-time visibility across all entities. Submit RFQs, detect conflicts, and route requests to the right team — powered by live JDE data through a Bronze / Silver / Gold medallion architecture.

Built by JP Castro · Senior Data Architect · johnpaulcastro@gmail.com

What is this?

A live, working demo of a modern data platform — the kind of system I design and build for companies that have critical data locked in legacy ERP systems.

This one pulls data from a simulated JD Edwards ERP, transforms it through a three-layer architecture (raw → cleaned → business-ready), and serves it to live dashboards, a customer self-service portal, and an e-commerce storefront. Everything you see is running in production on open-source tools.

The platform also includes a Master Data Management (MDM) layer that demonstrates what happens when multiple ERP systems coexist across a portfolio of companies. Five separate ERPs, each with its own schemas, naming conventions, and customer records, are extracted, normalized, and then matched using Splink, a probabilistic record linkage engine. The result is a set of golden customer records that unify duplicates like "Boeing Co.", "THE BOEING COMPANY", and "Boeing Defence UK Ltd" into a single entity, with consolidated sales visibility across all systems. This is the foundation that any RFQ or pricing system needs to sit on.
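To make the golden-record step concrete, here is a minimal sketch in plain Python. It is not the platform's Splink pipeline: it assumes the matching step has already decided which record pairs refer to the same customer, and it only shows how those pairs could be clustered (with union-find) into golden records with consolidated sales. The ERP prefixes, record IDs, matched pairs, and sales figures are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical customer records from three source ERPs: id -> (name, sales).
records = {
    "erp_a:1001": ("Boeing Co.", 120_000),
    "erp_b:77": ("THE BOEING COMPANY", 340_000),
    "erp_c:5": ("Boeing Defence UK Ltd", 95_000),
    "erp_a:1002": ("Acme Fasteners", 18_000),
}

# Hypothetical output of the matching step: pairs judged to be the same entity.
matched_pairs = [("erp_a:1001", "erp_b:77"), ("erp_b:77", "erp_c:5")]


def build_golden_records(records, matched_pairs):
    """Cluster matched records with union-find and roll up sales per cluster."""
    parent = {rid: rid for rid in records}

    def find(rid):
        # Follow parent links to the cluster root, compressing the path.
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    for a, b in matched_pairs:
        parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for rid in records:
        clusters[find(rid)].append(rid)

    golden = [
        {
            "members": sorted(members),
            "names": sorted(records[m][0] for m in members),
            "total_sales": sum(records[m][1] for m in members),
        }
        for members in clusters.values()
    ]
    # Largest consolidated customer first.
    return sorted(golden, key=lambda g: g["total_sales"], reverse=True)


golden_records = build_golden_records(records, matched_pairs)
for g in golden_records:
    print(g["names"], g["total_sales"])
```

Because matches are transitive in this sketch, "Boeing Co." and "Boeing Defence UK Ltd" land in the same golden record even though they were never compared directly. A real deployment would want a review step for chains like that.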

📖 Quick terminology (if you're new to modern data stacks)

ERP / JD Edwards

Enterprise software that companies use to run operations: sales orders, inventory, purchasing, and accounts receivable/payable (AR/AP). JD Edwards (JDE) is an Oracle ERP suite used mainly by mid-to-large manufacturers and distributors.

Medallion Architecture

A three-layer pattern for cleaning and organizing data: Bronze (raw copy from source), Silver (cleaned and typed), Gold (business-ready aggregations).
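Here is a small sketch of the three layers in plain Python. The demo itself builds Silver and Gold with dbt, so this only illustrates the pattern. The column names follow JDE sales-order naming conventions, but they are assumptions here, as are the sample rows, the CYYDDD date encoding, and the two implied decimal places on amounts.

```python
from collections import defaultdict
from datetime import date, timedelta
from decimal import Decimal

# Bronze: a raw copy of the source rows. Everything is still a string.
# Hypothetical sample data, not taken from the demo database.
bronze_sales = [
    {"SDDOCO": "10001", "SDAN8": "4242", "SDTRDJ": "124015", "SDAEXP": "1250000"},
    {"SDDOCO": "10002", "SDAN8": "4242", "SDTRDJ": "124032", "SDAEXP": " 99950"},
    {"SDDOCO": "10003", "SDAN8": "5150", "SDTRDJ": "124032", "SDAEXP": "430000"},
]


def jde_julian_to_date(value):
    """Convert a JDE-style CYYDDD date (assumed encoding) to a Python date."""
    n = int(value)
    century, yy, day_of_year = n // 100000, (n // 1000) % 100, n % 1000
    return date(1900 + 100 * century + yy, 1, 1) + timedelta(days=day_of_year - 1)


def to_silver(row):
    """Silver: readable names, real types, amounts scaled to currency units."""
    return {
        "order_number": int(row["SDDOCO"]),
        "customer_id": int(row["SDAN8"]),
        "order_date": jde_julian_to_date(row["SDTRDJ"]),
        # Assumes amounts are stored as integers with two implied decimals.
        "extended_price": Decimal(row["SDAEXP"].strip()) / 100,
    }


def to_gold(silver_rows):
    """Gold: business-ready aggregate, here sales per customer per month."""
    totals = defaultdict(Decimal)
    for r in silver_rows:
        month = r["order_date"].strftime("%Y-%m")
        totals[(r["customer_id"], month)] += r["extended_price"]
    return dict(totals)


silver_sales = [to_silver(row) for row in bronze_sales]
gold_monthly_sales = to_gold(silver_sales)
for key, total in sorted(gold_monthly_sales.items()):
    print(key, total)
```

Keeping Bronze untouched means a bad transformation can always be fixed and replayed from the raw copy without going back to the ERP.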

dbt (data build tool)

Open-source tool that transforms raw data into analytics-ready tables using version-controlled SQL models. Here it builds the Silver and Gold layers.

Apache Airflow

Open-source workflow scheduler. Runs the full data pipeline on a nightly schedule and handles retries and alerting.

RFQ

Request for Quote: a customer's request for pricing on parts. Distributors handle these constantly.

Railway

Cloud platform hosting everything you see — the Postgres database, the API, the dashboard, the portal, and the shop.

MDM (Master Data Management)

The practice of creating a single, trusted view of key business entities (customers, vendors, items) across multiple source systems that may store the same data differently.

Splink

Open-source Python library for probabilistic record linkage. Uses techniques like Jaro-Winkler string similarity and the Fellegi-Sunter model to match records that refer to the same entity but differ in name, format, or spelling.
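For a feel of what those two techniques do, here is a from-scratch sketch in plain Python. It does not use Splink's API, and Splink's own implementation will differ in detail. The m and u probabilities, the 0.9 agreement threshold, and the prior are invented numbers chosen only to show the arithmetic.

```python
import math


def jaro(s1, s2):
    """Jaro similarity: shared characters within a window, minus transpositions."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    used1, used2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not used2[j] and s2[j] == ch:
                used1[i] = used2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len(s1)):
        if used1[i]:
            while not used2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro score boosted for a shared prefix of up to four characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


def match_weight(m, u, agrees):
    """Fellegi-Sunter weight: log2 of how much more likely this outcome is
    among true matches (m) than among non-matches (u)."""
    return math.log2(m / u) if agrees else math.log2((1 - m) / (1 - u))


def match_probability(prior, weights):
    """Combine the prior with per-field weights into a match probability."""
    total = math.log2(prior / (1 - prior)) + sum(weights)
    odds = 2 ** total
    return odds / (1 + odds)


# Illustrative comparison of two already-normalized customer names.
name_score = jaro_winkler("boeing co", "boeing company")
weights = [
    match_weight(m=0.90, u=0.01, agrees=name_score >= 0.9),  # name
    match_weight(m=0.95, u=0.20, agrees=True),               # same country
]
probability = match_probability(prior=0.001, weights=weights)
print(round(name_score, 3), round(probability, 3))
```

The point of the model is that rare agreements (a near-identical company name) carry far more weight than common ones (the same country), and that a low prior keeps a single agreeing field from triggering a merge on its own.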

Powered by PostgreSQL · dbt Core · Apache Airflow · Node.js · Next.js · Splink
