Split the mainframe migration into two pipelines — DMS for historical, Glue for ongoing
Context
The Enterprise Reinsurance migration had two jobs. The first was a one-time historical lift of 23 mainframe DB2-LUW tables into Aurora PostgreSQL. The second was continuous transformation of that data into a reporting layer feeding ~50 treaty and finance consumers at cadences from 15 minutes to weekly. The naive plan was to run everything through AWS DMS: the same tool for the one-time load and the ongoing replication, and fewer moving parts on the slide deck. But DMS's ongoing-replication model would have anchored the reporting layer to DMS forever, and the reporting layer needed transformation logic that DMS, a replication tool with rule-based table and column mapping, isn't built for.
Decision
Run two pipelines. AWS DMS handles the one-time historical migration off the mainframe: it is used for that single purpose, then decommissioned. AWS Glue with PySpark handles ongoing ELT and the reporting layer. Glue jobs are triggered by S3 PutObject events and EventBridge schedules, and they transform and land curated data in the Aurora reporting database. The reporting layer never inherits a DMS dependency.
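A minimal sketch of how the two trigger types could converge on one Glue job, assuming a small Lambda function sits between the events and Glue. That Lambda, the job name `reporting-elt`, and the `--trigger`, `--source_path`, and `--rule` argument names are hypothetical and not taken from the actual pipeline. The event shapes are the standard S3 notification and EventBridge scheduled-rule payloads, and `start_job_run` is the boto3 Glue client call.

```python
from urllib.parse import unquote_plus

GLUE_JOB_NAME = "reporting-elt"  # hypothetical job name, for illustration only


def build_job_runs(event):
    """Translate a trigger event into a list of Glue job argument dicts."""
    # S3 notification: one record per object written to the bucket.
    if "Records" in event:
        runs = []
        for record in event["Records"]:
            if record.get("eventSource") != "aws:s3":
                continue
            bucket = record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 notifications.
            key = unquote_plus(record["s3"]["object"]["key"])
            runs.append({"--trigger": "s3",
                         "--source_path": f"s3://{bucket}/{key}"})
        return runs

    # EventBridge scheduled rule: the rule ARN is the first resource.
    if event.get("source") == "aws.events":
        rule_name = event["resources"][0].rsplit("/", 1)[-1]
        return [{"--trigger": "schedule", "--rule": rule_name}]

    return []  # unknown event type: start nothing


def handler(event, context, glue=None):
    """Lambda entry point; `glue` is injectable so the logic can be tested."""
    if glue is None:
        import boto3  # imported lazily so tests do not need AWS libraries
        glue = boto3.client("glue")
    run_ids = []
    for args in build_job_runs(event):
        response = glue.start_job_run(JobName=GLUE_JOB_NAME, Arguments=args)
        run_ids.append(response["JobRunId"])
    return run_ids
```

Keeping the event parsing in a pure function means the routing can be checked without AWS access, and the Glue job itself only sees a uniform set of arguments regardless of what triggered it.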
Consequences
- The historical load shipped cleanly via DMS, and DMS was then decommissioned, so there is no long-tail operational burden from a tool used for a one-time job.
- The reporting pipeline is owned by Glue/PySpark end-to-end, with no DMS coupling to break later.
- Two operational tools to learn and monitor instead of one — the cost paid up front for the architectural clarity.
- Reporting cadences from 15 minutes to weekly all run off the same Glue pipeline, because the event substrate (S3 + EventBridge) was designed for the heterogeneous-cadence case from day one.
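To illustrate the heterogeneous-cadence point, the sketch below turns per-consumer cadences into EventBridge `rate(...)` schedule expressions that could all target the same Glue job. The consumer names and the cadence table are made up for the example. Only the `rate(value unit)` syntax, with a singular unit when the value is 1, comes from EventBridge itself.

```python
def rate_expression(minutes: int) -> str:
    """Return an EventBridge rate expression for a cadence given in minutes."""
    if minutes <= 0:
        raise ValueError("cadence must be a positive number of minutes")
    # Prefer the largest unit that divides the cadence evenly.
    for unit, size in (("day", 1440), ("hour", 60), ("minute", 1)):
        if minutes % size == 0:
            value = minutes // size
            # EventBridge requires a singular unit when the value is 1.
            suffix = "" if value == 1 else "s"
            return f"rate({value} {unit}{suffix})"


# Hypothetical consumers and cadences, for illustration only.
CADENCES_MINUTES = {
    "treaty-exposure": 15,
    "finance-daily": 24 * 60,
    "finance-weekly": 7 * 24 * 60,
}

schedules = {name: rate_expression(m) for name, m in CADENCES_MINUTES.items()}
```

Each entry in `schedules` would become one EventBridge rule, so adding a consumer at a new cadence is a configuration change rather than a new pipeline.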