Use 3 metadata-driven Glue jobs instead of ~69 per-table jobs
Context
The Enterprise Reinsurance migration had to move 23 mainframe DB2-LUW tables to Aurora PostgreSQL and stand up a reporting layer feeding ~50 treaty and finance consumers. The first sketch, the design every senior engineer would have drawn, was one Glue job per table per stage: 23 tables × 3 stages, ~69 jobs in total. Familiar shape, easy to assign, easy to defend. I spent close to two weeks staffing and sketching toward it before I noticed I'd said the sentence "yeah, it'll be a lot to operate, but that's the cost" four times in one week. The tables didn't differ in any way that mattered to the pipeline, and the transformations were parameterizable.
Decision
Build the ETL as 3 generalized Glue jobs (extract, transform, load) driven by a metadata table that names each source, its per-source transformation parameters, and its consumer cadence. One pattern, every source. I scrapped the per-table sketch and rebuilt the design around the metadata vocabulary.
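A minimal sketch of what "driven by a metadata table" could look like, in plain Python rather than Glue/PySpark so it runs anywhere. Everything named here is an assumption for illustration: the `SourceConfig` fields, the `rename`/`drop` parameters, the `TREATY` table, and the `read`/`write` callables standing in for the real DB2 and Aurora connections. The actual metadata schema is not shown in this record.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class SourceConfig:
    """One hypothetical metadata row: a source, its parameters, its cadence."""
    source_table: str
    target_table: str
    transform_params: dict[str, Any] = field(default_factory=dict)
    cadence: str = "daily"


def extract(cfg: SourceConfig, read: Callable[[str], list[Row]]) -> list[Row]:
    # Generic extract: the only per-source input is the table name.
    return read(cfg.source_table)


def transform(cfg: SourceConfig, rows: list[Row]) -> list[Row]:
    # Generic transform: rename and drop columns as the metadata directs.
    renames = cfg.transform_params.get("rename", {})
    drops = set(cfg.transform_params.get("drop", []))
    return [
        {renames.get(k, k): v for k, v in row.items() if k not in drops}
        for row in rows
    ]


def load(cfg: SourceConfig, rows: list[Row],
         write: Callable[[str, list[Row]], None]) -> None:
    # Generic load: the only per-source input is the target table name.
    write(cfg.target_table, rows)


def run_pipeline(cfg: SourceConfig, read, write) -> None:
    load(cfg, transform(cfg, extract(cfg, read)), write)


# In-memory stand-ins for the source and target databases (illustrative only).
fake_source = {"TREATY": [{"TRTY_ID": 1, "FILLER": "x"}]}
written: dict[str, list[Row]] = {}

cfg = SourceConfig(
    source_table="TREATY",
    target_table="treaty",
    transform_params={"rename": {"TRTY_ID": "treaty_id"}, "drop": ["FILLER"]},
)
run_pipeline(cfg, fake_source.__getitem__, written.__setitem__)
```

The three functions never mention a specific table; a second source would be a second `SourceConfig` passed through the same code.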
Consequences
- 23 tables migrated and continuously transformed through 3 jobs — fewer surfaces to test, version, and run on-call against; one bug fix, not 23.
- The metadata table became a first-class operational artifact — adding a new source means extending metadata rows, not writing new jobs.
- An unusual source whose transformation can't be expressed in the existing metadata vocabulary forces an extension to the metadata schema rather than a drop-in new job. That is a cost the per-table design wouldn't have charged.
- The two-week tax I paid defending the per-table design before changing my mind became the tripwire lesson: if you're justifying the cost of your own design more than once, you're not designing — you're defending.
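The trade-off in the metadata-vocabulary consequences above can be sketched as a validation step: a source whose parameters stay inside the vocabulary is just a new row, while one that needs an unknown operation is rejected until the vocabulary itself is extended. The operation names (`rename`, `drop`, `pivot`) and the `VOCABULARY` registry are hypothetical and stand in for whatever the real metadata schema supports.

```python
from typing import Any, Callable

Row = dict[str, Any]

# Hypothetical vocabulary: each key is an operation a metadata row may request.
VOCABULARY: dict[str, Callable[[Row, Any], Row]] = {
    "rename": lambda row, mapping: {mapping.get(k, k): v for k, v in row.items()},
    "drop": lambda row, cols: {k: v for k, v in row.items() if k not in cols},
}


def unsupported_ops(transform_params: dict[str, Any]) -> list[str]:
    # Operations the metadata asks for that the shared jobs cannot perform.
    return sorted(op for op in transform_params if op not in VOCABULARY)


def apply_ops(row: Row, transform_params: dict[str, Any]) -> Row:
    missing = unsupported_ops(transform_params)
    if missing:
        # The fix is a vocabulary extension, not a one-off job.
        raise ValueError(f"operations outside the vocabulary: {missing}")
    for op, arg in transform_params.items():
        row = VOCABULARY[op](row, arg)
    return row


ordinary = {"rename": {"A": "a"}}        # expressible: just a new metadata row
unusual = {"pivot": {"on": "period"}}    # not expressible: needs a new operation

ok_row = apply_ops({"A": 1}, ordinary)
gaps = unsupported_ops(unusual)
```

Running the check when a metadata row is added, rather than when the job runs, would surface the extension cost before a source is onboarded.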