Building a self-served ETL pipeline for third-party data ingestion

1 · Skyscanner · April 2, 2019, 4:34 p.m.
We used Cookiecutter, AWS Batch and Glue to solve a tricky data problem, and you can too

Open source tool Cookiecutter was a crucial component of the pipeline we built to ingest third-party data. Photo: cookie cutting of a different kind taking place in Lugano, Switzerland

As much as we’d like to generate all the data we need in-house (and we do generate a lot), there is often a need to import datasets from external sources, and make them available for querying. Examples of imported data include:...