Big Data Platform

Mediahuis wanted to set up a data platform that gathers data from various advertising tools (ad server, SSPs) and advertising APIs. Given the volume of this data (several GB per day), a Big Data Platform was designed together with element61, then set up and configured with automated jobs that source, transform and prepare the data for reporting.

The platform was built on Microsoft Azure using Azure PaaS components (Azure Kubernetes Service (AKS), Blob Storage, Data Lake Store) and open-source tools (Apache Airflow, Python, Dask). The project was set up in Azure DevOps with full CI/CD pipelines, automated testing and infrastructure-as-code.