Continuous Integration & Deployment
What is Continuous Integration?
Continuous Integration - also called CI - is the process of collaborative code development in which several developers or Data Scientists work together and their individual work is continuously integrated into one code version. Rather than merging individual work monthly or at the moment of release, we do this continuously (in practice: daily or every time a code edit is made).
In practice this is done through code versioning and branching using tools such as Git. With such a version control system (VCS), a developer can branch from the master code, apply changes and merge them back into the master branch.
What is Continuous Deployment?
Continuous Deployment - also called CD - is the process of continuously testing code and releasing it into production. Rather than doing manual deploys/releases, we update our production application continuously (in practice: multiple times a day).
In practice we automate testing and deployment: with DevOps tooling, a code commit can be integrated, tested and released into production without manual steps.
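The idea that a commit only reaches production after every automated step has passed can be illustrated with a minimal sketch. This is not how any particular CI/CD tool is implemented; the names Stage, run_pipeline, add and the three stage functions are hypothetical stand-ins for real integration, test and deployment steps.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    """One step of the pipeline; `run` returns True on success."""
    name: str
    run: Callable[[], bool]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (success, names of the stages that completed).
    """
    completed: List[str] = []
    for stage in stages:
        if not stage.run():
            # A failing stage blocks everything after it,
            # so broken code never reaches the deploy stage.
            return False, completed
        completed.append(stage.name)
    return True, completed


# Hypothetical application code under test.
def add(a: int, b: int) -> int:
    return a + b


# Hypothetical stand-ins for real pipeline steps.
def integrate() -> bool:
    return True  # e.g. the branch merged without conflicts


def run_tests() -> bool:
    return add(2, 3) == 5  # e.g. the automated test suite passed


def deploy() -> bool:
    return True  # e.g. the release was pushed to production


pipeline = [
    Stage("integrate", integrate),
    Stage("test", run_tests),
    Stage("deploy", deploy),
]

if __name__ == "__main__":
    success, completed = run_pipeline(pipeline)
    print("success:", success, "completed:", completed)
```

The design choice worth noting is the early return: because the deploy stage sits behind the test stage, a failing test automatically prevents a release, which is the safeguard that makes frequent automated deployments practical.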
Figure: Steps in code development and deployment - also in Data Platform development
Why do we want to do CI/CD?
It is a dark secret that in many organizations deploying a new release of code into production takes a huge amount of effort. CI/CD brings continuity: quality checks run frequently (i.e. continuously), so bugs are caught early rather than at release time.
As a result, CI/CD brings the following benefits:
- Updates can be released faster
Developers can make changes and deploy them quickly, which makes it easier to react to market changes and to business and user needs.
- Higher productivity
Teams spend less time on code debugging and releases. They spend more time on actual AI improvement or other value-adding work.
Work is automated, making everyone less dependent on manual tasks. This saves time and cost.
What tools exist?
Widely used tools for running CI/CD include CircleCI, Bamboo, Jenkins (open-source) and TeamCity. At element61 we also cheer for Azure DevOps, a general DevOps tool that provides CI/CD directly to the Microsoft Azure Cloud Platform.
CI/CD is mainstream for developers yet new to many BI professionals and Data Engineers. At element61 we believe that these worlds will increasingly merge and that applying development best practices to Data Engineering is a must.
All our Data Engineers are trained in setting up CI/CD.
Read more on our vision for a Modern Data Platform
Contact us to get started with CI/CD or to receive training on using CI/CD for (Big) Data Platform engineering