What is Continuous Integration
Continuous Integration (CI) is a collaborative way of developing code in which developers or Data Scientists continuously integrate their individual work into one shared code version. Rather than merging individual work monthly or at the moment of release, we merge continuously (in practice: daily, or every time a code edit is made).
In practice this is done through code versioning and branching with a version control system (VCS) such as Git. A developer branches from the master code, applies changes, commits them on that branch, and merges the branch back into master.
What is Continuous Deployment
Continuous Deployment - also called CD - is the process of continuously testing code and releasing it into production. Rather than doing manual deploys/releases, we update our production application continuously (in practice: multiple times a day).
In practice we automate testing and deployment. With DevOps tooling, a code commit can be integrated, tested, and released to production automatically, without manual steps.
Figure: Steps in code development and deployment - also in Data Platform development
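The steps above can be sketched in a few lines of Python. This is only an illustration, not how any particular CI/CD tool is implemented: the names `run_pipeline`, `PipelineResult`, and the three placeholder stages are hypothetical, and a real pipeline would call a VCS, a test runner, and a deployment target instead of returning fixed values. The point it shows is the gating logic: each stage runs only if the previous one succeeded, so a failing test stops the release.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical stage type: a name plus a function that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


@dataclass
class PipelineResult:
    """Outcome of one pipeline run (illustrative structure, not a real tool's API)."""
    completed: List[str] = field(default_factory=list)
    failed_stage: Optional[str] = None

    @property
    def released(self) -> bool:
        # The change reaches production only if no stage failed.
        return self.failed_stage is None


def run_pipeline(stages: List[Stage]) -> PipelineResult:
    """Run stages in order and stop at the first failure."""
    result = PipelineResult()
    for name, step in stages:
        if not step():
            result.failed_stage = name
            break
        result.completed.append(name)
    return result


# Placeholder stages (assumptions): real ones would merge code, run a test
# suite, and deploy to a production environment.
def integrate() -> bool:
    return True


def run_tests() -> bool:
    return sum([2, 3]) == 5


def deploy() -> bool:
    return True


result = run_pipeline([
    ("integrate", integrate),
    ("test", run_tests),
    ("deploy", deploy),
])
```

If `run_tests` returned `False`, `result.failed_stage` would be `"test"` and the `deploy` stage would never run, which is the safeguard that makes frequent releases practical.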
Why do we want to do CI/CD
It is a dark secret that in many organizations deploying a new release of code into production takes a huge amount of effort. CI/CD brings continuity: quality checks run frequently (i.e. continuously), so bugs are caught early.
As a result, CI/CD brings the following benefits:
- Updates can be released faster
Developers can make changes and deploy fast resulting in the ability to react better to market changes and business/user needs.
- Higher productivity
Teams spend less time on code debugging and releases. They spend more time on actual AI improvement or other value-adding work.
- Sustainability
Work is automated making everyone less dependent on manual tasks. It saves time and costs.
What tools exist
Widely used tools for running CI/CD include CircleCI, Bamboo, Jenkins (open-source), and TeamCity. At element61 we also cheer for Azure DevOps, a general DevOps tool that provides CI/CD and deploys directly to the Microsoft Azure cloud platform.
CI/CD is mainstream for developers yet new for a lot of BI professionals and Data Engineers. At element61 we believe that these worlds will merge more and more and that applying development best practices to Data Engineering is a must.
All our Data Engineers are trained with knowledge of setting up CI/CD.
Read more on our vision for a Modern Data Platform
Interested to know more
Join our 1-day (paid) course on Azure DevOps and learn all best practices.
Contact us to get started with CI/CD or to get training on using CI/CD for (Big) Data Platform engineering