Knowledge base

Why? Data pipelines lag far behind the software industry in efficiency, so tools are needed to bring their design up to speed and to guarantee data quality. This optimisation and performance improvement can happen with tools such as dbt and...
Kyana Bosschaerts
What is dbt? dbt is an open-source transformation tool that aims to simplify the work of the analytics engineer in the data pipeline workflow. It implements only the T in the ETL process. Its greatest feature is that it focuses on...
Kyana Bosschaerts
If you have worked with data in Python before, you are probably familiar with the following line of code: import pandas as pd. The Pandas library is widely seen as the de facto standard library for data science in Python, and rightly so. It offers a...
Louis Dubaere
In this insight we focus on one of the leading tools, MLflow, and how it can facilitate and improve your ML lifecycle.
Floriant Sturm
What is Azure Synapse Analytics? Azure Synapse Analytics is the rebranded Azure SQL Data Warehouse. Azure Synapse Analytics v2 (workspaces incl. Azure Synapse Studio) is still in preview. This version of Azure Synapse Analytics integrates existing...
Ivana Pejeva
On June 18th 2020, Databricks announced that the Apache Spark 3.0 release is available as part of their new Databricks Runtime 7.0. Apache Spark 3.0 itself was released only 8 days earlier, on the 10th of June. We really love these quick innovation...
Yoshi Coppens
It’s not easy to work with incremental data in a data lake. If you want to transform only the data files that have just entered your data lake, you need a notification service, a message queue and/or a batch trigger all to just get the...
Ivana Pejeva
As the official documentation does not cover this, we have built a guide on how to create an Azure Machine Learning pipeline and how to run this pipeline on an Azure Databricks compute.
Pieter Sterkens
At element61 we love to work with your data. Our philosophy is to provide you with a Modern Data Platform in which you can do real-time analytics and big data batch analytics, as well as apply Machine Learning and AI.
Ivana Pejeva
element61 believes Delta features create a big opportunity for anyone starting with a Data Lake or already having a Data Lake.
Ivana Pejeva
