From ETL to ELT: How the Data Processing Landscape has Changed

The ETL (extract, transform, load) process has been around for decades. It’s one of the most widely used and well-known methods in data warehousing and business intelligence. However, ELT (extract, load, transform) has become increasingly popular with the rise of cloud computing and big data, and it is now often seen as the preferred method in the cloud era.

Why is that?

Before modern cloud-based data warehouses emerged, data teams could not store large amounts of raw data in a single location and run the compute needed to transform it into usable data models. Nowadays, most modern data warehouses are built on highly scalable databases that can store and process large amounts of data efficiently and cost-effectively. The joys of the cloud, right?

Advantages

The main advantage of ELT is that it allows for more efficient and scalable processing of large data sets, because the data can be transformed in parallel on the target system. This is especially beneficial when working with big data and cloud-based data warehouses, where powerful computing resources are readily available. ELT also allows for more flexibility in the transformation step, which can be written in SQL, Python, Spark or other languages and run directly on the target system rather than in a separate ETL tool (SSIS, anyone?).
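To make the "load first, transform on the target" idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a cloud warehouse, and the table names, columns and sample rows are all made up for illustration. The point is only the order of operations: the raw data lands untouched, and the transformation is plain SQL executed inside the target.

```python
import sqlite3

# Hypothetical raw records, standing in for rows extracted from a source system.
RAW_ORDERS = [
    ("1001", "2024-01-05", "19.99", "gb"),
    ("1002", "2024-01-05", "5.00", "GB"),
    ("1003", "2024-01-06", "42.50", "us"),
]


def run_elt(conn, rows):
    """Load raw rows as-is, then transform them with SQL inside the target."""
    # Load: every column stays as text, with no cleaning on the way in.
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id TEXT, order_date TEXT, amount TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)

    # Transform: cast, normalise and aggregate where the data already lives.
    conn.execute(
        """
        CREATE TABLE daily_revenue AS
        SELECT order_date,
               UPPER(country) AS country,
               ROUND(SUM(CAST(amount AS REAL)), 2) AS revenue
        FROM raw_orders
        GROUP BY order_date, UPPER(country)
        """
    )
    return conn.execute(
        "SELECT order_date, country, revenue FROM daily_revenue "
        "ORDER BY order_date, country"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    for row in run_elt(connection, RAW_ORDERS):
        print(row)
```

Because raw_orders is kept alongside daily_revenue, the model can be rebuilt or changed later without extracting the data again, which is a large part of ELT's flexibility.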

However, the ELT process requires the target system to handle both the raw and the transformed data, which may not be feasible for all systems and may increase the cost of the target system. Additionally, ELT may not be suitable for all types of data, e.g. sensitive data that needs to be transformed and cleansed before being loaded. This is where EtLT comes in (I’ll dive into that in another post).
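As a rough illustration of the sensitive-data point, the sketch below applies one small transformation (the lowercase "t" in EtLT) before anything is loaded. Everything here is an assumption made for the example: the customer records, the mask_email helper, the hard-coded salt and the choice of a salted SHA-256 hash. It is not a recommendation for how to protect personal data, and sqlite3 again stands in for the warehouse.

```python
import hashlib
import sqlite3

# Hypothetical extracted records: (customer_id, email, plan).
CUSTOMERS = [
    ("c1", "Alice@Example.com", "pro"),
    ("c2", "bob@example.com", "free"),
]


def mask_email(email, salt="demo-salt"):
    """Replace an email with a salted hash (illustrative only)."""
    normalized = email.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def load_masked(conn, records):
    """Mask the sensitive column first, then load the rest unchanged."""
    masked = [(cid, mask_email(email), plan) for cid, email, plan in records]
    conn.execute(
        "CREATE TABLE raw_customers "
        "(customer_id TEXT, email_hash TEXT, plan TEXT)"
    )
    conn.executemany("INSERT INTO raw_customers VALUES (?, ?, ?)", masked)
    return masked


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_masked(connection, CUSTOMERS)
    for row in connection.execute("SELECT * FROM raw_customers"):
        print(row)
```

The heavier modelling work would still happen inside the warehouse afterwards, but the plain email address never lands in the target.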

The Data Engineer’s role in an ELT pipeline

This is an absolute game changer: Data Engineers can create pipelines focused solely on extracting data and loading it into the warehouse. Data analysts and data scientists can then handle the transformations, modelling and querying themselves, since they are more familiar with working inside a database. (Palming off the hard work to Data Analysts and Data Scientists 😀)

What are your views on ELT? As always, I encourage you to do some further research, and I recommend the following for further reading: ELT vs ETL

Tim