How dbt is transforming ELT

Oct 18, 2022 . 3 min read . 113 views

dbt (data build tool) is a relatively new data transformation tool that has quietly become one of the most preferred tools for organisations to run and manage their data transformations.

But before we look at dbt, we need to take a step back to understand the broader landscape.


ETL vs ELT

ETL

Extract -> Transform -> Load

  • This is the older approach organisations have used to access and manage their data
  • Orgs would first extract raw data from their source tools
  • This data would then go through a transformation solution such as Informatica
  • Finally, the transformed data would be stored in a data warehouse
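
To make the order of the steps concrete, here is a minimal sketch of an ETL flow. It is only an illustration: the sample records and function names are made up, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3


def extract():
    # Hypothetical raw records, standing in for data pulled from a source tool.
    return [
        {"order_id": 1, "amount": "19.99", "country": "us"},
        {"order_id": 2, "amount": "5.00", "country": "IN"},
    ]


def transform(rows):
    # The transformation happens outside the warehouse, before anything is stored.
    return [
        (r["order_id"], float(r["amount"]), r["country"].upper())
        for r in rows
    ]


def load(conn, rows):
    # Only the already-transformed rows reach the warehouse.
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


def run_etl():
    conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    load(conn, transform(extract()))
    return conn


etl_conn = run_etl()
print(etl_conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall())
```

Note that the raw values (the lowercase country code, the amount as text) never make it into the warehouse, so anyone who needs them has to go back to the source.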

ELT

Extract -> Load -> Transform

  • The ETL approach had many data quality and maintenance issues, which resulted in huge delays for the business in getting access to critical data.
  • The ELT approach solves this to some extent.
  • With ELT, data is loaded directly into a data warehouse or a data lake after extraction.
  • Transformations take place in the data warehouse as and when required.
  • This way, everyone has access to the raw data as well as the transformed data.
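
The same idea can be sketched for ELT. Again, this is only an illustration under assumptions: the sample records and table names are invented, and an in-memory SQLite database plays the role of the warehouse. The raw data is loaded untouched, and the transformation is a SQL statement that runs inside the database.

```python
import sqlite3

# Hypothetical raw records, standing in for data extracted from a source tool.
RAW_ORDERS = [
    (1, "19.99", "us"),
    (2, "5.00", "IN"),
]


def run_elt():
    conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse

    # Load: the raw data goes in exactly as it was extracted.
    conn.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, amount TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", RAW_ORDERS)

    # Transform: runs afterwards, inside the warehouse, as plain SQL.
    conn.execute(
        """
        CREATE TABLE orders AS
        SELECT order_id,
               CAST(amount AS REAL) AS amount,
               UPPER(country) AS country
        FROM raw_orders
        """
    )
    return conn


elt_conn = run_elt()
print(elt_conn.execute("SELECT * FROM raw_orders ORDER BY order_id").fetchall())
print(elt_conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall())
```

Both raw_orders and orders can be queried at the end, which is the point of the last bullet above.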

Enter dbt

dbt is a tool built for the ELT approach: it handles the transform step.

Using dbt, organisations can transform their data within the warehouse for reporting, ML modelling and operational workflows.


dbt is a compiler and runner.

With dbt, users write their transformation logic in SQL. dbt then compiles the code and runs it against the database to generate the transformed datasets.
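
A toy version of "compile, then run" might look like the sketch below. This is not how dbt is implemented: real dbt uses Jinja templating and a good deal more machinery. The model names, the regex-based compile_model helper, and the SQLite database are all assumptions made for illustration. The only piece borrowed from dbt is the ref('...') placeholder that lets one model refer to another.

```python
import re
import sqlite3

# Hypothetical models: name -> SQL. order_totals is listed first on purpose,
# so the runner has to work out that stg_orders must be built before it.
MODELS = {
    "order_totals": (
        "SELECT COUNT(*) AS n_orders, SUM(amount) AS revenue "
        "FROM {{ ref('stg_orders') }}"
    ),
    "stg_orders": (
        "SELECT order_id, CAST(amount AS REAL) AS amount FROM raw_orders"
    ),
}

# Matches a dbt-style {{ ref('model_name') }} placeholder.
REF = re.compile(r"\{\{\s*ref\(\s*'(\w+)'\s*\)\s*\}\}")


def compile_model(sql):
    """Compile step: replace each ref placeholder with the table name."""
    return REF.sub(lambda m: m.group(1), sql)


def run_models(conn, models):
    """Run step: build each model as a table, dependencies first.

    Toy code: no cycle detection and no error handling.
    """
    built = []

    def build(name):
        if name in built:
            return
        for dep in REF.findall(models[name]):
            build(dep)
        conn.execute(f"CREATE TABLE {name} AS {compile_model(models[name])}")
        built.append(name)

    for name in models:
        build(name)
    return built


def make_warehouse():
    # In-memory SQLite stand-in for a warehouse that already holds raw data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_orders (order_id INTEGER, amount TEXT)")
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?)", [(1, "10.5"), (2, "4.5")]
    )
    return conn


warehouse = make_warehouse()
print(compile_model(MODELS["order_totals"]))
print(run_models(warehouse, MODELS))
print(warehouse.execute("SELECT * FROM order_totals").fetchall())
```

The analyst only writes the SELECT statements. Working out the build order and wrapping each query in a CREATE TABLE is left to the tool.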


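dbt works with version control for all transformations

A dbt project is a set of plain text files (SQL and YAML), so every transformation can be tracked in a version control system such as Git. With changes reviewed and easy to roll back, there is far less worry about accidentally messing up a critical pipeline.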

dbt ships with a package manager

With the package manager, data analysts can write packages of reusable code that the rest of the organisation can use to transform data.


Thoughts

Data engineers spend all of their time in data, but they’re not necessarily experts on how it’s used by different business functions to generate insights.

ELT is a much better approach for organisations to manage their data.

dbt makes it much simpler for people outside the data teams to contribute to data workflows.

dbt's overall purpose is to reduce the time to insight and, in the process, the time to decision.

The future of data collaboration will require Excel power users to learn SQL, and SQL users to get inside the warehouse and build their data models.

Decision makers need to feel comfortable building their own dashboards without having to rely on the data team or analysts.




That's all folks! 🐰


Thanks for reading! If you enjoyed this post, please consider supporting me by following me on Twitter. I would love to connect with you and hear your thoughts. 🔥