Challenges In The Data World
Data teams are responsible for taking data from sources around their business, transforming it into the desired formats, and serving it up to Data Analysts, Data Scientists and other business users.
Though they have done this for a long time, the way they solve the problem is coming under increasing pressure as businesses demand more from their data than ever before.
The first category of challenges is related to tooling. Data teams have traditionally tended to use heavyweight, proprietary, GUI-based and on-premises ETL tools. This dated technology stack can lead to slow delivery and a range of quality and robustness issues, because it lacks support for a modern software delivery lifecycle. These tools can also fail to scale to today's big data and machine-generated datasets.
There are also challenges from a "ways of working" perspective. Data teams have typically worked in a centralised fashion, taking tickets from users and implementing them on their behalf. This leads to delays and bottlenecks, as a centralised team can never scale enough to serve its increasingly demanding user base. Unfortunately, because the previous generation of ETL tooling has been quite specialised, it has not been possible to disrupt this model.
How Does dbt Improve This Situation?
dbt is a modern tool for manipulating and transforming data. We can think of it as providing the T, the transformation step, in the ETL acronym.
dbt is everything that the previous generation of tools is not: lightweight, open source, modular, testable and accessible.
dbt allows us to apply to our data transformations many of the practices which modern Software Engineers use to improve quality and delivery speed. These include source control, branching and merging, modular, clean code and automated testing. Adopting these practices can significantly improve data quality for businesses.
Crucially, dbt also opens up new ways of working and collaborating. Because it is simple to use and transformations are defined in SQL, more people can contribute to the codebase. This includes Data Analysts and Data Scientists, who can build their own pipelines without a limiting dependency on a central team.
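
To make the idea of a testable SQL transformation more concrete, here is a minimal sketch in plain Python using the standard library's sqlite3 module. It is not dbt and does not use dbt's API; it only illustrates the general pattern of keeping a transformation as a SELECT statement and running automated checks against its output. The table, column and function names (raw_orders, customer_totals, build_model, count_nulls, count_duplicates) are hypothetical and chosen purely for illustration.

```python
import sqlite3

# Hypothetical raw data; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (order_id INTEGER, customer_id INTEGER, amount_pence INTEGER);
    INSERT INTO raw_orders VALUES (1, 10, 1250), (2, 10, 800), (3, 11, 4300);
""")

# The transformation is a plain SELECT statement, kept as text so it can
# live in source control and be reviewed like any other code.
CUSTOMER_TOTALS_SQL = """
    SELECT customer_id, SUM(amount_pence) / 100.0 AS total_spend
    FROM raw_orders
    GROUP BY customer_id
"""


def build_model(conn, name, select_sql):
    """Materialise a SELECT statement as a table with the given name."""
    conn.execute(f"DROP TABLE IF EXISTS {name}")
    conn.execute(f"CREATE TABLE {name} AS {select_sql}")


def count_nulls(conn, table, column):
    """Return the number of rows where the column is NULL (0 means the check passes)."""
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


def count_duplicates(conn, table, column):
    """Return the number of column values that appear more than once (0 means the check passes)."""
    query = (
        f"SELECT COUNT(*) FROM "
        f"(SELECT {column} FROM {table} GROUP BY {column} HAVING COUNT(*) > 1)"
    )
    return conn.execute(query).fetchone()[0]


# Build the model, then run the automated checks against its output.
build_model(conn, "customer_totals", CUSTOMER_TOTALS_SQL)
null_failures = count_nulls(conn, "customer_totals", "customer_id")
duplicate_failures = count_duplicates(conn, "customer_totals", "customer_id")
rows = conn.execute(
    "SELECT customer_id, total_spend FROM customer_totals ORDER BY customer_id"
).fetchall()
```

Because the transformation is just text and the checks are just queries, both can be versioned, reviewed and run automatically on every change, which is the kind of workflow described above.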
The Little Tool That Can
dbt is a very simple tool which you can learn at a surface level in a few hours, but it is also transformative. It will simplify the process by which you transform and analyse your business data, it will add robustness and quality to your data pipelines, and it will break down silos and improve collaboration in your business. It is truly a game-changing tool for how businesses deliver data, as evidenced by its rapid uptake into a de facto industry standard.
Next Steps
We have a number of resources to help you learn and appreciate the value of dbt.
- Our dbt course at Ensemble
- Our lesson on Data Engineering for higher-level context