dbt build vs dbt run
dbt run = execute your models. dbt build = execute your models and test them, along with loading seeds and taking snapshots.
dbt (short for "data build tool") is a popular open-source tool for transforming and testing data in analytics pipelines, typically data that already lives in a data warehouse. It's written in Python and uses SQL to define transformations. dbt lets data analysts and engineers create, document, and execute SQL-based transformation workflows.
What is dbt run?
The dbt run command compiles and executes the models in your project. By default, it runs every model's SQL file and creates or replaces the corresponding tables and views in your data warehouse. The execution order is determined by the dependencies between your models. You can run specific models with --select or leave models out with --exclude, picking them by name or by tag.
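As a rough illustration of the selection options, the sketch below assembles a few dbt run invocations as argument lists and prints them as shell commands. The --select and --exclude flags and the tag: selector are part of dbt's CLI; the helper dbt_command, the model names my_model and stg_orders, and the tag nightly are placeholders made up for this example.

```python
import shlex

def dbt_command(subcommand, select=None, exclude=None):
    """Return the argv list for a dbt invocation (hypothetical helper)."""
    argv = ["dbt", subcommand]
    if select:
        argv += ["--select", *select]
    if exclude:
        argv += ["--exclude", *exclude]
    return argv

# Run every model in the project.
run_all = dbt_command("run")
# Run a single model by name (my_model is a placeholder).
run_one = dbt_command("run", select=["my_model"])
# Run models tagged "nightly", leaving out one model.
run_tagged = dbt_command("run", select=["tag:nightly"], exclude=["stg_orders"])

for argv in (run_all, run_one, run_tagged):
    # Print the command as you would type it in a shell.
    print(shlex.join(argv))
```

Each list can be handed to subprocess.run from inside a dbt project, or the printed line can be typed directly into a terminal.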
What's it Used for?
"dbt run" is a bit like trying on a new jacket after tailoring it - you make changes to your SQL code and then use this command to see how those changes fit in the database. Unlike pressing a "compile" button, it actually builds the tables and views in your warehouse, so you can check that everything's sewn together just right. It's the "try before you buy" step in your data world, making sure everything sits comfortably.
Typical Use-cases for dbt run
- Updating Existing Models: If you've already got some dbt models in place and need to update them, "dbt run" can be your friend.
- Trying Out Code Changes: Made a change to your SQL code and want to see how it behaves? "dbt run" it is! Keep in mind that it rebuilds the models but does not run your dbt tests.
What is dbt build?
Introduced in dbt v0.21.0, dbt build is a composite command that covers dbt run, dbt test, dbt seed, and dbt snapshot in a single step. Rather than running every model first and testing afterwards, it works through the project in dependency order and tests each resource as soon as it is built. If a test on an upstream model fails, the models downstream of it are skipped, so bad data does not flow further. This ensures that the transformations meet the data quality checks you've defined. Just like with dbt run, you can use --select and --exclude to specify which resources to build.
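dbt's documentation describes build as running the tests on a resource before anything downstream of it, and skipping downstream resources when a test fails. The toy simulation below sketches only that ordering rule. It is not dbt's scheduler, and the node names and the functions simulate_run and simulate_build are invented for illustration.

```python
# Toy model of execution order only; this is NOT how dbt is implemented.
# Nodes are listed in dependency order as (name, kind, depends_on).
DAG = [
    ("stg_orders", "model", []),
    ("not_null_stg_orders_id", "test", ["stg_orders"]),
    ("orders", "model", ["stg_orders"]),
    ("unique_orders_id", "test", ["orders"]),
]

def simulate_run(dag):
    """Mimic `dbt run`: execute models only; tests are never touched."""
    return [(name, "ok") for name, kind, _ in dag if kind == "model"]

def simulate_build(dag, failing_tests=()):
    """Mimic `dbt build`: walk the DAG, test as you go, skip below failures."""
    results = []
    failed = set()   # models with at least one failed test
    skipped = set()  # models that were never built
    for name, kind, deps in dag:
        if kind == "test":
            if any(d in skipped for d in deps):
                status = "skipped"      # nothing was built to test
            elif name in failing_tests:
                status = "fail"
                failed.update(deps)     # the tested model now blocks its children
            else:
                status = "ok"
        elif any(d in failed or d in skipped for d in deps):
            status = "skipped"
            skipped.add(name)
        else:
            status = "ok"
        results.append((name, status))
    return results

print(simulate_run(DAG))
print(simulate_build(DAG, failing_tests={"not_null_stg_orders_id"}))
```

With the failing test on stg_orders, the simulated build reports orders and its test as skipped, while the simulated run builds both models regardless.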
What's it Used for?
"dbt build" is for those moments when you need a one-step process. You want to run your models, but you also want to test them, load your seeds, and take your snapshots. With "dbt build", you don't have to run separate commands for each step.
Typical Use-cases for dbt build
- Streamlined Workflow: Want to cut down on time and confusion? Think of "dbt build" as your shortcut through the maze of data work. It's like having a fast pass at an amusement park, getting you where you need to go with fewer hassles.
- Combined Testing and Building: If you're the type who likes to taste the soup as you cook it, "dbt build" is right up your alley. With its nifty built-in testing step, you can make sure everything's cooking just right without having to reach for another command. Cool, right?
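Main Differences between dbt build and dbt run
dbt run executes models only. dbt build executes models too, and also runs tests, loads seeds, and takes snapshots, all in dependency order.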
When to use dbt build vs dbt run?
Use "dbt build" for an end-to-end transformation workflow, especially in complex dbt projects where models, tests, seeds, and snapshots all need to run together in dependency order.
"dbt run" is best for more granular tasks, such as when you only need to execute models, for example while iterating on a single model or refreshing an incremental model. Note that "dbt run" does not load seeds; use "dbt seed" or "dbt build" for that.
Conclusion
- Use dbt run when you only want to execute your models.
- Use dbt build when you want to execute your models and then immediately test them.
For those frequently testing their transformations as part of their development workflow, dbt build can be a time-saver since it combines two commonly used commands into one.
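The rule of thumb above can be written down as a small lookup. This is a simplified sketch: the table reflects the resource types that dbt's documentation lists for each command (seeds and snapshots included), and the function pick_command is a hypothetical helper, not part of dbt.

```python
# Resource types each command acts on (simplified from the dbt docs).
COVERS = {
    "seed": {"seed"},
    "snapshot": {"snapshot"},
    "run": {"model"},
    "test": {"test"},
    "build": {"seed", "snapshot", "model", "test"},
}

def pick_command(needed):
    """Return the narrowest single command covering the needed resource types."""
    candidates = [cmd for cmd, kinds in COVERS.items() if needed <= kinds]
    # Prefer the command that touches the fewest resource types.
    return min(candidates, key=lambda cmd: len(COVERS[cmd]))

print("dbt", pick_command({"model"}))          # models only
print("dbt", pick_command({"model", "test"}))  # models plus their tests
```

The first call picks run and the second picks build, since build is the only single command that covers both models and tests.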