Using dbt in Snowflake

Set up a dbt project, Group, Warehouse and Service Account in Snowflake.

August 31, 2023

By Xavier de Boisredon

Dbt (data build tool) is a command-line tool that enables data analysts and engineers to transform and model data in the data warehouse. It's like the "T" in "ETL" but for the modern data stack. Dbt does the "transform" part by enabling you to write, document, and test SQL-based data transformation workflows. It does not extract or load data, but it’s extremely good at transforming the data that’s already loaded into your warehouse.

Snowflake is a data warehouse built on top of the Amazon Web Services (AWS) cloud infrastructure. It's designed to be easy-to-use, scalable, and flexible.

To install and configure dbt in Snowflake, follow these steps:

Step-by-Step Guide to Use dbt In Snowflake

Step 1: Install dbt

You can install dbt by using pip, which is the package installer for Python. You can install dbt by running the following command in your terminal:

‍

Step 2: Create a dbt Project

After installing dbt, you can create a new dbt project by running the following command in your terminal:

This will create a new directory called my_project with a basic dbt project structure.

‍

Step 3: Configure your dbt Profile

Dbt uses a file called profiles.yml for connection configurations. This file is usually located in the ~/.dbt directory.

Here is an example profiles.yml file for snowflake:

Replace the placeholder values with your own Snowflake credentials.

‍

Step 4: Edit your dbt_project.yml File

Edit the dbt_project.yml file in your dbt project directory. This file is used to configure your dbt project.

Here is an example dbt_project.yml file:

Example of a dbt_project.yml file in your dbt project directory to use dbt in Snowflake.

‍

Step 5: Run dbt Commands

Now you can run dbt commands to interact with your Snowflake database. Here are some common dbt commands:

dbt run: This command will run all models in your project.
dbt test: This command will run tests on your models.
dbt seed: This command will load CSV files into your database.
dbt docs generate: This command will generate documentation for your project.

Remember to always run dbt debug to check your connection before running any other dbt commands.

That's it! You have successfully installed and configured dbt Python in Snowflake.

‍

Snowflake Best Practices for dbt

Before diving into dbt, it is crucial to ensure that we are adhering to the best practices in Snowflake. There are a few preliminary configurations to set up before starting with dbt.

First, it is highly advised to configure your credentials to use a key-pair authentication mechanism (we will discuss other configurations later). This is more secure than using a username and password. Second, it is recommended to create a dedicated group and warehouse for dbt. This ensures that the dbt group has the necessary permissions to create and read various objects. Additionally, having a dedicated warehouse ensures that dbt transformations are isolated in terms of compute and memory resources. This isolation helps in monitoring and managing performance expectations.

Lastly, if you plan to automate dbt tasks, it is wise to create a dedicated dbt service account. This is important to decouple the automation from an employee's account, which may become inactive if the employee leaves the organization. Using a service account ensures that the automation continues to run smoothly even if an employee departs.

Setting Up Key-Pair Authentication

Snowflake supports key-pair authentication, which is more secure than using a username and password. Here are the steps to set up key-pair authentication:

Generate a key pair.
Upload the public key to Snowflake.
Configure the dbt profile to use the private key.

Generate a Key Pair

You can generate a key pair using the openssl command-line tool. Run the following commands in your terminal:

Upload the Public Key to Snowflake

Log in to the Snowflake web interface and navigate to the 'Users' tab. Select the user you want to configure and add the public key to the 'RSA Public Key' field.

Configure the dbt Profile

Edit your profiles.yml file to configure the dbt profile to use the private key. Here is an example profiles.yml file that uses key-pair authentication:

Here is an example **profiles.yml** file that uses key-pair authentication. Replace the placeholder values with your own Snowflake configuration.

‍

Creating a dbt Group and Warehouse

It is a good practice to create a dedicated group and warehouse for dbt in Snowflake. This ensures that the dbt group has the necessary permissions to create and read various objects, and that dbt transformations are isolated in terms of compute and memory resources.

Create a dbt Group

Log in to the Snowflake web interface and navigate to the 'Roles' tab. Create a new role called 'DBT_ROLE' and grant the necessary permissions to this role.

‍

Create a dbt Warehouse

Navigate to the 'Warehouses' tab and create a new warehouse called 'DBT_WAREHOUSE'. Configure the warehouse size and other settings as needed.

‍

Creating a dbt Service Account

If you plan to automate dbt tasks, it is recommended to create a dedicated dbt service account. This ensures that the automation is not tied to an employee's account and continues to run smoothly even if an employee leaves the organization.

Create a dbt Service Account

Log in to the Snowflake web interface and navigate to the 'Users' tab. Create a new user called 'DBT_SERVICE_ACCOUNT' and configure the user with the necessary settings.

‍

Grant Permissions to the dbt Service Account

Grant the necessary permissions to the dbt service account.

GRANT ROLE DBT_ROLE TO USER DBT_SERVICE_ACCOUNT

Conclusion

By following these best practices, you can ensure that your dbt setup in Snowflake is secure, scalable, and ready for automation. Remember to always use key-pair authentication, create a dedicated group and warehouse for dbt, and use a dedicated service account for automation.

‍

New Release

Table of Contents

Why Look for Atlan Alternative?

Resources

Xavier de Boisredon

June 8, 2023

What are dbt Tags?

Explore the world of dbt Tags with our in-depth analysis, covering everything from their definition to their real-world applications in data processing. Discover how these identifiers enhance data management, streamline operations, and bolster project documentation. Whether you're new to Data Build Tools or looking to optimize your existing processes, our comprehensive guide on 'What are dbt Tags?' will be your go-to resource.

Learn more

Dani Kellogg

February 16, 2023

Now Live: Sync Back to dbt

Sync Back to dbt is Castor's newest feature, it helps you easily keep your data’s documentation close to your code. Get started today!

Learn more

Xavier de Boisredon

July 28, 2023

Top 5 dbt utils use-cases

Learn more

Get in Touch to Learn More

See Why Users Love CastorDoc

Fantastic tool for data discovery and documentation

“[I like] The easy to use interface and the speed of finding the relevant assets that you're looking for in your database. I also really enjoy the score given to each table, [which] lets you prioritize the results of your queries by how often certain data is used.” - Michal P., Head of Data

Step-by-Step Guide to Use dbt In Snowflake

Step 1: Install dbt

Step 2: Create a dbt Project

Step 3: Configure your dbt Profile

Step 4: Edit your dbt_project.yml File

Step 5: Run dbt Commands

Snowflake Best Practices for dbt

Setting Up Key-Pair Authentication

Generate a Key Pair

Upload the Public Key to Snowflake

Configure the dbt Profile

Creating a dbt Group and Warehouse

Create a dbt Group

Create a dbt Warehouse

Creating a dbt Service Account

Create a dbt Service Account

Grant Permissions to the dbt Service Account

Conclusion

You might also like

Get in Touch to Learn More