Data Transformation using dbt Cloud

dbt Cloud is a hosted, enterprise-ready platform that simplifies managing and running transformation pipelines and provides visibility and governance around them. Integrating dbt Cloud with source control repositories simplifies version control of dbt code. This makes collaboration within the development team easier, while CI/CD automation enables testing the dbt code before it is pushed to production.

Calibo Accelerate supports data transformation using dbt Cloud. You can add dbt as a standalone node in the data transformation stage, or you can connect it to a data lake and create a pipeline. In the standalone mode, you can import dbt projects into the platform and either run existing jobs or create and run new jobs. If you connect dbt to a data lake and create a pipeline, you can create a new job and run it.

Before you create a dbt transformation job, ensure that the dbt Cloud connection details are added to the Calibo Accelerate platform. See Configure dbt Cloud Connection Details.

To create a dbt transformation job using a standalone node

Sign in to the Calibo Accelerate platform and navigate to Products.
Select a product and feature. Click the Develop stage of the feature and navigate to Data Pipeline Studio.
Add a data transformation stage and add a dbt Cloud node to the stage.
Click the dbt Cloud node. You can either run an existing job or create a new one.
To run an existing job, follow the instructions under "To run an existing job". To create a new job, follow the instructions under "To create a new job".
To run an existing job, do the following:
Job Name
- Existing Projects - Select the project that contains the job that you want to run.
- Existing jobs - Select the job that you want to run.
- Environment - Select the environment in which you want to run the job. This field is auto populated and is non-editable.
- Node Rerun Attempts - Specify the number of times the pipeline reruns this node if it fails. The default is set at the pipeline level. You can override it for this node by selecting 1, 2, or 3.
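The rerun behavior described for Node Rerun Attempts can be pictured with a minimal Python sketch. This is not the platform's implementation: the function name `run_with_reruns` and the callable `run_node` are hypothetical, and the sketch assumes that the initial run is not counted as a rerun.

```python
from typing import Callable

ALLOWED_RERUN_ATTEMPTS = (1, 2, 3)  # values offered in the node settings


def run_with_reruns(run_node: Callable[[], bool], rerun_attempts: int) -> int:
    """Run a node once, then rerun it on failure up to `rerun_attempts` times.

    Returns the total number of executions and raises RuntimeError if every
    execution fails. Assumption: the initial run is not counted as a rerun.
    """
    if rerun_attempts not in ALLOWED_RERUN_ATTEMPTS:
        raise ValueError(f"rerun_attempts must be one of {ALLOWED_RERUN_ATTEMPTS}")
    executions = 0
    for _ in range(1 + rerun_attempts):  # one initial run plus the reruns
        executions += 1
        if run_node():  # True means the node succeeded
            return executions
    raise RuntimeError(f"node failed after {executions} executions")


# Hypothetical node that fails twice and then succeeds.
outcomes = iter([False, False, True])
print(run_with_reruns(lambda: next(outcomes), rerun_attempts=2))  # prints 3
```

With two rerun attempts, the node in this example succeeds on its third execution. A node that never succeeds would stop after the last rerun and report the failure.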
Database Connections

The Warehouse Name and the Database Name are auto populated and are non-editable.

Repository Name

The Repository Name associated with the existing job is displayed along with the Repository Path. Both the fields are auto populated and are non-editable.
Execution Settings
In the Commands section, do the following:
- Run Source Freshness - Enable this setting to run the dbt source freshness check, which verifies whether the data in your source tables has been refreshed or updated recently.
- Click the Delete icon to delete the commands that you want to remove for this job.
- Rearrange the sequence of commands by dragging a command by its handle and dropping it in the required position.
- + Add Command - Click this option to add new commands to the existing job.
- Generate Docs on Run - Enable this option to automatically generate documentation when a command is executed. The documentation includes schema structure, test results, and versioning information.
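The way these execution settings combine into one run can be sketched as follows. The function `plan_job_steps` is hypothetical, and the ordering is an assumption based on how dbt Cloud jobs generally behave: the source freshness check runs first, the commands run in the order you arranged them, and documentation is generated last.

```python
from typing import List


def plan_job_steps(
    commands: List[str],
    run_source_freshness: bool = False,
    generate_docs: bool = False,
) -> List[str]:
    """Return the dbt commands for one job run, in execution order.

    Assumption: freshness is checked before the commands and docs are
    generated after them; the platform may order these steps differently.
    """
    steps: List[str] = []
    if run_source_freshness:
        steps.append("dbt source freshness")
    steps.extend(commands)  # keep the user-defined sequence
    if generate_docs:
        steps.append("dbt docs generate")
    return steps


print(plan_job_steps(["dbt run", "dbt test"], run_source_freshness=True, generate_docs=True))
# prints ['dbt source freshness', 'dbt run', 'dbt test', 'dbt docs generate']
```

Rearranging the commands in the Commands section corresponds to changing the order of the `commands` list in this sketch.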
Advanced Settings
- Environment Variables - Development - For an existing job this field is auto populated and is non-editable.
- Target Name - If the logic in your dbt project changes depending on the target name, provide a target name. Otherwise, keep the default value.
- Run Timeout - This is the maximum number of seconds for which the run will be executed before being canceled. If it is set to 0 (default setting), the job run is canceled after running for 24 hours.
- dbt Version - The default setting is Latest. Change it as required.
- Threads - This is the number of parallel threads that dbt can use to run multiple models simultaneously, while running a job.
Click Complete.
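The Run Timeout rule, where 0 means that the run is canceled after 24 hours, amounts to a small piece of arithmetic. The following Python sketch illustrates it; the helper name `effective_timeout_seconds` is hypothetical and is not part of the platform or of dbt Cloud.

```python
DEFAULT_TIMEOUT_SECONDS = 24 * 60 * 60  # 24 hours, applied when Run Timeout is 0


def effective_timeout_seconds(run_timeout: int) -> int:
    """Return the number of seconds after which a job run is canceled."""
    if run_timeout < 0:
        raise ValueError("Run Timeout cannot be negative")
    # 0 is the default setting and stands for the 24-hour limit.
    return DEFAULT_TIMEOUT_SECONDS if run_timeout == 0 else run_timeout


print(effective_timeout_seconds(0))    # prints 86400
print(effective_timeout_seconds(900))  # prints 900
```

For example, a Run Timeout of 900 cancels a run that is still executing after 15 minutes.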
To create a new job, do the following:
Job Name
- Existing Projects - Select a project to which you want to add the new dbt job.
- Job Name - Provide a name for the job.
- Description - Provide a description for the job.
- Environment - Select the environment in which you want to run the job. The environments are created in each dbt project.
- Node Rerun Attempts - Specify the number of times the pipeline reruns this node if it fails. The default is set at the pipeline level. You can override it for this node by selecting 1, 2, or 3.
Database Connections
- Warehouse Name - This field is auto populated and is non-editable.
- Database Name - This field is auto populated and is non-editable.
Repository Name

The Repository Name associated with the selected project is displayed along with the Repository Path. Both the fields are auto populated and are non-editable.
Execution Settings
In the Commands section, do the following:
- Run Source Freshness - Enable this setting to run the dbt source freshness check, which verifies whether the data in your source tables has been refreshed or updated recently.
- + Add Command - Click this option to add commands to the new job.
- Generate Docs on Run - Enable this option to automatically generate documentation when a command is executed. The documentation includes schema structure, test results, and versioning information.
Advanced Settings
- Target Name - If the logic in your dbt project changes depending on the target name, provide a target name. Otherwise, keep the default value.
- Run Timeout - This is the maximum number of seconds for which the run will be executed before being canceled. If it is set to 0 (default setting), the job run is canceled after running for 24 hours.
- dbt Version - The default setting is Latest. Change it to one of the following as required:
  - Inherited
  - Compatible
  - Extended
- Threads - This is the number of parallel threads that dbt can use to run multiple models simultaneously, while running a job.
Click Complete.
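The fields collected for a new job can be summarized as a single record. The following Python sketch is illustrative only: the class `NewDbtJob`, its field names, and its validation messages are hypothetical, and the defaults for rerun attempts, commands, and threads are assumptions rather than values taken from the platform.

```python
from dataclasses import dataclass, field
from typing import List

DBT_VERSIONS = ("Latest", "Inherited", "Compatible", "Extended")
RERUN_ATTEMPTS = (1, 2, 3)


@dataclass
class NewDbtJob:
    project: str       # Existing Projects
    job_name: str      # Job Name
    environment: str   # Environment defined in the dbt project
    description: str = ""
    node_rerun_attempts: int = 1  # assumed default; set at the pipeline level
    commands: List[str] = field(default_factory=lambda: ["dbt build"])  # assumed
    run_source_freshness: bool = False
    generate_docs_on_run: bool = False
    target_name: str = "default"
    run_timeout: int = 0          # 0 means the 24-hour limit applies
    dbt_version: str = "Latest"
    threads: int = 4              # assumed default

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the settings look valid."""
        problems: List[str] = []
        if not self.job_name.strip():
            problems.append("Job Name is required")
        if self.node_rerun_attempts not in RERUN_ATTEMPTS:
            problems.append("Node Rerun Attempts must be 1, 2, or 3")
        if self.dbt_version not in DBT_VERSIONS:
            problems.append("dbt Version must be one of: " + ", ".join(DBT_VERSIONS))
        if self.run_timeout < 0:
            problems.append("Run Timeout cannot be negative")
        if self.threads < 1:
            problems.append("Threads must be at least 1")
        if not self.commands:
            problems.append("At least one command is required")
        return problems


job = NewDbtJob(project="analytics", job_name="nightly_build", environment="Production")
print(job.validate())  # prints []
```

Warehouse Name, Database Name, and the repository details are omitted from the sketch because they are auto populated and cannot be edited.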
To run the dbt job, do one of the following:

- Publish the pipeline, and then click Run Pipeline on the home page of Data Pipeline Studio (DPS).
- Publish the pipeline, click the dbt node, and then click Start in the side drawer.

What's next? Snowflake Custom Transformation Job