Hiflylabs/awesome-dbt

Awesome dbt

Welcome to the awesome curated list of dbt resources!

Contributions of any kind are encouraged and appreciated. Before contributing, please check the contribution guidelines. Add new entries at the top of each section (LIFO) to keep fresh items more visible. Feel free to add new sections as well.

Happy contributing!

Contents

Get Started

Courses for getting started with Analytics Engineering.

How To

A helping hand for setting up integrations and implementing best practices.

Integrations

Collection of known data integrations with dbt.

  • dbt-streamdeck - Stream Deck plugin enables you to view the status of models and jobs as actions in your Stream Deck.
  • Auto Alert - diqu - Automate and streamline the alerting/notification process for dbt test results using this versatile CLI companion tool. Receive detailed alerts and test metadata on various platforms, promoting improved collaboration on dbt project issues 🐞🚀.
  • Tabula - Tabula is an end-to-end automation platform for data management tasks.
  • modal-dbt - Sample code for running dbt jobs/actions with Modal, a serverless application framework.
  • Grai - Expose warehouse dbt tests in CI to upstream data consumers so production changes never break the warehouse.
  • Datafold - Gives a quick print out summary of changes so you can move fast and (not) break stuff!
  • Raycast dbt Metadata - Queries the dbt Cloud API to return useful information about your models (number of tests, time they took to run, etc.).
  • Cube - APIs, Caching, and Access Control on top of dbt Metrics.
  • FlexIt Analytics - Business Intelligence platform with deep dbt Cloud and CLI integration.
  • Raycast dbt Jobs - Raycast integration to monitor dbt Cloud Jobs.
  • Metaplane - Data Observability layer on top of your dbt + BI project.
  • Dbt + Machine Learning: What makes a great baton pass? - Landscape of ML utilities around dbt.
  • Soda - Integration of Soda's data observability platform and dbt.
  • Supported Adapters - Officially supported database adapters.
  • Lightdash - Open source Looker alternative with deep dbt integration.
  • Superset - Open source visualization layer for your Modern Data Stack.
  • Dagster and dbt: Better Together - Getting started with the dagster-dbt library.
  • fal - Add multi-language support (Python) to your dbt project.
  • Privacy Dynamics - Anonymize data in your dbt project.
  • prefect-dbt - Collection of Prefect integrations for working with dbt in your Prefect flows.
  • Acryl DataHub - A unified data catalog, governance, and observability platform. Use it to view models, docs, test results, and column-level lineage across your dbt projects and downstream dashboards.

User Stories

Use cases and user stories implemented by community members using components of the MDS with dbt.

Data Quality

Best practices and extensions of the testing framework.

  • dq-tools - Simplifies storing test results and visualising them in a BI dashboard built on 6 Data Quality KPIs.
  • BigQuery Stale data detection - Stale data detection with dbt and BigQuery dataset metadata.
  • PipeRider - PipeRider allows you to define the shape of your data once, and then use the data checking functionality to alert you to changes in your data quality.
  • Elementary - A dbt package that provides data anomaly detection as dbt tests.
  • Environment-dependent Unit Testing in dbt - Guide on how to run unit tests in dbt dynamically.
  • dbt-expectations - Port between dbt and great_expectations to extend out-of-the-box tests.
  • re_data - A dbt package for monitoring metrics and detecting anomalies.
  • How do you test your data - Suggestions on testing your data powered by the community.
  • How to unit test sql transforms in dbt - Unit testing using source defer and custom generic tests.
  • DataKitchen Open Source Data Observability - Data breaks. Servers break. dbt and other tools break. Observability and alerting across and down your data estate. Save time with simple, fast data quality test generation and execution.

CI/CD

Get the most out of your product quality and deliver seamlessly.

Orchestration

Resources to manage and maintain dependencies in modern data pipelines.

Utilities

Useful tools and extensions to boost your analytics engineering workflow.

  • dbt-column-lineage-extractor - Extract column-level lineage from dbt projects.
  • tdb - A sweet and speedy code generator for dbt.
  • diff2docs - Turn your diff into docs with the help of GPT-4o.
  • dbt-command-center - A local web application that provides a user-friendly interface to monitor and manage dbt runs.
  • dbt-score - Linter for dbt metadata.
  • dbt-llm-tools - RAG based LLM chatbot for dbt projects.
  • turboYAML - An AI-powered CLI tool for converting dbt SQL files to YAML using OpenAI.
  • datapilot - AI teammate for engineers to ensure best practices in their SQL.
  • dbt-exposures-crawler - Automate the creation of dbt exposures from different sources.
  • dlt (data load tool) - The open-source Python library for data loading.
  • Turntable VSCode extension - A handy docs composer and column-level lineage.
  • dbt-loom - A dbt-core plugin to weave together multi-project dbt-core deployments.
  • dbt-meshify - A dbt-core plugin that automates the management and creation of dbt groups, contracts, access, and versions.
  • dbot - An LLM-powered chatbot with the added context of the dbt knowledge base.
  • dbt-lineagex - A Column Level Lineage Graph for dbt.
  • Jinjat - Low-code application framework that turns your dbt projects into web apps.
  • fst: flow state tool - A tool to help you stay in flow state while developing dbt models.
  • dbt_tld - A self-updating dbt library that will maintain a list of current IANA/ICANN recognized top level domains.
  • dbt-model-finder - A Streamlit web app to find currently running dbt models.
  • dbtc Explorer - A Streamlit web app to explore the dbt Cloud API.
  • dbt-feature-flags - Feature Flags in dbt models.
  • dbtpal - A Neovim plugin for dbt model editing.
  • cookiecutter-dbt - Cookiecutter template for dbt projects.
  • turbovault4dbt - TurboVault4dbt is an open source tool that automatically generates dbt models according to datavault4dbt-templates.
  • dbtvault-generator - Generate DBT Vault files from yml metadata (supporting dbtvault package).
  • dbt-container-skeleton - All the basics to get a nice containerized dbt development environment.
  • oliver-twist - DAG auditing tool that audits the DBT DAG and generates a summary report.
  • dbt-sql-formatter - Makes your sql less bad.
  • dbterd - CLI to generate DBML file from dbt manifest.json.
  • dbt-cue - Generate dbt yml files using the CUE language.
  • dbt-artifacts-parser - Work with catalog.json, manifest.json, run-results.json and sources.json as Python objects.
  • GitHub Action: Cancel Running CI Job - Keeps the newest code commit running in the CI job without having to wait for stale job runs to finish.
  • dbtc - Unaffiliated Python interface to various dbt Cloud API endpoints.
  • dbt-osmosis - Enhance the developer experience significantly with workbench, output diffs, and YAML management.
  • pytest-dbt-core - Pytest dbt core is a pytest plugin for testing your dbt projects.
  • looker-gen - Generate lookml from dbt.
  • dbtenv - A version manager for dbt.
  • sqlfmt - This tool formats your dbt SQL code so you don't have to.
  • SQLFluff - SQL linter that supports dbt and Jinja templating.
  • Build Data Access Layer on dbt - Package to build GraphQL API on top of your dbt project.
  • Run changed models based on Git status - Handy bash function to run changed models since last commit.
  • How we set up our computers for working on dbt projects - Things I wish I had known when I started working with dbt. Tools and hacks to improve the developer experience.
  • fzf-dbt - Search dbt models interactively from terminal.
  • vscode-dbt-power-user - VSCode extension to give more clarity on model dependencies.
  • dbt-toolkit - Jetbrains IDE plugin for dbt lineage and more.
  • Your Essential dbt Project Checklist - Checklist on items necessary for a successful dbt project.
  • dbt Style Guide - Development style guide often referenced in PR templates.
  • Clean your warehouse of old and deprecated models - Remove warehouse models that no longer exist in the project.
  • dbt-tips - Excellent companion to your dbt practice with rich collection of tips.
  • Understanding the scopes of dbt tags - Understanding the scopes of dbt tags.
  • Pre-commit hooks - Pre-commit hooks for checking data integrity before committing schema changes.

Packages

Community-developed packages to extend default macros and toolset.

  • dbt-snow-mask - A dbt package for Snowflake Dynamic Data Masking.
  • dbt-incremental-stream - A dbt package for Snowflake Streams.
  • dbt-dag-monitoring - A dbt package for monitoring airflow DAGs and tasks.
  • dbt-data-diff - Data-diff solution for dbt-ers with Snowflake.
  • dbt-tags - Tag-based masking policies management in Snowflake.
  • dbt-testgen - Generate dbt tests based on sample data.
  • dbt_otel_export - Takes dbt runs and turns them into OpenTelemetry traces.
  • dbt-assertions - Package to assert rows in-line with dbt macros.
  • dbt-ibis - Write your dbt models using Ibis, the portable Python dataframe library.
  • dbt-timescaledb - The TimescaleDB adapter plugin for dbt.
  • dbt-fake - Daily updated fake data for dbt learning and projects.
  • dbt_cloud_run_cost - Package to calculate dbt Cloud usage-based cost.
  • dbt-reconfigured - A dbt package containing reconfigured macros.
  • dbt-census-utils - A collection of dbt macros for working with Census data.
  • dbt-fabric - A dbt adapter for working with Microsoft Fabric Data Warehouses.
  • dbt-fabricspark - A code package from Microsoft for enabling dbt to work with Synapse Spark in Microsoft Fabric.
  • dbt-fabricsparknb - dbt Fabric Spark Notebook Generator by Insight, based on/forked from dbt-fabricspark by Microsoft.
  • dq-vault - Data Quality Observation of Data Vault layer.
  • dbt-translate - Translate numbers into words.
  • dbt-excel - A dbt adapter for working with Excel.
  • dbt_linreg - Linear regression in SQL using dbt.
  • dbt-snowflake-query-tags - Automatically tag dbt-issued queries with informative metadata.
  • snowflake-resource-monitoring - Yet another package to monitor Snowflake usage.
  • usagedata - Provides insights into database/table-level usage information from Snowflake.
  • dbt_ml - Package for dbt that allows users to train, audit and use BigQuery ML models.
  • ddbt - An attempt to build a fast version of dbt, which gets very slow on large projects (3000+ data models). The project aims to be a direct drop-in replacement for dbt at the command line.
  • dbt-snowflake-monitoring - A dbt package to help you monitor Snowflake performance and costs.
  • datavault4dbt - Macros for staging and creating all the Data Vault entities you need to build your own Data Vault 2.0 solution.
  • DDO - Perform DataOps & administrative CI/CD on your data warehouse.
  • dbt-yaml-check - Checks that columns defined in YAML also exist in SQL.
  • data-diff - A command-line tool and Python library to efficiently diff rows across two different databases.
  • dbt-project-evaluator - This package highlights areas of a dbt project that are misaligned with dbt Labs' best practices.
  • dbt_constraints - Generate database constraints based on the tests in a dbt project.
  • dbt-date - Date logic and calendar functionality.
  • dbt-privacy - Macros to make it easier to protect your customers' data.
  • dbt-fivetran-utils - General macros and helpers.
  • dbt_metrics - Macros to support secondary calculations and generate business metrics.
  • dbt-metabase - Model synchronization from dbt to Metabase.
  • dbt-coves - CLI tool for generating a scaffold for your dbt project.
  • dbt-profiler - Data profiling and doc block generator.
  • dbt_utils - General macros library. A must-have.
  • dbt_audit_helper - Macros for data audits that compare column values and schemas between tables.
  • dbt-ml-preprocessing - A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros.
  • dbt-external-tables - Macros to stage your external sources.
  • dbt-feature-store - Macros to build a feature store right within your dbt project.
  • dbt-codegen - Macros that generate dbt code, and log it to the command line.
  • dbt-init - Create a project and populate as much of the dbt project as possible.
  • dbt-artifacts - This package builds a mart of tables from dbt artifacts loaded into a table.
  • dbt-erdiagram-generator - This package generates ERD diagrams from a dbt project.
  • Terraform-dbt Cloud Module - IAC in dbt Cloud via Terraform.
  • dbt2looker - Generate Looker views for dbt models.
  • dbt-coverage - Checks dbt docs and tests coverage.
  • dbt-meta-testing - Yet another coverage testing.
  • dbt-superset-lineage - Push and pull metadata between dbt and Superset.
  • dbtvault - Package for generating and executing ETL for Data Vault 2.0.
  • dbt-invoke - CLI for creating, updating, and deleting dbt property files.
  • dbt-unit-testing - Package which contains macros to support unit testing.

Snippets

Useful code snippets and templates to speed up your dbt development.

Community

Conferences, meetups, discussions, newsletters, podcasts, etc. led by fellow analytics engineers, and forums for getting in contact.

Sample Projects

Sample projects that work out of the box and reflect publicly available use cases.

Contributors

Thanks for all the great resources! Can't see your avatar? Check the contribution guide to see how you can submit your resources to the community!