This repo contains a scuffed version of jaffle_shop, a fictional ecommerce store. This project will be used to test your refactoring skills. In a nutshell: this dbt project transforms raw data from an app database into models ready for analytics.
However, the project is not well-structured, and the code is not very readable. Your task is to refactor the code to make it more readable and maintainable.
Things to consider:
- Warehouse layers
- Code readability
- Testing
The project contains seeds that include some (fake) raw data from a fictional app, along with some basic dbt models, tests, and docs for this data.
Prerequisites: Python >= 3.8
- Install the project in a virtual environment using your favorite Python environment management tool (uv, pipenv, poetry, venv, ...)
- (uv run) dbt build
- (uv run) dbt docs generate
- (uv run) dbt docs serve
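The optional `uv run` prefix in the steps above can be chosen once and reused. The following is a minimal sketch, assuming that uv manages the environment whenever it is installed; `dbt_cmd` is a hypothetical helper name and not part of this project. The loop only prints the commands, it does not run them.

```shell
# Sketch: decide once how dbt should be invoked.
# `dbt_cmd` is a hypothetical helper, not something this project defines.
dbt_cmd() {
  if command -v uv >/dev/null 2>&1; then
    echo "uv run dbt"   # assumption: uv manages the environment when present
  else
    echo "dbt"          # otherwise rely on an already activated environment
  fi
}

# Print the three steps from the list above with the chosen prefix.
for step in "build" "docs generate" "docs serve"; do
  echo "$(dbt_cmd) $step"
done
```

If you use pipenv, poetry, or a plain venv instead, activate that environment first and call `dbt` directly.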
- Ensure your profile is set up correctly from the command line:
dbt --version
dbt debug
- Load the CSVs with the demo data set, run the models, and test the output of the models using the dbt build command:
dbt build
- Query the data:
Launch a DuckDB command-line interface (CLI):
duckcli jaffle_shop.duckdb
Run a query at the prompt and exit:
select * from customers_with_order_info where customer_id = 42;
exit;
Alternatively, use a single-liner to perform the query:
duckcli jaffle_shop.duckdb -e "select * from customers_with_order_info where customer_id = 42"
or:
echo 'select * from customers_with_order_info where customer_id = 42' | duckcli jaffle_shop.duckdb
- Generate and view the documentation for the project:
dbt docs generate
dbt docs serve
- Load the CSVs with the demo data set. This materializes the CSVs as tables in your target schema. Note that a typical dbt project does not require this step since dbt assumes your raw data is already in your warehouse.
dbt seed
- Run the models:
dbt run
NOTE: If you decide to run this project in your own data warehouse (outside of this DuckDB demo) and steps fail, it might mean that you need to make small changes to the SQL in the models folder to adjust for the flavor of SQL of your target database. Definitely consider this if you are using a community-contributed adapter.
- Test the output of the models using the test command:
dbt test
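The individual steps above can be chained into one script. This is a minimal sketch rather than part of the project: `DRY_RUN`, `run_step`, and `run_workflow` are hypothetical names introduced only for this example. By default it prints each command instead of running it, so it is safe to try before your profile is configured.

```shell
# Sketch of the step-by-step workflow as a single script.
# DRY_RUN, run_step and run_workflow are hypothetical names used only here.
DRY_RUN="${DRY_RUN:-1}"  # 1 = print each command, anything else = run it

run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run_workflow() {
  run_step dbt debug &&        # check the profile and connection
  run_step dbt seed &&         # load the CSVs into the target schema
  run_step dbt run &&          # build the models
  run_step dbt test &&         # test the output of the models
  run_step dbt docs generate   # build the documentation
}

run_workflow
```

Setting `DRY_RUN=0` in the environment runs the commands for real, stopping at the first step that fails. Note that `dbt build` already covers the seed, run, and test steps in one command; the separate steps are mainly useful when you want to inspect the state between them while refactoring.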
Some options: