Can I test the uniqueness of two columns?
Yes, There's a few different options.
Consider an orders tableIn simplest terms, a table is the direct storage of data in rows and columns. Think excel sheet with raw values in each of the cells. that contains records from multiple countries, and the combination of ID and country code is unique:
order_id | country_code |
---|---|
1 | AU |
2 | AU |
... | ... |
1 | US |
2 | US |
... | ... |
Here are some approaches:
1. Create a unique key in the model and test that
models/orders.sql
select
country_code || '-' || order_id as surrogate_key,
...
models/orders.yml
version: 2
models:
- name: orders
columns:
- name: surrogate_key
tests:
- unique
2. Test an expression
models/orders.yml
version: 2
models:
- name: orders
tests:
- unique:
column_name: "(country_code || '-' || order_id)"
3. Use the dbt_utils.unique_combination_of_columns
test
This is especially useful for large datasets since it is more performant. Check out the docs on packages for more information.
models/orders.yml
version: 2
models:
- name: orders
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- country_code
- order_id
0