dlt source for notion.com
Currently loads the following data:
Table | Contains |
---|---|
`persons` | Items of the user model of type person |
`bots` | Items of the user model of type bot |
`database_<name>_<shorthash>` | The database content (pages) of a given database. Columns are all global page columns + selected columns of properties |
`databases` | All metadata of each loaded database. Has the title and a reference to the database table as well. |
`databases__properties` | Column <-> Label mapping for each unfiltered database property |
`options_<name>_<shorthash>` | All options of a select or multi_select in one of the loaded databases. |
`<shorthash>` refers to the hashed id of an entity. The hash is a 4-byte BLAKE2b hash.
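For illustration, here is a minimal sketch of how such a short hash could be derived from an entity id with Python's `hashlib`; the digest size, encoding, and the example table name are assumptions, not taken from this source's code:

```python
import hashlib


def shorthash(entity_id: str) -> str:
    # Illustrative only: a 4-byte BLAKE2b digest of the entity id,
    # rendered as 8 hex characters. The actual scheme used by this
    # source may differ.
    return hashlib.blake2b(entity_id.encode("utf-8"), digest_size=4).hexdigest()


# Hypothetical table name for a database called "Tasks":
print(f"database_tasks_{shorthash('d9824bdc-8445-4327-be8b-5b47500af6ce')}")
```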
The official verified source has a few drawbacks:

- On usage of the verified source, a copy of the current state of the dlt-hub/verified-sources repository is copied into your project. Once you make changes to it, it effectively becomes a fork, making it hard to update after the fact.

This source, in contrast:

- makes use of a preexisting client implementation
- and Pydantic types for Notion entities, which makes this implementation a lot more stable
Create a `.dlt/secrets.toml` with your API key:

```toml
notion_token = "ntn_abcd12345"
```
and then run the default source:
```python
import dlt

from dlt_source_notion import source as notion_source

pipeline = dlt.pipeline(
    pipeline_name="notion_pipeline",
    destination="duckdb",
    dev_mode=True,
)

notion_data = notion_source()
pipeline.run(notion_data)
```
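If you only need a subset of the data, dlt's standard resource selection can be applied to the source before running it. A minimal sketch, assuming the resource names match the table overview above:

```python
# Load only the user-model data; the resource names "persons" and
# "bots" are assumptions based on the table overview above.
pipeline.run(notion_source().with_resources("persons", "bots"))
```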
This project uses devenv.
Commands:
Command | What does it do? |
---|---|
`format` | Formats & lints all code |
`sample-pipeline-run` | Runs the sample pipeline. By default `dev_mode=True`, which fetches resources with a limit of 1 (page) |
`sample-pipeline-show` | Starts the streamlit-based dlt hub |
To run the sample pipeline, provide your token via the environment:

```shell
NOTION_TOKEN=[...] \
sample-pipeline-run
```
Alternatively, you can create a `.dlt/secrets.toml` (excluded from git) with the following content:

```toml
notion_token = "ntn_abcd12345"
```