Metabase
Important Capabilities
Capability | Status | Notes |
---|---|---|
Platform Instance | ✅ | Enabled by default |
This plugin extracts charts, dashboards, and associated metadata. It is in beta and has only been tested against PostgreSQL and H2 databases.
Dashboard
The /api/dashboard endpoint is used to retrieve the following dashboard information:
- Title and description
- Last edited by
- Owner
- Link to the dashboard in Metabase
- Associated charts
Chart
The /api/card endpoint is used to retrieve the following chart information:
- Title and description
- Last edited by
- Owner
- Link to the chart in Metabase
- Datasource and lineage
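As a rough illustration of how the /api/dashboard and /api/card endpoints can be queried, the sketch below builds (but does not send) authenticated requests for both. It assumes Metabase's session-based authentication, where a token obtained from POST /api/session is passed in the X-Metabase-Session header. The helper names, the example host, and the token are hypothetical and are not taken from the plugin's source.

```python
import json
import urllib.request


def build_login_request(connect_uri: str, username: str, password: str) -> urllib.request.Request:
    """Build the POST /api/session request that exchanges credentials for a session token."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_uri.rstrip('/')}/api/session",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_get_request(connect_uri: str, path: str, session_id: str) -> urllib.request.Request:
    """Build an authenticated GET request for an endpoint such as /api/dashboard or /api/card."""
    return urllib.request.Request(
        url=f"{connect_uri.rstrip('/')}/{path.lstrip('/')}",
        headers={"X-Metabase-Session": session_id},
        method="GET",
    )


# Hypothetical host and token, for illustration only.
dashboards_req = build_get_request("http://localhost:3000", "/api/dashboard", "example-token")
cards_req = build_get_request("http://localhost:3000", "/api/card", "example-token")
print(dashboards_req.full_url)
print(cards_req.full_url)
```

Passing either request to urllib.request.urlopen would perform the call. Note that this sketch expects the host to include a scheme such as http://.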
The following chart properties are ingested into DataHub:
Name | Description |
---|---|
Dimensions | Column names |
Filters | Any filters applied to the chart |
Metrics | All columns that are being used for aggregation |
CLI based Ingestion
Install the Plugin
pip install 'acryl-datahub[metabase]'
Config Details
Note that a dot (.) is used to denote nested fields in the YAML recipe.
Field [Required] | Type | Description | Default | Notes |
---|---|---|---|---|
connect_uri | string | Metabase host URL. | localhost:3000 | |
database_alias_map | object | Database name map to use when constructing dataset URN. | | |
default_schema | string | Default schema name to use when the schema is not provided in an SQL query. | public | |
engine_platform_map | map(str,string) | Custom mappings between Metabase database engines and DataHub platforms. | | |
password | string(password) | Metabase password. | | |
platform_instance_map | map(str,string) | A holder for platform -> platform_instance mappings to generate correct dataset URNs. | | |
username | string | Metabase username. | | |
env | string | The environment that all assets produced by this connector belong to. | PROD | |
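To make the dotted notation for nested fields concrete, the sketch below takes a configuration dictionary built from the field names in the table and flattens it into dotted keys. The flatten helper is hypothetical and written only for illustration; the host, credentials, and mapping values are placeholders.

```python
from typing import Any, Dict


def flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested mappings into dotted keys, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested mappings, carrying the dotted path along.
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Placeholder values; only the field names come from the options table.
metabase_config = {
    "connect_uri": "http://localhost:3000",
    "username": "admin",
    "password": "example-password",
    "default_schema": "public",
    "engine_platform_map": {"athena": "glue"},
    "env": "PROD",
}

for dotted_key, value in sorted(flatten(metabase_config).items()):
    print(f"{dotted_key} = {value}")
```

Here the nested entry under engine_platform_map is reported as engine_platform_map.athena, which is how a nested field would be named in the options table.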
The JSONSchema for this configuration is inlined below.
{
  "title": "MetabaseConfig",
  "description": "Any non-Dataset source that produces lineage to Datasets should inherit this class.\ne.g. Orchestrators, Pipelines, BI Tools etc.",
  "type": "object",
  "properties": {
    "env": {
      "title": "Env",
      "description": "The environment that all assets produced by this connector belong to",
      "default": "PROD",
      "type": "string"
    },
    "platform_instance_map": {
      "title": "Platform Instance Map",
      "description": "A holder for platform -> platform_instance mappings to generate correct dataset urns",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    },
    "connect_uri": {
      "title": "Connect Uri",
      "description": "Metabase host URL.",
      "default": "localhost:3000",
      "type": "string"
    },
    "username": {
      "title": "Username",
      "description": "Metabase username.",
      "type": "string"
    },
    "password": {
      "title": "Password",
      "description": "Metabase password.",
      "type": "string",
      "writeOnly": true,
      "format": "password"
    },
    "database_alias_map": {
      "title": "Database Alias Map",
      "description": "Database name map to use when constructing dataset URN.",
      "type": "object"
    },
    "engine_platform_map": {
      "title": "Engine Platform Map",
      "description": "Custom mappings between metabase database engines and DataHub platforms",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    },
    "default_schema": {
      "title": "Default Schema",
      "description": "Default schema name to use when schema is not provided in an SQL query",
      "default": "public",
      "type": "string"
    }
  },
  "additionalProperties": false
}
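Because the schema sets additionalProperties to false at the top level, a configuration containing a misspelled or unknown field should be rejected. A minimal sketch of that check follows, using only the property names listed in the schema. The function name is hypothetical, and this is not a description of how DataHub itself validates configuration.

```python
from typing import Any, Dict, List

# Property names copied from the MetabaseConfig schema.
ALLOWED_FIELDS = {
    "env", "platform_instance_map", "connect_uri", "username",
    "password", "database_alias_map", "engine_platform_map", "default_schema",
}


def unknown_fields(config: Dict[str, Any]) -> List[str]:
    """Return top-level keys that the schema would reject, in sorted order."""
    return sorted(set(config) - ALLOWED_FIELDS)


good = {"connect_uri": "http://localhost:3000", "username": "admin"}
typo = {"connect_url": "http://localhost:3000", "username": "admin"}  # note the misspelling

print(unknown_fields(good))
print(unknown_fields(typo))
```

Running a check like this before ingestion can surface typos such as connect_url early, rather than at pipeline start.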
Metabase databases will be mapped to a DataHub platform based on the engine listed in the api/database response. This mapping can be customized by using the engine_platform_map config option. For example, to map databases using the athena engine to the underlying datasets in the glue platform, the following snippet can be used:
engine_platform_map:
  athena: glue
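The effect of engine_platform_map can be pictured as a dictionary lookup with a fallback. The sketch below is an assumption about the general shape of that logic, not the plugin's actual code: the resolve_platform helper is hypothetical, and falling back to the engine name when no entry exists is an illustrative choice.

```python
from typing import Dict, Optional


def resolve_platform(engine: str, engine_platform_map: Optional[Dict[str, str]] = None) -> str:
    """Map a Metabase engine name to a DataHub platform name.

    Assumption: user-supplied entries win, and unmapped engines fall back
    to the engine name itself.
    """
    overrides = engine_platform_map or {}
    return overrides.get(engine, engine)


user_map = {"athena": "glue"}
print(resolve_platform("athena", user_map))    # entry supplied in the recipe
print(resolve_platform("postgres", user_map))  # no entry, falls back to the engine name
```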
DataHub will try to determine the database name from the Metabase api/database payload. However, the name can be overridden through database_alias_map for a given database connected to Metabase.
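To see where the database name ends up, the sketch below assembles a dataset URN in DataHub's urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>) form. The helper name, the database and table names, and in particular the assumption that database_alias_map is keyed by the database name reported by Metabase are illustrative only; the key format is not specified on this page.

```python
from typing import Dict, Optional


def build_dataset_urn(
    platform: str,
    database: str,
    schema: str,
    table: str,
    env: str = "PROD",
    database_alias_map: Optional[Dict[str, str]] = None,
) -> str:
    """Build a dataset URN, applying an optional database-name override first."""
    # Assumption: the alias map is keyed by the database name Metabase reports.
    aliases = database_alias_map or {}
    db_name = aliases.get(database, database)
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{db_name}.{schema}.{table},{env})"


# Hypothetical database, schema, and table names.
print(build_dataset_urn("postgres", "metabase_db", "public", "orders"))
print(build_dataset_urn("postgres", "metabase_db", "public", "orders",
                        database_alias_map={"metabase_db": "analytics"}))
```

An override matters when the name Metabase reports differs from the name under which the same tables were ingested from the underlying platform; without a matching name, the lineage edge would point at a different URN.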
Compatibility
Metabase version v0.41.2
Code Coordinates
- Class Name: datahub.ingestion.source.metabase.MetabaseSource
- Browse on GitHub
Questions
If you've got any questions on configuring ingestion for Metabase, feel free to ping us on our Slack.