Skip to main content

Join Logic

Transform uses the identifiers to act as the join keys between tables. The Metrics Framework Framework constructs the appropriate joins with other tables based on the identifier types (primary, foreign, or unique).

When creating a data source it's important to think about which other data sources will be able to use the data source to combine metrics and dimensions. The Metrics Framework avoids fan-out and chasm joins by avoiding foreign to foreign, primary to foreign, and unique to foreign joins.

See the table below for a quick reference to what join types the Metrics Framework constructs based on the identifiers.

Identifier Type - Table AIdentifier Type - Table BJoin Type
PrimaryPrimaryLeft
PrimaryUniqueLeft
PrimaryForeignFanout (Not Allowed)
UniquePrimaryLeft
UniqueUniqueLeft
UniqueForeignFanout (Not Allowed)
ForeignPrimaryLeft
ForeignUniqueLeft
ForeignForeignFanout (Not Allowed)

See a quick overview of SQL Joins below!

Framework-JoinTypes

Multi-Hop Joins#

Transform allows users to join measures and dimensions across a graph of identifiers. We call this a 'multi-hop join' as a user can 'hop' across one table to another.

For example: given the schema below, it would be possible in Transform to create the metric 'Average Purchase Price by Country'.

Multi-Hop-Join

Using Multi-Hop Joins

Assuming that the identifiers are set to join the tables together, our Framework can perform multi-hop joins under the hood by creating a graph using the Data Sources as nodes and the join paths as edges.

When you are initiating a query to do a multi-hop join via Transform's MQL Command Line Interface, you need to describe the join path and nodes in the dimension name. To query dimensions without a multi-hop join involved, you use the dimension prefixed with a dunder (double underscore) and identifier. The same paradigm applies for dimensions retrieved by a multi-hop join, except you may have multiple sets of dundered identifiers, since you are hopping across multiple data sources.

An example query in the CLI using this structure would be as follows. Notice how the dimension selection includes two sets of dunders -- one for the dimension user_id__country and one for the dimension country_id__country:

mql query --metrics average_purchase_price dimensions --ds --dimensions user_id__country_id__country_name

In this example - Transform joins the user_signup to sales on user_id, then joins country_dim to user_signup on country_id.

Note: If you have multiple dimensions with the same name, it would be possible for an ambiguous join path to be created. One key way to avoid this is to use a dimension from the list-metrics feature in the CLI, which includes the prefixed identifier that's meant to disambiguate where the dimension comes from.