Best Practices
Overview
This document serves as a guide to write fast, efficient, and cost-effective queries.
Field Selection
When querying the views, select *
queries should be avoided. Instead, only the necessary fields should be specified. The relevant fields will differ depending on the nature of the query.
For example, if the query is meant to retrieve streaming data from one of the Fact views, the following fields are not relevant: TRANSACTION_TYPE
, STORE_STRATA
, DISTRIBUTION_CHANNEL
, PURCHASE_METHOD
. Likewise, for sales data the following fields are not relevant: CONTENT_TYPE
, COMMERCIAL_MODEL
, SERVICE_TYPE
.
Cluster Keys
For best performance, use as many of the cluster keys as possible when writing queries. The following list provides all of the cluster keys that are available per view.
VW_MUSICAL_RECORDING_DS
MR_ID
VW_DAILY_FACT_MR_DS
&VW_DAILY_FACT_MR_SUMMARY_DS
COUNTRY_CODE
MR_ID
REPORT_DATE
VW_SONG_DS
SONG_ID
VW_DAILY_FACT_SONG_DS
&VW_DAILY_FACT_SONG_SUMMARY_DS
COUNTRY_CODE
SONG_ID
REPORT_DATE
VW_MUSICAL_PRODUCT_DS
MP_ID
VW_DAILY_FACT_MP_DS
&VW_DAILY_FACT_MP_SUMMARY_DS
COUNTRY_CODE
MP_ID
REPORT_DATE
VW_MUSICAL_RELEASE_DS
MREL_ID
VW_DAILY_FACT_MREL_DS
&VW_DAILY_FACT_MREL_SUMMARY_DS
COUNTRY_CODE
MREL_ID
REPORT_DATE
VW_MUSICAL_RELEASE_GROUP_DS
MRELG_ID
VW_DAILY_FACT_MRELG_DS
VW_DAILY_FACT_MRELG_SUMMARY_DS
COUNTRY_CODE
MRELG_ID
REPORT_DATE
VW_ARTIST_DS
COUNTRY_OF_ORIGIN
VW_DAILY_FACT_ARTIST_DS
&VW_DAILY_FACT_ARTIS_SUMMARY_DS
COUNTRY_CODE
ARTIST_ID
REPORT_DATE
Filters
Two key fields that should be used for most, if not all, consumption-focused queries are MARKET_ID
and METRIC_CATEGORY
.
When using the larger fact views (e.g. VW_DAILY_FACT_MR_DS
), the MARKET_ID
field is key for limiting the dataset to what is needed for the analysis. The national dataset uses MARKET_ID
-1 for all territories. US & CA have additional Market IDs to specify the various metropolitan markets. If the MARKET_ID
is not specified, the query will double-count activity, which will result in inaccurate, inflated values.
Note: The faster "SUMMARY" fact tables (e.g. VW_DAILY_FACT_MR__SUMMARY_DS
) do not contain market-level data, so the MARKET_ID
filter is not needed.
The METRIC_CATEGORY
field is needed to avoid grouping streaming and sales data together. For example, If you request one week’s worth of Fact data for Taylor Swift without specifying the METRIC_CATEGORY
, the sum of the QUANTITY
may incorrectly aggregate all of her activity (streams, song sales, product sales).
Sample Queries
Several sample queries that incorporate the guidance provided in this document can be found below.
-- One week of national streaming activity for "Dreams" by Fleetwood Mac (breakouts included)
SELECT song_id, market_name, country_code, metric_category, content_type, commercial_model, service_type, SUM(Quantity) AS TotalStreams
FROM extract_s.vw_daily_fact_song_ds
WHERE song_id = 'SG5EA0D6997C3342E9BE4925B9F00312CA'
AND country_code = 'US'
AND market_id = '-1'
AND metric_category = 'Streams'
AND report_date BETWEEN '2024-01-01' AND '2024-01-07'
GROUP BY ALL;
-- List of artists that have between 50,000 and 100,000 daily on-demand streams in France.
SELECT a.artist_id, artist_name, SUM(quantity) AS OnDemandStreams
FROM extract_s.vw_daily_fact_artist_ds AS a
LEFT JOIN extract_s.vw_artist_ds AS b ON a.artist_id = b.artist_id
WHERE country_code = 'FR'
AND market_id = '-1'
AND service_type = 'OnDemand'
AND metric_category = 'Streams'
AND report_date = '2024-06-01'
GROUP BY a.artist_id, artist_name
HAVING OnDemandStreams BETWEEN 50000 AND 100000
ORDER BY OnDemandStreams DESC;
Updated 3 months ago