Best Practices
Overview
This document serves as a guide to write fast, efficient, and cost-effective queries.
Field Selection
When querying the views, select * queries should be avoided. Instead, only the necessary fields should be specified. The relevant fields will differ depending on the nature of the query.
For example, if the query is meant to retrieve streaming data from one of the Fact views, the following fields are not relevant: TRANSACTION_TYPE, STORE_STRATA, DISTRIBUTION_CHANNEL, PURCHASE_METHOD. Likewise, for sales data the following fields are not relevant: CONTENT_TYPE, COMMERCIAL_MODEL, SERVICE_TYPE.
Cluster Keys
For best performance, use as many of the cluster keys as possible when writing queries. The following list provides all of the cluster keys that are available per view.
VW_MUSICAL_RECORDING_DSMR_ID
VW_DAILY_FACT_MR_DETAIL_DS,VW_DAILY_FACT_MR_SUMMARY_DSCOUNTRY_CODEMR_IDREPORT_DATE
VW_SONG_DSSONG_ID
VW_DAILY_FACT_SONG_DETAIL_DS,VW_DAILY_FACT_SONG_SUMMARY_DS,VW_DAILY_FACT_SONG_AIRPLAY_DETAIL_DS,VW_DAILY_FACT_SONG_AIRPLAY_SUMMARY_DSCOUNTRY_CODESONG_IDREPORT_DATE
VW_MUSICAL_PRODUCT_DSMP_ID
VW_DAILY_FACT_MP_DETAIL_DS,VW_DAILY_FACT_MP_SUMMARY_DSCOUNTRY_CODEMP_IDREPORT_DATE
VW_MUSICAL_RELEASE_DSMREL_ID
VW_DAILY_FACT_MREL_DETAIL_DS,VW_DAILY_FACT_MREL_SUMMARY_DSCOUNTRY_CODEMREL_IDREPORT_DATE
VW_MUSICAL_RELEASE_GROUP_DSMRELG_ID
VW_DAILY_FACT_MRELG_DETAIL_DS,VW_DAILY_FACT_MRELG_SUMMARY_DS,VW_DAILY_FACT_MRELG_AIRPLAY_DETAIL_DS,VW_DAILY_FACT_MRELG_AIRPLAY_SUMMARY_DSCOUNTRY_CODEMRELG_IDREPORT_DATE
VW_ARTIST_DSCOUNTRY_OF_ORIGIN
VW_DAILY_FACT_ARTIST_DETAIL_DS,VW_DAILY_FACT_ARTIST_SUMMARY_DS,VW_DAILY_FACT_ARTIST_AIRPLAY_DETAIL_DS,VW_DAILY_FACT_ARTIST_AIRPLAY_SUMMARY_DSCOUNTRY_CODEARTIST_IDREPORT_DATE
Filters
Three key fields that should be used for most, if not all, consumption-focused queries are REPORT_DATE, COUNTRY_CODE, and METRIC_CATEGORY.
The first two fields will ensure that the queries will be properly parametrized and will perform efficiently.
The METRIC_CATEGORY field is needed to avoid grouping streaming and sales data together. For example, If you request one week’s worth of Fact data for Taylor Swift without specifying the METRIC_CATEGORY, the sum of the QUANTITY may incorrectly aggregate all of her activity (streams, song sales, product sales).
When using the larger fact views (e.g. VW_DAILY_FACT_MR_DETAIL_DS), the MARKET_ID field is key for limiting the dataset to what is needed for the analysis. The national dataset uses MARKET_ID -1 for all territories. US & CA have additional Market IDs to specify the various metropolitan markets. If the MARKET_ID is not specified, the query will double-count activity, which will result in inaccurate, inflated values.
Note: The faster SUMMARY fact tables (e.g. VW_DAILY_FACT_MR_SUMMARY_DS) do not contain market-level data, so the MARKET_ID filter is not needed.
Analysis
In order to perform accurate analysis of the fact data, the queries will need to sum all records with related activity for a given timeframe. As discussed in the Fact Data Structure documentation, the fact views contain both positive and negative values. If these records are not properly summed, the analysis could yield numbers that are over or undercounting activity.
Sample Queries
Several sample queries that incorporate the guidance provided in this document can be found below.
-- One week of national streaming activity for "Dreams" by Fleetwood Mac (breakouts included)
SELECT song_id, market_name, country_code, metric_category, content_type, commercial_model, service_type, SUM(Quantity) AS TotalStreams
FROM extract_s.vw_daily_fact_song_detail_ds
WHERE song_id = 'SG5EA0D6997C3342E9BE4925B9F00312CA'
AND country_code = 'US'
AND market_id = '-1'
AND metric_category = 'Streams'
AND report_date BETWEEN '2024-01-01' AND '2024-01-07'
GROUP BY ALL;
-- List of artists that have between 50,000 and 100,000 daily on-demand streams in France.
SELECT a.artist_id, artist_name, SUM(quantity) AS OnDemandStreams
FROM extract_s.vw_daily_fact_artist_detail_ds AS a
LEFT JOIN extract_s.vw_artist_ds AS b ON a.artist_id = b.artist_id
WHERE country_code = 'FR'
AND market_id = '-1'
AND service_type = 'OnDemand'
AND metric_category = 'Streams'
AND report_date = '2024-06-01'
GROUP BY a.artist_id, artist_name
HAVING OnDemandStreams BETWEEN 50000 AND 100000
ORDER BY OnDemandStreams DESC;Updated 22 days ago
