Beyond e-commerce, GA4 tracks dozens of engagement events through its enhanced measurement features. Each event type stores its data in different event_params keys, and extracting them follows the same correlated subquery pattern — but knowing which parameters to extract for each event type saves significant discovery time.
These are copy-paste recipes. Adjust the _TABLE_SUFFIX range and project references for your setup.
Page Views with Clean URL Parsing
The page_view event is the most commonly queried GA4 event. The useful parameters live in event_params, not as top-level fields:
SELECT PARSE_DATE('%Y%m%d', event_date) AS event_date, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS page_url, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_title') AS page_title, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_referrer') AS referrer, -- Extract clean path without protocol or query params REGEXP_REPLACE( REGEXP_REPLACE( (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location'), r'^https?://[^/]+', '' ), r'[\?#].*$', '' ) AS page_path, -- Identify landing pages IFNULL( (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'entrances'), 0 ) = 1 AS is_landing_pageFROM `project.dataset.events_*`WHERE event_name = 'page_view' AND _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'The double REGEXP_REPLACE for page_path is a pattern you’ll reuse across every page-level query. First pass strips protocol and domain (https://example.com/products becomes /products). Second pass strips query parameters and fragments. If you’re building this in dbt, put the path extraction in a reusable macro.
The entrances parameter identifies landing pages — the first page view in a session. It’s an int_value, not a string, so extract it accordingly.
Scroll Depth Analysis
GA4’s enhanced measurement fires a scroll event when users reach 90% scroll depth on a page (this threshold is not configurable). The event itself is simple — the interesting analysis comes from calculating scroll rate against page views:
WITH page_views AS ( SELECT REGEXP_REPLACE( REGEXP_REPLACE( (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location'), r'^https?://[^/]+', '' ), r'[\?#].*$', '' ) AS page_path, COUNT(*) AS pageviews FROM `project.dataset.events_*` WHERE event_name = 'page_view' AND _TABLE_SUFFIX BETWEEN '20240101' AND '20240131' GROUP BY page_path),
scrolls AS ( SELECT REGEXP_REPLACE( REGEXP_REPLACE( (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location'), r'^https?://[^/]+', '' ), r'[\?#].*$', '' ) AS page_path, COUNT(*) AS scroll_events FROM `project.dataset.events_*` WHERE event_name = 'scroll' AND _TABLE_SUFFIX BETWEEN '20240101' AND '20240131' GROUP BY page_path)
SELECT pv.page_path, pv.pageviews, IFNULL(s.scroll_events, 0) AS scroll_events, SAFE_DIVIDE(s.scroll_events, pv.pageviews) AS scroll_rateFROM page_views pvLEFT JOIN scrolls s ON pv.page_path = s.page_pathWHERE pv.pageviews > 100ORDER BY scroll_rate DESCThe HAVING pageviews > 100 filter eliminates low-traffic pages where scroll rate is noisy. Adjust the threshold based on your traffic volume.
A low scroll rate on a long-form article may indicate low engagement past the fold. A high scroll rate on a short page is expected. Scroll rate is most meaningful when compared across similar page types.
Click and Outbound Link Tracking
The click event fires for outbound links (when enhanced measurement is enabled). The key parameters:
SELECT (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'link_url') AS link_url, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'link_domain') AS link_domain, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'outbound') AS is_outbound, COUNT(*) AS clicksFROM `project.dataset.events_*`WHERE event_name = 'click' AND _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'GROUP BY link_url, link_domain, is_outboundORDER BY clicks DESCThe outbound parameter is a string 'true' or 'false', not a boolean — type detection stores it as string_value. Filter with = 'true' if you only want external clicks.
File Download Tracking
GA4 automatically tracks downloads of common file types (pdf, xlsx, docx, csv, etc.) when enhanced measurement is enabled:
SELECT (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'file_name') AS file_name, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'file_extension') AS file_extension, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'link_url') AS download_url, COUNT(*) AS downloads, COUNT(DISTINCT user_pseudo_id) AS unique_usersFROM `project.dataset.events_*`WHERE event_name = 'file_download' AND _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'GROUP BY file_name, file_extension, download_urlORDER BY downloads DESCfile_name and file_extension are extracted from the URL by GA4’s JavaScript — they can be wrong if the URL doesn’t contain a traditional filename pattern. Always verify against link_url for accuracy.
Video Engagement Funnel
GA4 tracks YouTube embeds automatically (when enhanced measurement is on) through three events: video_start, video_progress, and video_complete. The video_progress event fires at 10%, 25%, 50%, and 75% milestones.
SELECT (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'video_title') AS video_title, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'video_provider') AS video_provider, COUNTIF(event_name = 'video_start') AS starts, COUNTIF(event_name = 'video_progress' AND (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'video_percent') = 25 ) AS reached_25_pct, COUNTIF(event_name = 'video_progress' AND (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'video_percent') = 50 ) AS reached_50_pct, COUNTIF(event_name = 'video_progress' AND (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'video_percent') = 75 ) AS reached_75_pct, COUNTIF(event_name = 'video_complete') AS completions, SAFE_DIVIDE( COUNTIF(event_name = 'video_complete'), COUNTIF(event_name = 'video_start') ) AS completion_rateFROM `project.dataset.events_*`WHERE event_name IN ('video_start', 'video_progress', 'video_complete') AND _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'GROUP BY video_title, video_providerHAVING starts > 10ORDER BY starts DESCThe nested COUNTIF with a correlated subquery inside is worth studying. It combines event-level filtering (event_name = 'video_progress') with parameter-level filtering (video_percent = 50) in a single aggregation pass. This pattern works for any funnel analysis where you need to count events that match both an event type and a specific parameter value.
The HAVING starts > 10 filter removes low-traffic videos where the funnel percentages are statistically unreliable. Video completion rates vary by content type — 30-40% is typical for marketing videos; instructional content often sees 60%+.
Session Metrics
Session-level aggregation requires the composite session key — user_pseudo_id combined with ga_session_id:
WITH session_events AS ( SELECT CONCAT( user_pseudo_id, '-', (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') ) AS session_id, user_pseudo_id, event_timestamp, event_name, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') AS engagement_time_msec, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') AS session_engaged, traffic_source.source, traffic_source.medium, device.category AS device_category, geo.country FROM `project.dataset.events_*` WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240131')
SELECT session_id, user_pseudo_id, MIN(TIMESTAMP_MICROS(event_timestamp)) AS session_start, MAX(TIMESTAMP_MICROS(event_timestamp)) AS session_end, TIMESTAMP_DIFF( MAX(TIMESTAMP_MICROS(event_timestamp)), MIN(TIMESTAMP_MICROS(event_timestamp)), SECOND ) AS session_duration_seconds, COUNT(*) AS session_events, COUNTIF(event_name = 'page_view') AS session_pageviews, SUM(engagement_time_msec) / 1000 AS session_engagement_seconds, MAX(session_engaged) = '1' AS session_is_engaged, ANY_VALUE(source) AS source, ANY_VALUE(medium) AS medium, ANY_VALUE(device_category) AS device_category, ANY_VALUE(country) AS countryFROM session_eventsGROUP BY session_id, user_pseudo_idNote that session_engaged is a string '1', not an integer — another type detection gotcha. And engagement_time_msec is cumulative across events within a session, so SUM gives total engagement time.
For production use, this pattern belongs in a dbt model rather than ad-hoc SQL. See the three-layer architecture for how this fits into a full pipeline.