Cloud Run is the recommended deployment platform for GTM Server-Side as of October 2023, replacing App Engine. It runs the official Google tagging server Docker image as a fully managed serverless container — you don’t manage the underlying infrastructure, but you do control the configuration and region.
Custom Domain: Non-Negotiable First Step
Before touching Cloud Run, you need a custom subdomain pointed at your server container. This isn’t optional. Without a custom domain, cookies are set in a third-party context and you get none of the first-party cookie benefits that make server-side tracking worthwhile.
Two approaches:
Subdomain approach (most common): Create a CNAME record pointing gtm.yourdomain.com to your Cloud Run service URL. Google handles SSL automatically via managed certificates, though provisioning can take up to 24 hours. This approach is straightforward but has one limitation: since version 16.4, Safari also checks the IP address of the server that sets a cookie, and a tagging server on Google’s IP range usually won’t match the range your website is served from. A CNAME alone therefore doesn’t guarantee full cookie lifetime recovery in Safari.
Same-origin approach (advanced): Use a CDN or Cloud Load Balancer with path-based routing — for example, routing /metrics/* to Cloud Run while the rest of the domain serves your website normally. This gives the strongest first-party cookie treatment because the server shares the exact same origin as your website, resolving the IP mismatch problem entirely. It requires more infrastructure setup but is the correct long-term architecture for sites where Safari attribution matters.
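To make the IP mismatch concrete, here is a minimal sketch of the kind of comparison Safari is generally described as making. It assumes the rule compares the first half of each address (the first 16 bits for IPv4, the first 64 bits for IPv6); treat that as an assumption and check WebKit's own notes before relying on it. The function name and the example addresses are hypothetical.

```python
import ipaddress


def shares_ip_prefix(site_ip: str, server_ip: str) -> bool:
    """Return True when both addresses share the same /16 (IPv4) or /64 (IPv6).

    Assumption: this mirrors the "first half of the address" comparison
    attributed to Safari 16.4+; it is an illustration, not WebKit's code.
    """
    site = ipaddress.ip_address(site_ip)
    server = ipaddress.ip_address(server_ip)
    if site.version != server.version:
        return False  # an IPv4 site and an IPv6 server can never match
    prefix = 16 if site.version == 4 else 64
    # strict=False lets us build the network from a host address
    site_net = ipaddress.ip_network(f"{site_ip}/{prefix}", strict=False)
    return server in site_net


if __name__ == "__main__":
    # Documentation-range addresses, purely illustrative
    print(shares_ip_prefix("203.0.113.10", "203.0.200.7"))   # same /16
    print(shares_ip_prefix("203.0.113.10", "198.51.100.7"))  # different /16
```

With the same-origin approach the browser talks to one host for both the site and the tagging path, so the comparison has nothing to disagree about.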
Verify your deployment is live by hitting https://gtm.yourdomain.com/healthy — it should return “OK.”
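If you want to script that check, for example as a post-deploy step, a small sketch with the standard library could look like this. The function names are hypothetical, and the comparison is case-insensitive on the assumption that the body may come back as "ok" or "OK".

```python
import urllib.request


def is_healthy(status: int, body: str) -> bool:
    """Treat HTTP 200 with a body of 'ok' (any case) as healthy."""
    return status == 200 and body.strip().lower() == "ok"


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """Fetch <base_url>/healthy and evaluate the response.

    urlopen raises an exception on network errors and non-2xx statuses,
    so callers should be prepared to catch it.
    """
    url = base_url.rstrip("/") + "/healthy"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        return is_healthy(resp.status, body)


if __name__ == "__main__":
    # Placeholder domain from the article; replace with your own.
    print(check_health("https://gtm.yourdomain.com"))
```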
Automatic vs Manual Provisioning
GTM offers an automatic provisioning flow that creates the Cloud Run services directly from within the GTM interface. It’s faster to set up but has one significant limitation: automatic provisioning always deploys to US Central (Iowa), regardless of where your users are.
Manual provisioning is recommended for production. It lets you:
- Select a region closer to your users for lower latency
- Keep data within European infrastructure for GDPR compliance when serving EU traffic
- Control exactly which project and service account is used
The manual flow involves deploying Cloud Run yourself with the official container image, then pasting the service URL back into GTM’s server container settings.
Production Configuration
Google’s recommended settings for the tagging server:
Min instances: 2 # eliminates cold starts
Max instances: 10 # handles approximately 350 req/s
CPU: 1 vCPU, always allocated
Memory: 512 MiB
Request timeout: 60 seconds
The --no-cpu-throttling flag (CPU always allocated) is critical. Without it, Cloud Run throttles CPU between requests. For a server that needs to process and forward data quickly, often making multiple outbound HTTP calls per request, CPU throttling causes inconsistent latency and can cause timeouts and data loss under load.
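The max-instances figure gives you a rough sizing rule: 10 instances for about 350 req/s works out to roughly 35 req/s per instance. A back-of-the-envelope sketch, assuming that ratio holds for your tag setup (heavier tags will lower it) and using a hypothetical helper name:

```python
import math

# Derived from the figures above: ~350 req/s across 10 instances.
# Assumption: throughput scales linearly with instance count.
REQS_PER_INSTANCE = 35


def instances_for(peak_rps: float, min_instances: int = 2) -> int:
    """Estimate the max-instances setting needed for a given peak load."""
    needed = math.ceil(peak_rps / REQS_PER_INSTANCE)
    # Never go below the warm minimum recommended for production
    return max(min_instances, needed)


if __name__ == "__main__":
    for rps in (10, 350, 1000):
        print(rps, "req/s ->", instances_for(rps), "instances")
```

Treat the result as a starting point and adjust it against the instance count and latency you actually observe in Cloud Run metrics.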
Cold starts matter more for tracking than for most applications. A cold start on a request to /g/collect means that hit gets dropped or delayed. Keeping a minimum of 2 instances warm at all times removes cold starts for baseline traffic; instances added during a traffic spike still have to start up.
The preview server (dedicated to GTM debug mode) must run exactly 1 instance. Scaling it beyond 1 instance causes debug mode to behave unpredictably because debug state is stored in-process and multiple instances don’t share it.
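The settings in this section map fairly directly onto gcloud run deploy flags. The sketch below only builds the argument list, so you can inspect it before running anything. The image path and the CONTAINER_CONFIG, PREVIEW_SERVER_URL, and RUN_AS_PREVIEW_SERVER variables are, as far as I know, the ones used in Google's manual setup guide, but verify them against the current documentation. The service names, region, URLs, and config string are placeholders.

```python
# Assumption: the official tagging server image, as published in Google's guide.
IMAGE = "gcr.io/cloud-tagging-10302018/gtm-cloud-image:stable"


def deploy_args(service: str, region: str, container_config: str,
                preview: bool = False, preview_url: str = "") -> list:
    """Build a `gcloud run deploy` argument list for a tagging or preview server."""
    env = {"CONTAINER_CONFIG": container_config}
    if preview:
        env["RUN_AS_PREVIEW_SERVER"] = "true"
        min_inst, max_inst = 1, 1   # preview server: pinned to a single instance
    else:
        env["PREVIEW_SERVER_URL"] = preview_url
        min_inst, max_inst = 2, 10  # production settings from this section
    env_str = ",".join(f"{key}={value}" for key, value in env.items())
    return [
        "gcloud", "run", "deploy", service,
        "--image", IMAGE,
        "--region", region,
        "--min-instances", str(min_inst),
        "--max-instances", str(max_inst),
        "--cpu", "1",
        "--memory", "512Mi",
        "--timeout", "60",
        "--no-cpu-throttling",       # CPU always allocated
        "--allow-unauthenticated",   # browsers must reach it without credentials
        "--set-env-vars", env_str,
    ]


if __name__ == "__main__":
    args = deploy_args("sgtm-tagging", "europe-west1", "<config string>",
                       preview_url="https://preview.yourdomain.com")
    print(" ".join(args))
```

Passing the list to subprocess.run(args, check=True) would execute the deploy; printing it first is a cheap way to review what will change.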
Multi-Region Architecture
For global traffic, a single regional Cloud Run deployment introduces latency for users far from your chosen region. The pattern for worldwide coverage:
- Deploy Cloud Run services across regions (e.g., europe-west1, us-east1, asia-east1)
- Create Serverless Network Endpoint Groups (NEGs) for each regional service
- Place them behind a single External Application Load Balancer with a global anycast IP
- Configure geographic routing rules on the load balancer
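The per-region part of that list is repetitive, so it is a good candidate for generating the commands. This sketch produces the gcloud invocations for creating one serverless NEG per region and attaching it to a global backend service. It assumes the same Cloud Run service name is deployed in every region and that the backend service already exists; the naming scheme and helper name are hypothetical.

```python
def neg_commands(service: str, regions: list, backend: str) -> list:
    """Return gcloud argument lists that create and attach one NEG per region."""
    commands = []
    for region in regions:
        neg = f"{service}-neg-{region}"  # hypothetical naming convention
        # 1. Serverless NEG pointing at the regional Cloud Run service
        commands.append([
            "gcloud", "compute", "network-endpoint-groups", "create", neg,
            f"--region={region}",
            "--network-endpoint-type=serverless",
            f"--cloud-run-service={service}",
        ])
        # 2. Attach the NEG to the global backend service behind the load balancer
        commands.append([
            "gcloud", "compute", "backend-services", "add-backend", backend,
            "--global",
            f"--network-endpoint-group={neg}",
            f"--network-endpoint-group-region={region}",
        ])
    return commands


if __name__ == "__main__":
    regions = ["europe-west1", "us-east1", "asia-east1"]
    for cmd in neg_commands("sgtm-tagging", regions, "sgtm-backend"):
        print(" ".join(cmd))
```

The URL map, target proxy, certificate, and forwarding rule still need to be set up separately; they are one-off resources rather than per-region ones.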
The load balancer adds approximately $18/month in base costs but provides two additional benefits: Cloud Armor DDoS protection and geographic traffic steering that routes EU users to your EU instance (relevant for GDPR data residency requirements).
For most setups with a European-heavy or US-heavy audience rather than truly global traffic, a single regional deployment in the right region is sufficient. Deploy to europe-west1 for EU-primary traffic or us-east1 for US-primary traffic and skip the load balancer complexity.
The Cloud Logging Trap
Default Cloud Run settings log every request to Cloud Logging. At $0.50 per GiB ingested, default logging for moderate tracking traffic adds $100-$220/month — often exceeding compute costs.
Disable detailed request logging after deployment. Error logging and sampled request logging are sufficient for debugging; per-request logging at full detail is not needed. This is typically the largest unexpected cost in a new GTM Server-Side deployment.
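To see where your own traffic lands, you can estimate the bill from request volume. This is a rough sketch: the average log entry size is an assumption you should replace with a measured value, and the 50 GiB monthly free allotment reflects Cloud Logging's pricing as I understand it, so confirm it for your project.

```python
GIB = 2 ** 30  # bytes per GiB


def monthly_logging_cost(requests: int, bytes_per_entry: int,
                         price_per_gib: float = 0.50,
                         free_gib: float = 50.0) -> float:
    """Estimate monthly Cloud Logging ingestion cost in USD.

    Assumes one log entry per request and a flat per-GiB price
    after the free allotment is used up.
    """
    ingested_gib = requests * bytes_per_entry / GIB
    billable_gib = max(0.0, ingested_gib - free_gib)
    return billable_gib * price_per_gib


if __name__ == "__main__":
    # Hypothetical volume: ~105 million requests at 2 KiB per log entry
    print(monthly_logging_cost(100 * 2 ** 20, 2048))
```

Because cost scales linearly with request count, the estimate also shows how much a sampling rate or an exclusion filter would save.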
Tagging Server vs Preview Server
Your server-side deployment creates two separate Cloud Run services:
Tagging server — handles production traffic. Apply the production configuration above (min 2 instances, CPU always allocated). This is what gtm.yourdomain.com points to.
Preview server — handles GTM debug/preview mode sessions. Run this at exactly 1 instance. Debug sessions connect to a specific preview server instance and expect state to persist across requests in the same session. Multiple instances break this expectation.
The preview server doesn’t need the same capacity as the tagging server. It’s only active when someone has the GTM preview panel open, not during normal user traffic. Keep it minimal.
Connecting the Container
Once Cloud Run is deployed, connect it to your GTM server container:
- In GTM, go to your server container settings
- Set the “Tagging Server URL” to your custom domain (e.g., https://gtm.yourdomain.com)
- Optionally set the preview server URL if you deployed it separately
In your web container, update the Google Tag configuration’s server_container_url field to your custom domain. This redirects all GA4 hits through your server instead of directly to Google’s collection endpoints. Without this web container change, your server infrastructure sits idle — the browser is still sending hits directly to Google.
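A quick way to confirm the change took effect is to copy a few collect request URLs from the browser's network panel and check where they point. A minimal sketch of that check, with a hypothetical function name and example URLs; it also handles the same-origin setup, where the server URL carries a path prefix such as /metrics:

```python
from urllib.parse import urlparse


def routed_through_server(hit_url: str, server_url: str) -> bool:
    """Return True if a hit URL targets the tagging server rather than Google."""
    hit = urlparse(hit_url)
    server = urlparse(server_url)
    if hit.hostname is None or hit.hostname != server.hostname:
        return False
    # For path-based routing, the hit must sit under the server's path prefix
    prefix = server.path.rstrip("/")
    return hit.path.startswith(prefix + "/")


if __name__ == "__main__":
    server = "https://gtm.yourdomain.com"
    print(routed_through_server("https://gtm.yourdomain.com/g/collect?v=2", server))
    print(routed_through_server("https://www.google-analytics.com/g/collect?v=2", server))
```

If hits still go to a Google hostname after publishing, the web container change has not reached the page yet or the field is set on the wrong tag.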