What is a data layer and how do you structure it for GTM?
The role of the data layer
The data layer is a JavaScript array of objects that serves as a structured interface between your website and Google Tag Manager (GTM). It lets you pass contextual information (page data, user data, transaction data) to GTM so that tags do not have to scrape the DOM to obtain it.
Concretely, it is a global variable, window.dataLayer, initialized before the GTM container loads. Each time an event occurs or data becomes available, your site pushes an object into this array via dataLayer.push(). GTM processes each push and, when the object contains an event key, evaluates its triggers and fires the tags whose conditions match. This approach decouples data collection from the presentation layer, making tracking more reliable and maintainable.
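The initialization and push pattern described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the keys page_type, user_logged_in and form_location and the event name newsletter_signup are hypothetical, and the root variable exists only so the snippet also runs outside a browser.

```javascript
// In a browser this is simply `window`; the globalThis fallback is an
// assumption that lets the sketch run in other JavaScript runtimes too.
const root = typeof window !== 'undefined' ? window : globalThis;

// Initialize before the GTM container snippet, reusing the array if
// another script has already created it.
root.dataLayer = root.dataLayer || [];

// Page-level context, pushed without an `event` key.
root.dataLayer.push({
  page_type: 'product',    // hypothetical key
  user_logged_in: false    // hypothetical key
});

// Later, when something happens on the page, push an object that
// carries an `event` key so that GTM triggers can react to it.
root.dataLayer.push({
  event: 'newsletter_signup',  // hypothetical event name
  form_location: 'footer'      // hypothetical parameter
});
```

On a real page, the first two statements would sit in an inline script placed above the GTM container snippet.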
Structuring the data layer for e-commerce
For GA4 e-commerce tracking, Google recommends a precise structure based on standardized events such as view_item, add_to_cart, begin_checkout and purchase. Each event carries an ecommerce object whose items array holds the product details (item_id, item_name, price, quantity, item_category). The value and currency fields sit at the top level of the ecommerce object, and transaction_id is added for purchase.
Google also recommends pushing { ecommerce: null } before each new e-commerce event to clear the previous ecommerce object and avoid data contamination between funnel steps. GTM merges each pushed object into its internal data model, so keys from an earlier ecommerce object can otherwise persist into the next event. Forgetting this reset is a common mistake, and a hard one to diagnose.
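A small wrapper is one possible way to make sure the reset is never forgotten. The sketch below assumes a helper named pushEcommerceEvent, which is hypothetical and not part of GTM, and uses made-up product and transaction values purely for illustration.

```javascript
// In a browser this is simply `window`; the globalThis fallback is an
// assumption that lets the sketch run in other JavaScript runtimes too.
const root = typeof window !== 'undefined' ? window : globalThis;
root.dataLayer = root.dataLayer || [];

// Hypothetical helper: clears the previous ecommerce object, then
// pushes the new event with its own ecommerce payload.
function pushEcommerceEvent(eventName, ecommerce) {
  root.dataLayer.push({ ecommerce: null });
  root.dataLayer.push({ event: eventName, ecommerce: ecommerce });
}

// Example purchase event; all values are illustrative.
pushEcommerceEvent('purchase', {
  transaction_id: 'T-1001',
  value: 59.98,
  currency: 'EUR',
  items: [
    {
      item_id: 'SKU-123',
      item_name: 'Example T-shirt',
      item_category: 'Apparel',
      price: 29.99,
      quantity: 2
    }
  ]
});
```

Routing every e-commerce push through a single function like this keeps the reset and the event together, so a funnel step cannot inherit items from the previous one.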
Best practices and mistakes to avoid
Initialize the data layer before the GTM snippet in the HTML code. Never directly modify objects already in the array: GTM only reacts to dataLayer.push(), so in-place changes go unnoticed. Name your events consistently and document the structure in a tracking plan shared with technical and marketing teams.
Avoid pushing sensitive data (plain-text email addresses, banking data) into the data layer: it is readable by every script on the page, including third-party ones. For GA4 custom events, make sure that the event names and parameter names in the data layer match exactly, including case, what your GTM configuration expects.
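One way to enforce the rule on sensitive data is to route pushes through a guard. The sketch below is only an illustration under stated assumptions: safePush and the BLOCKED_KEYS list are hypothetical, the key names would need to come from your own tracking plan, and the check covers top-level keys only.

```javascript
// In a browser this is simply `window`; the globalThis fallback is an
// assumption that lets the sketch run in other JavaScript runtimes too.
const root = typeof window !== 'undefined' ? window : globalThis;
root.dataLayer = root.dataLayer || [];

// Hypothetical blocklist of keys that must never reach the data layer.
const BLOCKED_KEYS = ['email', 'phone', 'card_number', 'iban'];

// Hypothetical guard: rejects payloads with a blocked top-level key,
// otherwise forwards them to dataLayer.push(). Nested objects are
// not inspected in this sketch.
function safePush(payload) {
  const found = Object.keys(payload).filter((key) => BLOCKED_KEYS.includes(key));
  if (found.length > 0) {
    throw new Error('Refusing to push sensitive keys: ' + found.join(', '));
  }
  root.dataLayer.push(payload);
}

// Accepted: no blocked key in the payload.
safePush({ event: 'login', method: 'password' });
```

Throwing makes the problem visible during development; a production variant might prefer to drop the offending keys and log a warning instead.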