Designing a GTM Data Model With One Source of Truth

How to design a go-to-market data model that eliminates conflicting metrics across sales, marketing, and customer success — covering object hierarchy, field governance, metric definitions, and the reporting layer that makes the data trustworthy.

SaaS Science Team | June 14, 2026 | 13 min read
Tags: revops, data model, gtm, saas metrics, reporting, crm architecture, data governance

The marketing team says MQL volume is up 40% this quarter. The sales team says lead quality has never been worse. Finance is trying to reconcile why the pipeline number in the CRM is different from the number in the marketing dashboard. The CEO asks for the correct number. Three different people give three different answers, all of which are technically correct by their own definitions.

This is the single-source-of-truth problem, and it is nearly universal in B2B SaaS companies at the $3M–$15M ARR stage. It is not caused by bad data; it is caused by the absence of a designed data model. Each team built its reporting independently, using its own definitions for shared concepts and storing data in the system it owns without coordinating with the others.

The GTM data model is the architectural solution. It defines how customer and revenue data is organized, what each piece of data means, how it is related to other data, and how it flows between systems. When the data model is designed correctly, all teams calculate the same metrics from the same source, and the single-source-of-truth problem disappears — not by magic, but by design.

The Object Hierarchy: The Foundation of the Data Model

The CRM object hierarchy is the most consequential early architecture decision in the GTM data model. It determines how accounts, contacts, leads, and opportunities relate to each other — and whether the resulting data structure supports account-based reporting, multi-contact attribution, and territory management.

The standard B2B SaaS object hierarchy:

Account (also called Company in HubSpot): Represents a business entity — the organization that is a prospect or customer. All other objects roll up to the Account. The Account is the unit of territory assignment, account-based marketing targeting, and enterprise renewal management.

Contact: Represents an individual at the Account. An Account can have multiple Contacts (champion, economic buyer, technical evaluator, procurement). Each Contact has their own engagement history, lead score, and lifecycle stage.

Lead: A pre-qualification object that exists before an individual has been confirmed as a relevant Contact at a target Account. In Salesforce, Leads and Contacts are separate objects; Leads are converted to Contacts (and associated with an Account) upon qualification. In HubSpot's core CRM model, this distinction has traditionally not existed: the Contact object serves both purposes, with the lifecycle stage property marking where the individual sits in the funnel.

Opportunity (also called Deal in HubSpot): Represents an active sales process for a specific Account. The Opportunity is the primary object for pipeline management, revenue forecasting, and closed-won revenue tracking. Every dollar of new ARR flows through the Opportunity object.

The critical architecture rule: Revenue data lives at the Opportunity level, not at the Lead or Contact level. A common mistake is tracking deal progress on the Lead object, which creates records that cannot be included in account-based pipeline reports, cannot be associated with the correct Account for territory management, and are invisible to the customer success team when the deal closes.

The Lead-to-Opportunity conversion workflow should be triggered at MQL qualification: the Lead is converted to a Contact, associated with the relevant Account (created if it does not exist), and an Opportunity is created to track the sales process. After this conversion, the Lead record is typically archived or merged, and the Contact record becomes the primary record for that individual.
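
To make the hierarchy and the conversion step concrete, here is a minimal Python sketch. It is an illustration, not any CRM's actual API: the Crm class, the convert_lead method, the use of the email domain as the Account key, and the default "Discovery" stage are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Account:
    account_id: str  # hypothetical key; this sketch uses the email domain
    name: str


@dataclass
class Contact:
    email: str
    account_id: str
    lifecycle_stage: str


@dataclass
class Opportunity:
    name: str
    account_id: str
    stage: str
    arr: float  # revenue data lives here, never on the Lead


@dataclass
class Lead:
    email: str
    company_name: str
    converted: bool = False


class Crm:
    """In-memory stand-in for the CRM, for illustration only."""

    def __init__(self) -> None:
        self.accounts: Dict[str, Account] = {}
        self.contacts: List[Contact] = []
        self.opportunities: List[Opportunity] = []

    def convert_lead(self, lead: Lead) -> Opportunity:
        """Convert a qualified Lead into Contact + Account + Opportunity."""
        domain = lead.email.split("@", 1)[1].lower()
        # Reuse the Account if it already exists; create it otherwise.
        if domain not in self.accounts:
            self.accounts[domain] = Account(account_id=domain, name=lead.company_name)
        self.contacts.append(Contact(lead.email, domain, "MQL"))
        opportunity = Opportunity(
            f"{lead.company_name} - New Business", domain, "Discovery", 0.0
        )
        self.opportunities.append(opportunity)
        lead.converted = True  # the Lead is retired; the Contact is now primary
        return opportunity
```

Real account matching is usually fuzzier than an exact domain match, so treat the lookup here as a placeholder for whatever matching rule the team agrees on.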

Defining the Object Relationships

Beyond the primary hierarchy, several relationship types must be defined explicitly:

Contact Roles on Opportunities: Each Opportunity should have multiple Contacts associated with it, each with a defined Role (Champion, Economic Buyer, Technical Evaluator, Procurement, End User, Other). Contact Roles enable multi-threaded deal management and provide post-close attribution data for marketing. Without Contact Roles, there is no data on who was involved in the buying decision.

Account Hierarchy: Enterprise companies have subsidiaries. A pilot at a subsidiary may lead to a second Opportunity at the parent Account for an enterprise rollout. The CRM data model must support parent-child Account relationships so that revenue from subsidiary Accounts rolls up correctly to the parent for territory reporting and expansion analysis.

Partner and Reseller Relationships: If the company sells through channel partners, the data model needs to represent the three-way relationship: the Partner account, the End Customer account, and the Opportunity. Many CRM data models handle this poorly — partner deals end up orphaned from the end customer account, making it impossible to track customer health or renewal at the correct account level.

Product Line on Opportunity: If the company sells multiple products or tiers, each Opportunity should be associated with a specific product line. Opportunity Products (Salesforce) or Line Items (HubSpot) capture the specific configuration sold, enabling product-line revenue analysis, gross margin by product, and attach rate reporting.
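
The parent-child roll-up described under Account Hierarchy can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the parent_of mapping, the account names, and the function names are hypothetical, and a real CRM would expose the hierarchy through its own parent-account field.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def ultimate_parent(account: str, parent_of: Dict[str, Optional[str]]) -> str:
    """Walk up the hierarchy until an Account with no parent is reached."""
    seen = set()
    while parent_of.get(account) is not None:
        if account in seen:
            raise ValueError(f"cycle in account hierarchy at {account!r}")
        seen.add(account)
        account = parent_of[account]
    return account


def arr_by_ultimate_parent(
    deals: Iterable[Tuple[str, float]],
    parent_of: Dict[str, Optional[str]],
) -> Dict[str, float]:
    """Sum closed-won ARR per top-level Account."""
    totals: Dict[str, float] = defaultdict(float)
    for account, arr in deals:
        totals[ultimate_parent(account, parent_of)] += arr
    return dict(totals)


# Hypothetical sample data: child account -> parent account (None = top level).
PARENT_OF = {
    "globex": None,
    "globex-emea": "globex",
    "globex-emea-uk": "globex-emea",
    "initech": None,
}
CLOSED_WON = [
    ("globex-emea-uk", 20_000.0),
    ("globex-emea", 35_000.0),
    ("globex", 100_000.0),
    ("initech", 15_000.0),
]
rollup = arr_by_ultimate_parent(CLOSED_WON, PARENT_OF)
```

Without the parent mapping, the two subsidiary deals would be reported as separate customers and the parent's expansion history would be understated.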

Metric Definitions: Making the Implicit Explicit

The single-source-of-truth problem is not primarily a technical problem — it is a definitions problem. When "MQL" means different things to marketing and sales, the two teams will always have different MQL counts, and both counts will be technically correct.

Every GTM metric should have a formal definition with these components:

Metric name: Short, unambiguous label used consistently across all reports and conversations.

Plain-language description: What does this metric measure, in language that a non-technical stakeholder can understand without domain expertise?

Calculation formula: The specific formula used to calculate the metric, referencing the exact CRM fields used. For example: MQL Rate = COUNT(Contacts WHERE Lifecycle Stage = "MQL" AND MQL Date is in the reporting period) / COUNT(Contacts WHERE First Activity Date is in the reporting period).

Inclusion criteria: Which records are included in the calculation? Which dates, which stages, which deal types?

Exclusion criteria: Which records are excluded? Internal test accounts, employees, blacklisted domains, deals below minimum ACV threshold?

Time dimension: Is this a snapshot metric (the count as of the end of the reporting period), a flow metric (the count of events occurring during the reporting period), or a trailing average?

Reporting owner: Which team or individual is responsible for maintaining this metric's definition and reporting it accurately?
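
One way to keep a definition and its calculation from drifting apart is to store them side by side. The Python sketch below is illustrative only: MetricDefinition, ContactRow, the field names, and the excluded domain are assumptions, while the formula itself follows the MQL Rate example given above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    description: str
    formula: str
    inclusion: str
    exclusion: str
    time_dimension: str  # "snapshot", "flow", or "trailing average"
    owner: str


MQL_RATE = MetricDefinition(
    name="MQL Rate",
    description="Share of contacts first active in the period that reached MQL.",
    formula="COUNT(stage = MQL AND MQL Date in period) / "
            "COUNT(First Activity Date in period)",
    inclusion="All Contacts with a work email",
    exclusion="Internal and test domains",
    time_dimension="flow",
    owner="RevOps",
)


@dataclass
class ContactRow:
    email: str
    lifecycle_stage: str
    first_activity_date: date
    mql_date: Optional[date]


def _in_period(d: Optional[date], start: date, end: date) -> bool:
    return d is not None and start <= d <= end


def mql_rate(
    contacts: Iterable[ContactRow],
    start: date,
    end: date,
    excluded_domains: Sequence[str] = ("example.com",),  # assumed exclusion list
) -> float:
    """Apply the MQL Rate formula; returns 0.0 when the denominator is empty."""
    eligible = [
        c for c in contacts
        if c.email.split("@")[-1].lower() not in excluded_domains
    ]
    numerator = sum(
        1 for c in eligible
        if c.lifecycle_stage == "MQL" and _in_period(c.mql_date, start, end)
    )
    denominator = sum(
        1 for c in eligible if _in_period(c.first_activity_date, start, end)
    )
    return numerator / denominator if denominator else 0.0
```

Writing the formula out this way also exposes a choice the prose definition hides: the numerator and denominator are filtered on different dates, so a contact first active last quarter can count as an MQL this quarter. Whether that is intended is exactly the kind of question the inclusion criteria should answer.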

A practical set of GTM metrics that require formal definition for most B2B SaaS companies:

  • MQL (Marketing Qualified Lead) — definition and thresholds
  • SAL (Sales Accepted Lead) — definition and acceptance criteria
  • SQL (Sales Qualified Lead) — definition and conversion criteria
  • Pipeline Bookings — does this include renewals? expansions? partner-sourced deals?
  • New ARR — annual recurring revenue from net new customers only
  • Expansion ARR — additional ARR from existing customers (upsells, cross-sells, seat additions)
  • Churned ARR — ARR lost from customer cancellations
  • Net New ARR — New ARR + Expansion ARR - Churned ARR
  • Win Rate — Closed Won / (Closed Won + Closed Lost) in the period
  • Average Sales Cycle Length — average days from Opportunity Created to Closed Won
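
The last three formulas in the list are simple enough to pin down in code. The Python sketch below assumes a hypothetical Opp record with created and closed dates and a stage label, and it assumes that "in the period" means the close date falls inside the reporting window; both are choices the data dictionary should state explicitly.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass
class Opp:
    created: date
    closed: Optional[date]  # None while the Opportunity is still open
    stage: str              # "Closed Won", "Closed Lost", or an open stage


def net_new_arr(new_arr: float, expansion_arr: float, churned_arr: float) -> float:
    """Net New ARR = New ARR + Expansion ARR - Churned ARR."""
    return new_arr + expansion_arr - churned_arr


def _closed_in(opp: Opp, start: date, end: date) -> bool:
    return opp.closed is not None and start <= opp.closed <= end


def win_rate(opps: Iterable[Opp], start: date, end: date) -> Optional[float]:
    """Closed Won / (Closed Won + Closed Lost), by close date in the period."""
    closed = [o for o in opps if _closed_in(o, start, end)]
    won = sum(1 for o in closed if o.stage == "Closed Won")
    lost = sum(1 for o in closed if o.stage == "Closed Lost")
    return won / (won + lost) if (won + lost) else None


def avg_sales_cycle_days(
    opps: Iterable[Opp], start: date, end: date
) -> Optional[float]:
    """Average days from Opportunity Created to Closed Won."""
    cycles = [
        (o.closed - o.created).days
        for o in opps
        if o.stage == "Closed Won" and _closed_in(o, start, end)
    ]
    return sum(cycles) / len(cycles) if cycles else None
```

Both rate functions return None rather than zero when nothing closed in the period, so an empty quarter is not mistaken for a 0% win rate.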

Store these definitions in a shared data dictionary — a wiki page, a Notion database, or a Google Doc with version history — and link to it from every dashboard that uses these metrics. When a definition changes, update the dictionary and communicate the change to all stakeholders before the next reporting period.

For a deeper look at ARR metric definitions and their interaction with the forecasting model, see SaaS ARR Forecasting and SaaS MRR Forecasting Rigor.

Field Governance: Controlling How Data Enters the Model

The data model design is only as good as the data quality it generates. Field governance defines which fields exist on each object, what values are allowed (free text vs. picklist), which fields are required for records to move to the next stage, and who can edit which fields.

Field inventory principles:

Minimum required fields: For each object, define the minimum set of fields required for the record to be useful in downstream reporting. A Contact record is useful only if it has a work email, a company association, and a lifecycle stage. Make these fields required on the record creation form.

Picklist vs. free text: Fields used in filtering and grouping in reports should always be picklists (controlled vocabulary), not free text. Industry, Company Size Segment, Lead Source, and Lifecycle Stage should all be picklists. Free text fields like Contact Name, Company Name, and Next Steps are appropriate for narrative content, not for dimension-based reporting.

Immutable fields: Some fields should be set once and never overwritten. First Touch Source, MQL Date, and Lead Created Date are examples. Protect these fields from manual editing and from automation workflows that might update them incorrectly.

Calculated fields: Some fields are calculated from other fields and should never be manually edited. Days in Stage (calculated from Stage Entry Date), Lead Score (calculated from engagement + fit scoring), and Pipeline Coverage Ratio (calculated from Pipeline Value / Quota) should be formula fields or automatically updated fields, not manually maintained.

Field change governance: When a field definition or allowed values change — a new industry category is added to the picklist, an obsolete lifecycle stage is removed — the change must be managed carefully. Changing a picklist value retroactively creates reporting inconsistencies: records that previously had value "SMB" now need to be mapped to the new "Small Business" value. Build a change management process for field modifications that includes backward-compatible mapping for historical data.
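
In most CRMs these rules are configured as validation rules, required-field settings, and field-level permissions, not written as code. Still, a small Python sketch shows what the rules are supposed to enforce. All field names, picklist values, and function names below are assumptions chosen for illustration.

```python
from typing import Any, Dict, List

REQUIRED_CONTACT_FIELDS = ("work_email", "account_id", "lifecycle_stage")
PICKLISTS = {
    "lifecycle_stage": {"Lead", "MQL", "SAL", "SQL", "Customer"},
    "segment": {"Small Business", "Mid-Market", "Enterprise"},
}
IMMUTABLE_FIELDS = {"first_touch_source", "mql_date", "lead_created_date"}
# Backward-compatible mapping for retired picklist values.
LEGACY_VALUE_MAP = {"segment": {"SMB": "Small Business"}}


def normalize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map retired picklist values to their current equivalents."""
    out = dict(record)
    for field_name, mapping in LEGACY_VALUE_MAP.items():
        if out.get(field_name) in mapping:
            out[field_name] = mapping[out[field_name]]
    return out


def validate_contact(record: Dict[str, Any]) -> List[str]:
    """Return a list of governance violations (empty means valid)."""
    errors = []
    for field_name in REQUIRED_CONTACT_FIELDS:
        if not record.get(field_name):
            errors.append(f"missing required field: {field_name}")
    for field_name, allowed in PICKLISTS.items():
        value = record.get(field_name)
        if value is not None and value not in allowed:
            errors.append(f"{field_name}: {value!r} is not an allowed picklist value")
    return errors


def apply_update(record: Dict[str, Any], changes: Dict[str, Any]) -> Dict[str, Any]:
    """Apply changes, refusing to overwrite an immutable field once it is set."""
    for field_name, new_value in changes.items():
        already_set = record.get(field_name) is not None
        if field_name in IMMUTABLE_FIELDS and already_set and record[field_name] != new_value:
            raise ValueError(f"{field_name} is immutable once set")
    return {**record, **changes}
```

Running historical records through normalize before validation is what keeps an "SMB" record from dropping out of a report that now filters on "Small Business".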

The Reporting Layer: Making the Data Actionable

A well-designed data model without a well-designed reporting layer is like a clean database with no interface. The reporting layer translates raw data into the dashboards, reports, and alerts that teams use to make decisions.

Report governance principles:

Canonical reports: For each key metric, designate a single canonical report that is the official source for that number. When teams debate a metric, they go to the canonical report, not to individually built views. The canonical report is owned by RevOps, uses the formal metric definition, and is the only report used in leadership presentations.

Naming conventions: Every report and dashboard should follow a naming convention that identifies the owner team, the time dimension, and the metric type. Examples: "RevOps | Weekly Pipeline Coverage | Live" and "Marketing | Monthly MQL Trend | Last 12 Months." Consistent naming prevents the proliferation of identically named reports with slightly different calculations.

Access controls: Some reports contain data that should not be broadly visible — individual rep performance data, compensation details, or strategic pipeline information. Use role-based access controls in the CRM to restrict sensitive reports to appropriate audiences.

Report audit schedule: Every quarter, audit the canonical reports against the metric definitions. Verify that the filter criteria, field references, and calculation logic still match the formal definitions. Reports can drift from their definitions when fields are renamed, picklist values are added, or object relationships are changed without updating the reports.
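
Part of the quarterly audit can be automated. The Python sketch below checks an exported list of report names against the naming convention and flags duplicates. It reads the two example names as "Owner | Cadence Metric | Window", which is one interpretation of the convention; the owner and cadence lists are assumptions to adapt.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Optional

OWNERS = ("RevOps", "Marketing", "Sales", "CS", "Finance")   # assumed team list
CADENCES = ("Daily", "Weekly", "Monthly", "Quarterly")       # assumed cadences

NAME_PATTERN = re.compile(
    r"^(?P<owner>{owners}) \| (?P<cadence>{cadences}) "
    r"(?P<metric>[^|]+?) \| (?P<window>[^|]+)$".format(
        owners="|".join(map(re.escape, OWNERS)),
        cadences="|".join(map(re.escape, CADENCES)),
    )
)


def parse_report_name(name: str) -> Optional[Dict[str, str]]:
    """Split a conforming name into its parts; return None if it does not conform."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None


def audit_names(names: Iterable[str]) -> Dict[str, List[str]]:
    """Flag names that break the convention and names used more than once."""
    names = list(names)
    nonconforming = [n for n in names if parse_report_name(n) is None]
    counts = Counter(n.strip().lower() for n in names)
    duplicates = sorted(n for n, count in counts.items() if count > 1)
    return {"nonconforming": nonconforming, "duplicates": duplicates}
```

A name check only catches drift in labeling. Verifying that filters and field references still match the metric definition remains a manual review step.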

Dashboard design for GTM alignment:

The executive revenue dashboard should show, on a single screen: current-quarter pipeline vs. target, Closed Won ARR to date vs. target, MQL volume and MQL-to-SAL conversion rate, Churned ARR to date, and Net New ARR. This dashboard should draw from the CRM only, not from multiple systems, and should refresh daily.

The marketing team dashboard should show: total leads by source, MQL volume and conversion rate, pipeline generated by marketing source, and cost per MQL and cost per opportunity by channel.

The sales team dashboard should show: individual rep pipeline by stage, rep activity metrics (calls, emails, meetings logged), forecast by rep, and win rate by deal size segment.

Each dashboard uses the same underlying data and the same metric definitions. When the marketing dashboard shows an MQL count and the sales dashboard shows a pipeline count, the two numbers should be traceable to the same source data through the documented conversion rates in the lifecycle stage model.

Integrating the Data Warehouse for Cross-System Analytics

For companies at $10M+ ARR where GTM analytics require data from multiple systems — CRM, billing, product analytics, marketing automation — a data warehouse provides the integration layer that makes cross-system reporting possible.

The warehouse pulls data from each source system on a scheduled basis (typically nightly) and stores it in a unified schema. Data transformation logic — converting billing system subscription events to ARR calculations, joining product usage data to CRM account records, calculating derived metrics like LTV and CAC payback — runs in the warehouse rather than in individual source systems.
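
As an illustration of the first transformation mentioned above, the Python sketch below turns billing subscription events into ARR. In a real stack this logic would more likely live in a SQL model in the warehouse. The event shape (SubscriptionEvent with a signed mrr_delta) and the function names are assumptions; actual billing systems export events in their own formats.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, NamedTuple


class SubscriptionEvent(NamedTuple):
    account_id: str
    effective: date
    kind: str          # "new", "expansion", "contraction", or "churn"
    mrr_delta: float   # signed change in monthly recurring revenue


def arr_as_of(events: Iterable[SubscriptionEvent], as_of: date) -> Dict[str, float]:
    """Snapshot metric: ARR per account at the end of a given day."""
    mrr: Dict[str, float] = defaultdict(float)
    for event in events:
        if event.effective <= as_of:
            mrr[event.account_id] += event.mrr_delta
    # Accounts whose MRR has dropped to zero are no longer customers.
    return {acct: round(m * 12, 2) for acct, m in mrr.items() if m > 0}


def arr_movements(
    events: Iterable[SubscriptionEvent], start: date, end: date
) -> Dict[str, float]:
    """Flow metric: annualized ARR change by event kind within the period."""
    totals: Dict[str, float] = defaultdict(float)
    for event in events:
        if start <= event.effective <= end:
            totals[event.kind] += event.mrr_delta * 12
    return dict(totals)
```

The two functions mirror the snapshot and flow time dimensions from the metric definitions: arr_as_of answers "what is ARR today", while arr_movements answers "what changed this quarter".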

A practical warehouse stack for a B2B SaaS company:

  • Extraction: Fivetran or Airbyte for automated extraction from source systems (CRM, billing, product database, marketing automation)
  • Storage: Snowflake, BigQuery, or Redshift depending on existing cloud provider preferences
  • Transformation: dbt (data build tool) for SQL-based data transformation and documentation
  • Visualization: Looker, Metabase, or Mode for dashboard and report creation on top of the warehouse

The data dictionary that governs metric definitions at the CRM level extends to the warehouse. Every metric calculated in dbt should have a YAML-documented description that references the source fields, the transformation logic, and the metric definition it implements. This documentation is the single source of truth for what each metric means — at the data model level, not just at the reporting layer.

For how the GTM data model connects to lifecycle stage definitions, see Defining Lead Lifecycle Stages That Sales and Marketing Both Trust. For how it connects to CRM maintenance, see CRM Data Hygiene Automation Rules.

Frequently Asked Questions

What is a GTM data model?

A GTM data model is the structured definition of how customer, prospect, and revenue data is organized across the systems used by sales, marketing, and customer success. It defines the objects, the relationships between them, the fields on each object, the governance rules for data entry, and the metric definitions calculated from the underlying data.

What is the most common GTM data model mistake?

Storing revenue and opportunity data at the Lead object level rather than converting to Account + Contact + Opportunity. This creates records that cannot be included in account-based reporting. The second most common mistake is allowing multiple definitions of the same metric to coexist across teams.

How do you define metrics so they are unambiguous?

Each metric definition should include: metric name, plain-language description, calculation formula referencing specific CRM fields, inclusion and exclusion criteria, time dimension, and the owner responsible for maintaining the definition.

When do you need a data warehouse?

A data warehouse is typically justified at $10M+ ARR or when cross-system reporting needs — combining CRM, billing, and product analytics data — exceed what CRM-native reports can provide.

How do you handle attribution data in the GTM data model?

Attribution requires: First Touch Source (set once, never overwritten), Last Touch Source (updated with each interaction), and Original Source Detail (UTM parameters from first visit). Multi-touch attribution requires storing every marketing interaction as a separate event record.
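
A minimal Python sketch of those three fields plus the event log follows. The class and field names are hypothetical, touches are assumed to arrive in chronological order, and the equal-credit (linear) model is just one example of a multi-touch model, chosen here for simplicity.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class Touch:
    occurred_at: datetime
    source: str
    utm: str


@dataclass
class ContactAttribution:
    first_touch_source: Optional[str] = None
    original_source_detail: Optional[str] = None
    last_touch_source: Optional[str] = None
    touches: List[Touch] = field(default_factory=list)

    def record(self, touch: Touch) -> None:
        # Every interaction is kept as its own event record.
        self.touches.append(touch)
        # First touch fields are set once and never overwritten.
        if self.first_touch_source is None:
            self.first_touch_source = touch.source
            self.original_source_detail = touch.utm
        # Last touch is updated on every interaction.
        self.last_touch_source = touch.source

    def linear_credit(self) -> Dict[str, float]:
        """Split credit equally across all recorded touches."""
        if not self.touches:
            return {}
        share = 1 / len(self.touches)
        credit: Dict[str, float] = {}
        for touch in self.touches:
            credit[touch.source] = credit.get(touch.source, 0.0) + share
        return credit
```

Because the full touch list is preserved, a different attribution model can be applied later without losing history, which is not possible when a single source field is overwritten.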

Conclusion

The GTM data model is not a technical artifact maintained by the data engineering team — it is a business governance artifact that determines whether all teams are working from the same understanding of the business. When it is designed intentionally, with formal object hierarchy definitions, agreed-upon metric definitions, field governance rules, and a consistent reporting layer, the single-source-of-truth problem is solved by architecture rather than by argument.

The investment is significant. The payoff — revenue metrics that every stakeholder trusts, reporting that does not require weekly reconciliation, and a data foundation that scales as the company grows — is worth it.

Frequently Asked Questions

What is a GTM data model?

A GTM (go-to-market) data model is the structured definition of how customer, prospect, and revenue data is organized across the systems used by sales, marketing, and customer success teams. It defines the objects (accounts, contacts, leads, opportunities), the relationships between them, the fields that live on each object, the governance rules for how data is entered and maintained, and the metric definitions that are calculated from the underlying data.

Why do SaaS companies struggle with a single source of truth?

The primary causes are: multiple systems storing overlapping data (CRM, marketing automation, billing, product analytics) with inconsistent sync, metric definitions that were never formally agreed upon and codified, CRM data that is manually entered without validation rules, and the absence of a data governance process to catch and resolve inconsistencies before they become entrenched in reporting.

What is the correct CRM object hierarchy for B2B SaaS?

The standard B2B SaaS CRM hierarchy is: Account (company) → Contact (individual at the company) → Opportunity (active sales process for the account). Leads are a pre-conversion object that exists before a Contact and Account are confirmed; they should be converted to Contact + Account upon MQL qualification. In HubSpot, the equivalent objects are Company, Contact, and Deal. The key architectural rule is that all revenue-related data should eventually exist at the Opportunity/Deal level, not at the Lead level.

What is the most common GTM data model mistake?

Storing revenue and opportunity data at the Lead object level rather than converting to Account + Contact + Opportunity. This creates orphaned records that cannot be included in account-based reporting, pipeline analysis, or attribution models. The second most common mistake is allowing multiple definitions of the same metric to coexist across teams, for example marketing using one MQL definition for its reports and sales using a different definition for theirs.

How do you define metrics so they are unambiguous?

Each metric definition should include: the metric name, the plain-language description of what it measures, the specific CRM fields used to calculate it, the inclusion and exclusion criteria, the time dimension (snapshot, flow, or trailing average), and the owner responsible for maintaining the definition. Metric definitions should be stored in a shared wiki or data dictionary, referenced in every report that uses the metric, and reviewed quarterly.

What is the role of a data warehouse in the GTM data model?

A data warehouse (Snowflake, BigQuery, Redshift) serves as the aggregation layer where data from multiple systems is combined into a single, queryable dataset. For GTM analytics, the warehouse pulls data from the CRM, billing system, marketing automation platform, and product analytics tool, and makes it available for cross-system reporting. A warehouse is typically justified at $10M+ ARR or when cross-system reporting needs exceed what CRM-native reports can provide.

How do you handle attribution data in the GTM data model?

Attribution data requires three fields at minimum on every lead and contact record: First Touch Source (set once, never overwritten), Last Touch Source (updated with each new marketing interaction), and Original Source Detail (UTM parameters from the first visit). Multi-touch attribution models require storing every marketing interaction as a separate event record associated with the contact, not just overwriting the single source field.
