How Real-World Data Validates Clinical Trial Assumptions

Executive Summary
Real-world data (RWD) – health-related data collected outside tightly controlled clinical trials – are increasingly recognized as essential for validating the assumptions underlying clinical trials and economic models in the post-market setting. Historically, regulatory and reimbursement decisions relied almost entirely on randomized controlled trials (RCTs). However, trial results often leave unanswered questions about how interventions perform in routine practice. RWD/real-world evidence (RWE) can fill these gaps by testing whether benefits, risks, and costs observed in trials translate to broader patient populations and longer time horizons. In the post-market or reimbursement context, payers and health technology assessment (HTA) bodies have begun to use RWD to confirm or recalibrate inputs to cost-effectiveness and budget-impact models, to provide external control arms, and to monitor real-world effectiveness and safety. Leading agencies – including the U.S. FDA, the European Medicines Agency (EMA), and HTA bodies like NICE – now encourage generation of RWE to close the efficacy–effectiveness gap between trial contexts and routine care ([1]) ([2]).
This review examines multiple perspectives and case studies on how RWD are used to validate clinical trial assumptions in a post-market economic setting. We first differentiate RWD and RWE from trial data and summarize regulatory/HTA frameworks (e.g. FDA’s RWE program, NICE’s RWE framework) emphasizing lifecycle evaluation of technologies. Next, we detail the principal ways RWD validate trial assumptions: by testing generalizability to broader patients (e.g. baseline risk, comorbidities, dosing/adherence patterns), by emulating control arms (especially for single-arm trials), and by extending observation beyond trial follow-up. We describe common RWD sources (electronic health records, claims, registries, patient surveys, etc.) and analytic methods (cohort matching, propensity scores, target-trial emulation) used to align real-world patients with trial-like cohorts. Throughout, we cite specific studies: for example, a claims-data study of COX-2 inhibitors used RWD to verify the real-world rates of gastroprotective agent use, overturning expert-elicited model assumptions and dramatically changing cost-effectiveness conclusions ([3]) ([2]). In another case, re-running a 5-year atrial fibrillation cost-effectiveness model with RWD (stroke and bleed outcomes from registries rather than the pivotal RELY trial) found that dabigatran was actually more cost-effective (even cost-saving) than initially estimated ([4]). We tabulate such examples (Table 2).
Key evidence confirms the growing reliance on RWD. A recent review of UK NICE HTA submissions found that 64 (≈11%) incorporated RWD, often as external control arms or for long-term outcomes ([5]). A systematic review of CEAs observed steadily rising use of RWD in published cost-effectiveness models ([6]). Simultaneously, scholars stress the need for rigorous validation: health-economic models should check predictions against empirical real-world figures wherever possible ([7]) ([8]). The literature consistently notes limitations (confounding, data quality, selection bias) and calls for methodological care ([9]) ([10]).
Implications are profound. When RWD indicate that trial assumptions were optimistic or did not hold in practice, value assessments and reimbursement decisions may change. We discuss how payers use RWD in budget-impact analyses and outcome-based contracts: e.g. linking payment to RWE in outcomes-based agreements ([11]) ([12]). We also explore future trends: as precision medicine and adaptive approvals grow, lifecycle HTA frameworks and learning healthcare systems will increasingly rely on continuous RWE generation ([13]) ([14]). Emerging data sources (wearables, digital health) and advanced analytics (AI, federated data) promise richer RWD but also amplify privacy and bias concerns.
In conclusion, using RWD to validate clinical trial assumptions is now critical in health economics. Studies and guidelines agree that RWD should complement trial evidence by addressing real-world effectiveness, long-term outcomes, and population heterogeneity. However, careful study design and transparent reporting are imperative to ensure that post-market economic decisions are well-founded. We provide a detailed, evidence-based synthesis of current practices—highlighting concrete case studies, regulatory perspectives, and future directions—to guide stakeholders in leveraging RWD to strengthen a mature, patient-centered evidence base beyond the trial.
Introduction
Real-world data (RWD) and the real-world evidence (RWE) derived from them have emerged as vital complements to clinical trial data, especially for health economic evaluations conducted after a therapy enters the market. The U.S. FDA defines RWD as “data relating to patient health status and/or the delivery of health care routinely collected from a variety of sources” (e.g., electronic health records, claims) ([15]), and RWE as “clinical evidence about the usage and potential benefits or risks of a medical product derived from analysis of RWD” ([16]). NICE similarly emphasizes that RWD summarize outcomes of interventions in routine settings, outside tightly controlled trials ([17]). In practice, RWD include large databases of patient records (EHRs), insurance claims, disease registries, patient surveys, and increasingly digital devices; these sources can capture patient demographics, comorbidities, treatment patterns, resource use, costs, and outcomes (see Table 1).
RCTs remain the gold standard for establishing efficacy (does a drug work under ideal conditions?). However, RCT participants are often highly selected and receive intensive monitoring, and follow-up is typically short relative to the decision horizon. Therefore, there is frequently an efficacy–effectiveness gap: outcomes in the real world (effectiveness) may differ from the trial results ([1]). For example, patients in practice may have poorer adherence, more comorbidities, or different disease severity than trial enrollees. Economic models used in cost-effectiveness analyses (CEAs) or budget-impact forecasts rely on assumptions (input parameters) taken from RCTs or expert opinion. When these assumptions do not hold in real care, the models’ predictions of value or budgetary impact can be misleading.
In a post-market economic setting – referring to the phase after regulatory approval and during payer/HTA evaluation or healthcare delivery – it has become increasingly common to use RWD to validate or update the assumptions and projections made in the pre-approval phase. Health technology reassessments, price negotiations, and risk-sharing agreements all demand evidence that reflects real use. For instance, HTA bodies (NICE, CADTH, IQWiG, etc.) may require reassessment of a drug’s cost-effectiveness after several years on market; at that stage, RWD on actual utilization rates, long-term outcomes, and resource consumption can be incorporated into updated models ([18]) ([19]).
This report examines the multi-faceted role of RWD in confirming or challenging clinical trial assumptions within health economic analyses. We provide historical context on the rise of RWD use, describe key regulatory and payer initiatives, and detail methodological approaches. We illustrate with concrete cases where RWD influenced value assessments (see Table 2). The aim is to give an in-depth, evidence-based synthesis for researchers, policymakers, and industry on current best practices, as well as ongoing challenges and future prospects, in integrating RWD to validate trial-based economic models.
Table 1. Common Real-World Data Sources and Uses. Key types of RWD and illustrative examples (not exhaustive).
| Source | Example datasets | Example use for validating trial assumptions |
|---|---|---|
| Electronic Health Records (EHRs) – Data from routine clinical documentation (labs, diagnoses, prescriptions). | Example: IQVIA or academic EHR datasets (e.g., UK Clinical Practice Research Datalink, US Epic-derived records). | Estimate baseline event rates or treatment outcomes in a broad population, validate adherence rates, identify patient subgroups. |
| Administrative Claims / Billing Data – Insurance claims for services, procedures, and medications. | Example: US Medicare/Medicaid data, MarketScan, French SNDS, Taiwan National Health Insurance data. | Measure health care utilization, costs of care, and long-term resource use; assess real-world prescribing patterns and concomitant therapies ([3]). |
| Disease or Patient Registries – Systematic collection of data on patients with specific conditions. | Example: MS or cancer registries, atrial fibrillation registries (e.g., GLORIA-AF), national rare disease registries. | Provide large cohorts with long-term follow-up; can supply external control arms or long-term outcomes to supplement shorter trials ([20]) ([4]). |
| Surveys and Patient-Reported Data – Questionnaires, health surveys, or digital apps. | Example: PRO measures on quality-of-life, symptoms. | Validate assumptions about patient satisfaction or quality-adjusted life years (QALYs) beyond trial settings. |
| Digital Health / Wearables – Data from mobile health tools (activity trackers, apps). | Example: Continuous glucose monitors, step counters. | Offer granular measures of adherence or lifestyle factors to test assumptions made about behavior in RCTs. |
| Population Health Databases – Public health surveillance data. | Example: National health registries (death registries, census data). | Compare overall survival or mortality assumptions against actual population trends. |
Emerging sources (genomic databases, social determinants data, environmental sensors) also promise to enrich RWD. Importantly, while each data type has general strengths, the relevance to a given research question depends on data completeness and quality ([21]).
The Rise of Real-World Evidence: Context and Rationale
Historical Perspective
Traditionally, clinical trials underpinned all evidence for a drug’s benefit and harm. However, over the past two decades, the volume and variety of health data outside trials have exploded. Electronic records, automated billing, and big data analytics now facilitate retrospective and prospective RWE studies. In parallel, both regulators and payers have called for better post-approval evidence. The U.S. 21st Century Cures Act (2016) formally required the FDA to evaluate the use of RWE to support new indications or to satisfy post-market study requirements ([22]). In response, FDA launched its Real-World Evidence Program (2018) and committed to frameworks for approving label expansions or fulfilling commitments using RWD analyses ([22]) ([23]). Similarly, the European Medicines Agency has increasingly embraced RWE: for example, 165 regulatory cases (2013–2020) involved RWD in post-approval commitments ([24]), predominantly for safety (e.g. registry follow-up) ([25]).
Meanwhile, health technology assessment agencies have acknowledged that RCTs often leave critical gaps for real-world decision-making ([26]) ([1]). NICE’s five-year strategy explicitly aims to leverage RWD to “reduce uncertainties” and “drive access to innovations” ([27]). In 2022, NICE published a Real-World Evidence Framework to guide HTA submissions (identifying when RWD can reduce uncertainty, and outlining best practices) ([28]) ([29]). The international IMI GetReal initiative likewise highlighted RWD’s role in development and reimbursement decisions.
The Efficacy–Effectiveness Gap
RCTs ensure internal validity through tight control of patient selection and treatment delivery, but this often sacrifices external validity (generalizability). Common trial assumptions may not reflect reality:
- Population differences. Trials may exclude elderly patients or those with multiple comorbidities. RWD can show how the intervention performs in a broader, more heterogeneous population ([30]) ([10]).
- Adherence and persistence. In RCTs, adherence may be high due to monitoring, whereas in real care adherence can drop. If an economic model assumes trial-level adherence, RWD (e.g. refill rates from claims) can test that assumption ([3]).
- Comparators and treatment patterns. Trials often compare against placebo or an older therapy. RWD can identify current standard-of-care comparators and concomitant treatments that affect outcomes.
- Outcomes measurement. Trial endpoints (e.g. composite surrogate outcomes) may not capture all meaningful effects. RWD may include broader outcomes (hospitalizations, disability) or longer follow-up on mortality or safety ([4]) ([1]).
- Health care utilization. RCTs rarely collect detailed cost or resource-use data. Claims data can provide actual hospitalization rates, medication costs, and long-term economic burden required for budget models.
Crucially, the combination of RWD with trial data enables validation and recalibration of economic models (e.g. Markov models, budget-impact calculations). A health-economic model is expected to represent the real healthcare system adequately ([7]). This includes checking that model outputs align with real-world data – essentially a form of external validation ([31]) ([1]). Yet a recent review found that model validation is seldom reported: only ~2.4% of published economic models mention “validation” steps ([32]). Using RWD for validation (sometimes termed external validation or calibration) has therefore become a recommended but under-implemented practice ([32]).
Healthcare decision-makers now often require this. For example, NICE and other HTA bodies have asked manufacturers for external control arms using RWD when clinical trial comparators were lacking ([20]). Similarly, in many managed entry agreements, payers stipulate collecting real-world outcomes post-launch to verify cost-effectiveness assumptions. These trends underscore that RWD are no longer “nice-to-have” but a fundamental part of post-market evidence generation.
Regulatory and Payer Frameworks for Real-World Evidence
Regulators and payers have established formal guidance on RWD use, reflecting its rising importance:
- FDA (U.S.): As noted, FDA’s RWE Program encourages using RWD to support new indications and post-market commitments ([22]). The FDA website emphasizes RWD’s historical role in safety surveillance and its increasing role in effectiveness evaluations ([33]). Offices like CBER/CDER and the Oncology Center of Excellence have specific RWE pages. Most importantly, FDA’s 2018 framework concluded that “fit-for-purpose” RWD can generate RWE to advance product development and strengthen regulatory oversight across the lifecycle ([34]). FDA has already approved label expansions based on RWE studies (e.g. oncology indications) and supports patient registries to confirm benefits.
- EMA and EU: EMA similarly acknowledges RWE’s role. An EMA report found that RWD was used increasingly in post-approval surveillance over 2007–2020, including registry studies to confirm safety or effectiveness post-launch ([25]). New EU regulations and the Innovative Medicines Initiative promote registry and EHR data for pharmacovigilance and HTA inputs. Notably, the EMA’s Adaptive Pathways concept and PRIME scheme encourage development programs in which RWE supplements shorter trials ([1]). For example, conditional approvals in Europe may require post-market observational studies as commitments.
- HTA Agencies / Payers: Agencies like NICE (UK), CADTH (Canada), IQWiG (Germany), HAS (France), and others are explicitly integrating RWD. NICE’s 2022 RWE framework provides clear guidelines on when and how to include RWD ([35]) ([29]). Some HTAs now routinely accept RWD for rare diseases or single-arm trials where RCT comparators are missing ([20]) ([1]). Many agencies also allow or require coverage with evidence development (CED): provisional reimbursement while RWD are gathered. For instance, in France a drug must undergo reassessment after 5 years of reimbursement ([18]). Health economists are also exploring value of information methods to determine what RWD would be most informative for decision-makers.
- Managed Entry Agreements and Risk-Sharing: Globally, payers increasingly link reimbursement or pricing to RWE. Outcomes-based contracts (risk-sharing agreements) tie payment to real-world performance of a drug. Surveys of US and EU experts indicate that outcomes-based contracts will expand notably in coming years ([11]) ([12]). Both manufacturers and payers see RWE as enabling better alignment of payment with actual patient benefit: payers focus on outcome improvement and cost-risk mitigation, while manufacturers seek broader access through conditional contracts ([12]). Similarly, CED schemes specifically require RWD collection (e.g., registries) to monitor long-term impact, especially in personalized medicine and gene therapies (where upfront costs are high and evidence sparse).
In summary, the regulatory and payer landscape now actively accommodates RWD. Explicit frameworks have begun to define how RWD should be collected, analyzed, and validated for decisions ([28]) ([36]). These policies reflect a consensus that post-market RWE is crucial to verify trial-based assumptions about safety, effectiveness, and value.
How RWD Validates Clinical Trial Assumptions
RWD can test many of the key assumptions and extrapolations that underpin clinical trials and economic models:
External Validity: Patient Populations and Risks
RCT assumptions about the patient population and baseline risk are often challenged by RWD. For example, trials may exclude older adults, people with comorbidities, or minority groups. RWD from insurance databases or registries encompass these broader populations. Analysts can compare demographic and clinical characteristics across trial and real-world cohorts. The diabetic kidney disease study illustrates this: comparing 5,734 trial patients to 23,523 real-world (EHR-derived) patients, researchers found marked differences in age, diagnosis patterns, lab data, and follow-up intensity ([8]). These differences implied that naively mixing trial and RWD could be flawed; the study highlighted the need to validate compatibility before external controls are used ([8]) ([37]). In practical terms, if RWD show that baseline event rates (e.g. stroke, hospitalization) are higher or lower than assumed from the trial control arm, the economic model’s predicted outcomes or costs must be adjusted.
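To make such comparisons concrete, the sketch below (Python; all cohort values are synthetic, and the |SMD| > 0.1 flag is a common rule of thumb rather than a formal standard) computes standardized mean differences between a hypothetical trial cohort and a hypothetical real-world extract:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

def smd(trial: pd.Series, rwd: pd.Series) -> float:
    """Standardized mean difference; |SMD| > 0.1 often flags imbalance."""
    pooled_sd = np.sqrt((trial.var() + rwd.var()) / 2)
    return (trial.mean() - rwd.mean()) / pooled_sd

# Hypothetical baseline covariates shared by the trial and the RWD extract.
trial_df = pd.DataFrame({"age": rng.normal(62, 8, 500),
                         "egfr": rng.normal(55, 12, 500)})
rwd_df = pd.DataFrame({"age": rng.normal(70, 11, 5000),
                       "egfr": rng.normal(48, 15, 5000)})

for col in trial_df.columns:
    print(f"{col}: SMD = {smd(trial_df[col], rwd_df[col]):+.2f}")
# Large |SMD| values signal that trial-derived baseline risks need recalibration.
```

Covariates with large imbalance are candidates for recalibrating baseline-risk inputs before the RWD cohort is used for validation or as an external control.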
RWD thus provide empirical baseline incidence rates for patients in routine care. For example, a CEA might assume a certain rate of disease progression derived from a trial. If RWD cohort analyses show a markedly different progression rate, this would prompt recalculating model outputs. In a cost-effectiveness model comparing dabigatran vs warfarin, researchers re-ran their analysis using real cohort event rates (from observational studies) instead of the trial data ([38]). They found that dabigatran’s stroke prevention was even greater in practice, turning an ICER of roughly €8,000–13,000 per QALY into a scenario where dabigatran was dominant (more effective and cost-saving) ([4]).
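A minimal sketch of this kind of re-analysis follows; the cost and QALY inputs are invented placeholders (not the published RELY or registry figures) and serve only to show how swapping trial-based inputs for RWD-derived ones can move an ICER into dominance:

```python
# Illustrative only: all inputs are hypothetical placeholders, not the
# published RELY or registry figures.
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost-effectiveness ratio, with a dominance check."""
    d_cost, d_qaly = cost_new - cost_old, qaly_new - qaly_old
    if d_cost <= 0 and d_qaly > 0:
        return "dominant (more effective and cost-saving)"
    return f"{d_cost / d_qaly:,.0f} per QALY"

# Same model structure, two input sets: trial-based vs RWD-derived.
print("Trial inputs:", icer(14_000, 10_000, 6.10, 5.60))  # ~8,000 per QALY
print("RWD inputs:  ", icer(9_500, 10_000, 6.20, 5.60))   # dominant
```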
Comparative Effectiveness and Control Arms
When head-to-head RCTs are absent or ethical constraints preclude randomization, RWD can serve as an external control. For instance, single-arm trials (common in oncology or rare diseases) generate no concurrent control group. Payers often question the assumption that the single-arm results are valid without a comparator. To address this, manufacturers or HTAs may construct a synthetic control arm using registry/EHR patients matched to the trial cohort on key criteria ([39]) ([40]). Techniques like propensity score matching emulate the randomization process as much as possible. The credibility of this approach rests on validating that the RWD patients truly resemble the trial cohort (cf. the diabetic kidney example) and that important confounders are accounted for ([41]) ([10]).
In the NICE review of submissions, RWD was primarily used to enable comparisons for single-arm trials and to inform extrapolation of long-term survival beyond trial follow-up ([39]). However, roughly one-third of these comparisons were still “naïve” (unadjusted) – a methodological concern ([42]). Best practice is to adjust for covariates; governance bodies now recommend target-trial frameworks to design RWD studies, which improve transparency and trust ([43]) ([44]).
A striking real-world example comes from diabetes research: before the CAROLINA trial results were published (linagliptin vs glimepiride), an RWE study emulated the trial protocol in an observational cohort. It predicted the trial’s eventual primary outcome (no difference in cardiovascular events) and its secondary finding (linagliptin halved severe hypoglycemia rates) ([45]). This success – aligning a hypothetical RWD comparison with a later RCT – illustrates how carefully designed RWD studies can validate trial outcomes ([45]).
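The cohort-extraction step of such an emulation can be sketched as follows; the prescriptions table, column names, and eligibility window are all hypothetical, and a real emulation would add washout periods, outcome ascertainment, and confounding adjustment mirroring the trial protocol:

```python
import pandas as pd

# Hypothetical dispensing records: one row per prescription fill.
rx = pd.DataFrame({
    "patient_id": [1, 1, 2, 3, 3, 4],
    "drug":       ["A", "A", "B", "B", "A", "A"],
    "rx_date":    pd.to_datetime(["2022-03-01", "2022-04-01", "2022-05-10",
                                  "2021-01-15", "2022-07-01", "2022-02-20"]),
    "age":        [67, 67, 72, 58, 59, 91],
})

# New-user design: index each patient at their first fill of either drug.
first_fill = rx.sort_values("rx_date").groupby("patient_id").first().reset_index()

# Emulated trial eligibility (hypothetical age window from the protocol).
eligible = first_fill[first_fill.age.between(40, 85)]
cohort_a = eligible[eligible.drug == "A"]
cohort_b = eligible[eligible.drug == "B"]
print(len(cohort_a), "new users of drug A;", len(cohort_b), "new users of drug B")
# Follow-up, outcomes, and adjustment would then mirror the trial protocol.
```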
Long-Term Outcomes and Extrapolation
Clinical trials often have limited durations, whereas chronic diseases require lifetime models. RWD offer longitudinal follow-up. For example, a trial might observe survival for 2 years, but policymakers want to know 10-year survival. RWD from registries or claims can supply intermediate follow-up data to calibrate survival curves or disease progression models beyond the trial period. NICE identified that about 10% of HTA submissions used RWD specifically for “long-term treatment effects when extrapolating survival data beyond trial follow-up” ([46]).
In practice, this could involve fitting a parametric survival model to real-world registry data after trial completion and comparing it to the modeled extrapolation. Any significant divergence (e.g. slower real-world mortality) would lead to revising quality-adjusted life year (QALY) gains or costs. Guidance such as ISPOR’s Good Research Practices for observational studies emphasizes using appropriate RWD for extrapolations and checking consistency with trial trends.
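As an illustration of the mechanics (not a validated analysis), the sketch below uses the open-source lifelines library to fit Weibull survival curves to synthetic trial follow-up and to longer synthetic registry follow-up, then compares survival at policy-relevant horizons; the data and the divergence pattern are simulated:

```python
import numpy as np
from lifelines import WeibullFitter

rng = np.random.default_rng(0)

# Synthetic trial: ~2 years of follow-up, administratively censored.
trial_t = rng.weibull(1.3, 300) * 4.0
trial_e = trial_t < 2.0
trial_t = np.minimum(trial_t, 2.0)

# Synthetic registry: 8 years of follow-up with a less favorable hazard.
reg_t = rng.weibull(1.1, 3000) * 3.2
reg_e = reg_t < 8.0
reg_t = np.minimum(reg_t, 8.0)

wf_trial = WeibullFitter().fit(trial_t, trial_e, label="trial extrapolation")
wf_reg = WeibullFitter().fit(reg_t, reg_e, label="registry")

horizons = [2, 5, 10]
print(wf_trial.survival_function_at_times(horizons))
print(wf_reg.survival_function_at_times(horizons))
# Material divergence at 5-10 years would prompt revising QALY projections.
```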
Adherence, Persistence, and Treatment Patterns
Trials assume ideal use: patients adhere perfectly and providers follow protocol. In the real world, discontinuation rates, dosing changes, and polypharmacy are common. RWD track prescriptions and refills over time. For example, instead of an assumed 100% adherence, claims data may show only 80% of patients continue the drug after 1 year; models can incorporate this “real-world effectiveness” (which is often lower than trial efficacy).
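Deriving such an input from dispensing records is straightforward in principle; below is a minimal proportion-of-days-covered (PDC) sketch on a hypothetical claims extract (field names and the one-year window are assumptions, and production analyses must handle overlapping fills, switching, and censoring):

```python
import pandas as pd

# Hypothetical claims extract: one row per dispensing.
fills = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "fill_date": pd.to_datetime(["2023-01-05", "2023-02-04", "2023-03-10",
                                 "2023-01-20", "2023-06-01"]),
    "days_supply": [30, 30, 30, 30, 30],
})

def pdc_first_year(group: pd.DataFrame) -> float:
    """Share of days covered in the 365 days after a patient's first fill."""
    start = group["fill_date"].min()
    window = set(pd.date_range(start, start + pd.Timedelta(days=364)))
    covered = set()
    for _, row in group.iterrows():
        end = row["fill_date"] + pd.Timedelta(days=int(row["days_supply"]) - 1)
        covered.update(pd.date_range(row["fill_date"], end))
    return len(covered & window) / len(window)

print(fills.groupby("patient_id").apply(pdc_first_year))
# e.g. PDC ~0.25 vs an assumed 1.0 would materially change model outputs.
```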
The AJMC study on cyclo-oxygenase-2 (COX-2) inhibitors is illustrative ([3]). Early cost-effectiveness models assumed that patients on COX-2s did not receive gastroprotective agents (GPAs), underestimating costs. By analyzing a large claims database, researchers found that around 22% of new COX-2 users did receive a GPA, versus 15% for nonselective NSAIDs ([3]). This real-world usage contradicted expert opinion. Incorporating the RWD-based GPA rate into the decision model raised the incremental cost-effectiveness ratio of COX-2s from ~$18,600 per life-year gained (under the original assumption) to over $100,000, dramatically altering value conclusions ([3]) ([2]). Thus, RWD validated (and effectively invalidated) the prior assumption, showing that careful measurement of actual practice patterns is crucial for credible models.
Costs and Resource Use
Economic models require cost data – hospitalizations, procedures, medications – which are not fully observed in RCTs. RWD (especially claims) provide real patient-level cost information. Analysts can compare assumed costs (from tariffs or small costing studies) with mean costs observed in matched RWD cohorts. For example, if an RCT was conducted in a center with unusually low complication rates, the model might underestimate future hospital costs; RWD can expose this.
Additionally, RWD track healthcare resource utilization. If a trial drug leads to fewer events, models assume offset costs; RWD can confirm whether those offsets materialize in practice (e.g. reduced ER visits or rehab stays). A mismatch would prompt redoing budget-impact models. As one systematic review notes, RWE can inform “resource use, long-term natural history, and effectiveness” beyond what trials provide ([9]). However, it cautions analysts to handle biases (confounding, missing data) inherent in observational cost data ([47]).
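One simple check is sketched below: bootstrap the mean of (synthetic) observed claims costs and ask whether the model’s assumed annual cost falls inside the interval; all figures are invented for illustration, and the skewness of cost data is why the mean is bootstrapped rather than assumed normal:

```python
import numpy as np

rng = np.random.default_rng(1)
observed = rng.lognormal(mean=9.0, sigma=0.8, size=2_000)  # synthetic claims costs
assumed_annual_cost = 6_000.0                              # model input (hypothetical tariff)

boot_means = [rng.choice(observed, observed.size).mean() for _ in range(2_000)]
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"observed mean {observed.mean():,.0f} (95% CI {lo:,.0f}-{hi:,.0f})")
if not lo <= assumed_annual_cost <= hi:
    print("assumed cost outside the interval -> revisit the budget-impact model")
```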
Empirical Studies and Model Validation
On a higher level, health economic models themselves should be checked against RWD predictions. External validation involves comparing model outputs for key endpoints (e.g. cumulative incidence of events, mortality over time) to those observed in RWE cohorts or in post-launch registries. For instance, investigators might simulate a model cohort beyond the trial, then compare that to actual registry survival: significant divergence would trigger model re-specification.
While rarely done formally, this approach is recommended in modeling guidelines ([31]) ([48]). Recent literature agrees that validating model input data and conceptual assumptions with RWD improves confidence in model outputs ([31]) ([48]). An encouraging trend is emerging: newer HTA submissions more frequently use RWD and document adjustments, though reviewers note that many still neglect rigorous adjustment ([42]) ([32]). Overall, the evidence suggests that where RWD have been applied, they often strengthen the evidence base for HTA decisions, provided methodological rigor is maintained.
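A toy version of such a check is sketched below: a three-state Markov cohort model (alive-stable / alive-progressed / dead) with hypothetical trial-derived transition probabilities is run forward, and its predicted survival is compared year by year against synthetic “registry-observed” values; the 5-percentage-point tolerance is arbitrary:

```python
import numpy as np

# Annual transition matrix (rows: stable, progressed, dead); values hypothetical.
P = np.array([[0.90, 0.07, 0.03],
              [0.00, 0.85, 0.15],
              [0.00, 0.00, 1.00]])
state = np.array([1.0, 0.0, 0.0])  # cohort starts alive and stable

registry_survival = [0.95, 0.88, 0.80, 0.71, 0.63]  # synthetic "observed" values
for year, observed in enumerate(registry_survival, start=1):
    state = state @ P
    predicted = 1.0 - state[2]  # probability of being alive
    flag = "OK" if abs(predicted - observed) < 0.05 else "DIVERGES -> re-specify"
    print(f"year {year}: predicted {predicted:.2f}, observed {observed:.2f}  {flag}")
```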
Data Sources and Methodological Considerations
Types and Quality of RWD
RWD encompass a spectrum of real-world observations. Each type has distinct strengths and limitations:
- Electronic Health Records (EHRs). Detailed clinical information (vitals, labs, diagnoses) is available, enabling nuanced patient characterization. However, EHR data may lack standardized outcome measures or full medication compliance. EHRs are highly suited to validating clinical parameters (e.g. lab response, vital signs) and staging disease. NICE notes EHRs often integrate lab and imaging systems, making them rich sources of patient data ([49]).
- Administrative Claims. These capture billing for services with comprehensive records of hospitalizations, procedures, and prescriptions. Claims data excel at measuring utilization and direct costs, but generally lack clinical detail (e.g. lab values, over-the-counter meds). They are ideal for verifying assumptions about healthcare resource use. For example, if a trial assumed 2 hospitalizations/year for a condition, claims can reveal the actual rate.
- Registries. Disease-specific or drug registries gather data prospectively. They often have consistent outcome tracking (e.g. cancer staging, functional scales) and larger sample sizes. Registries are valuable external controls or for long-term follow-up. However, registry inclusion can be selective (centers of excellence) and data completeness varies.
- Patient-Reported Data. Registries and surveys provide quality-of-life, symptom, or functional outcomes. These help validate assumptions about health utilities or patient preferences used in QALY calculations.
- Digital and Wearable Data. Emerging sources (health apps, wearable monitors) can capture continuous data (physical activity, glucose levels). These are promising for adherence and behavior, but are still relatively novel in HTA contexts.
Quality can vary widely. As noted in comparative analyses, RWD often show missingness, measurement error, and irregular sampling intervals ([8]) ([50]). Selection bias is inherent (e.g. why patients get included in a registry), and coding inaccuracies can misclassify exposures or outcomes. Hence, validating RWD content (e.g. cross-checking EHR entries with chart review) is advised. The bottom line: only well-understood, fit-for-purpose RWD should be used to validate critical model assumptions ([34]) ([44]).
Analytical Methods
Using RWD to emulate trial assumptions requires careful study design and statistical adjustment:
- Target Trial Emulation. Researchers conceptualize the RWD analysis as mirroring an ideal randomized trial (“target trial”) ([44]). They define clear eligibility criteria, treatment strategies, and outcomes as if running a de novo trial, then extract cohorts accordingly. This approach guides the choice of inclusion/exclusion and analytic timeline. For example, in the CAROLINA emulation, the RWE study team closely replicated the trial’s exclusion rules to select new users of linagliptin vs glimepiride ([45]).
- New-User, Active Comparator Design. A key principle is to compare new initiators of one therapy to new initiators of another (active comparator) with similar clinical profiles. This avoids immortal time bias and weeds out patients already stabilized on a drug. Schneeweiss and Patorno highlight the value of this design in RWE studies ([51]), and applied it successfully in diabetes drug comparisons.
- Propensity Score and Matching. To adjust for confounding, propensity score methods (matching, weighting, stratification) are widely used ([52]). These match patients on observed covariates to mimic balanced groups. For validating trial assumptions, one might match the RWD cohort to the trial population, or vice versa, and then compare outcomes. In the diabetic kidney example, machine-learning clustering and matching were suggested to align patients ([41]). While not eliminating unmeasured confounding, good covariate balance increases confidence in comparisons (a minimal matching sketch follows this list).
- Sensitivity Analyses. Because unmeasured confounding always lurks, analyses should test robustness across assumptions. This includes varying adjustment models, using different RWD subsets, or negative-control outcomes. RWE guidelines stress full transparency (publishing protocols/analysis code where possible) to allow scrutiny and reproducibility ([52]).
- Data Linkage. Sometimes combining sources (e.g. linking EHR to claims) can enrich data for validation. For instance, linking clinical labs to cost data could allow modelers to assess how trial-level lab changes translate into downstream cost changes in practice. Such linkages require privacy safeguards but can greatly improve validation.
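As flagged in the propensity-score item above, the following sketch (synthetic data, two covariates only) performs 1:1 nearest-neighbor propensity-score matching to assemble a trial-like external control arm from an RWD pool; real studies use far richer covariate sets, choose matching with or without replacement deliberately, and report balance diagnostics:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(42)
n_trial, n_rwd = 200, 5_000
df = pd.DataFrame({
    "in_trial": [1] * n_trial + [0] * n_rwd,
    "age": np.concatenate([rng.normal(60, 7, n_trial), rng.normal(68, 10, n_rwd)]),
    "severity": np.concatenate([rng.normal(0.4, 0.1, n_trial), rng.normal(0.6, 0.2, n_rwd)]),
})

# Propensity of being a trial-like patient given observed covariates.
ps_model = LogisticRegression().fit(df[["age", "severity"]], df["in_trial"])
df["ps"] = ps_model.predict_proba(df[["age", "severity"]])[:, 1]

trial = df[df.in_trial == 1]
pool = df[df.in_trial == 0]

# 1:1 nearest neighbor on the propensity score (implicitly with replacement).
nn = NearestNeighbors(n_neighbors=1).fit(pool[["ps"]])
_, idx = nn.kneighbors(trial[["ps"]])
matched_controls = pool.iloc[idx.ravel()]

print("trial mean age:", round(trial.age.mean(), 1),
      "| matched RWD mean age:", round(matched_controls.age.mean(), 1))
# Balance on observed covariates improves; unmeasured confounding remains.
```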
Critical to all these methods is acknowledging limitations. Observational RWD cannot “prove” assumptions in the same way randomized evidence can. Instead, they can support or challenge model inputs. Systematic reviews note that while RWE is valuable for long-term pathways and resource use, inherent biases must be carefully mitigated ([53]) ([10]). Inadequate adjustment (as seen in a third of NICE submissions) can lead to misleading validation. Thus, following recognized best practices and guidelines (e.g. ISPOR’s guidance on RWE for HTA) is essential.
Case Studies: Real-World Data Validating (or Revising) Trial Assumptions
Concrete examples illustrate how RWD can confirm or alter expectations from trials:
- COX-2 Inhibitor Gastroprotective Use (AJMC 2003) ([3]) ([2]): Early cost-effectiveness models for COX-2 painkillers assumed that prescribers would not co-prescribe acid suppressants (GPAs) with the safer COX-2s, under the belief that GPAs were only needed to counteract ulcer risk from older NSAIDs. Researchers examined insurance claims data (N≈319,000) to check this assumption. Contrary to expert opinion, they found higher GPA use among COX-2 users (≈22%) than among nonselective NSAID users (≈15%) ([3]). When the model was re-estimated with these real-world rates, the incremental cost-effectiveness ratio (cost per life-year saved) for COX-2s rose dramatically – from ~$18,600 to >$100,000 ([3]). In other words, real-world practice (patients on COX-2s still used GPAs) dramatically changed the drug’s economic value. This study explicitly demonstrated the danger of relying on unchecked assumptions and showcased RWD’s role in validating utilization patterns.
- Dabigatran vs. Warfarin (European J Health Econ 2020) ([4]): Using identical cost-effectiveness models, researchers compared two scenarios: one using RCT data (from the RELY trial) and standard cost inputs, and one using RWD and registry data from routine care. The RCT-based analysis estimated dabigatran’s ICER at ~€8,100–€13,100 per QALY (for 150 mg and 110 mg doses) ([4]). However, in the RWD-based analysis (with outcomes from observational studies and real costs), dabigatran became cost-saving and more effective for both doses ([4]). This means that in practice, dabigatran prevented more strokes and bleeds (and at lower net cost) than originally predicted by trials. The authors concluded that using post-launch data “improved the efficiency of dabigatran” ([54]), and they warned that future HTAs should consider potential methodological issues when RWE is incorporated. This example shows how RWD can validate (and even amplify) trial findings in favor of a drug, thereby reinforcing reimbursement decisions.
- Predicted CAROLINA Trial Results (Diabetes) ([45]): This RWE study focused on linagliptin vs glimepiride in type 2 diabetes, closely mirroring the design of the CAROLINA trial. By mimicking the trial’s inclusion criteria in a real-world cohort, investigators predicted months ahead of time that there would be no significant difference in major cardiovascular events between the drugs, consistent with CAROLINA’s eventual published results ([45]). Moreover, the RWE forecasted a substantial reduction in severe hypoglycemia with linagliptin, exactly as seen in the trial. This successful emulation suggests that when meticulously designed, RWD studies can validate trial outcomes. It provides confidence that some trial results are reproducible in practice, supporting the relevant model assumptions (e.g. similar CV effect, better safety for linagliptin in this case).
- Validation of RCT vs Real-World Cohorts (Diabetic Kidney Disease) ([8]) ([37]): A scientific report compared an RCT dataset (5,734 patients) and an EHR-derived cohort (23,523 patients) of diabetic kidney disease. They found significant differences in patient mix, data completeness, and follow-up patterns ([8]). Critically, some patient subgroups overlapped between data sources, but others did not. The authors stressed that one must validate the compatibility of RCT and RWD before combining them (e.g. for external control arms) ([37]). They concluded that thoughtful matching and advanced methods are needed to mitigate these dataset disparities. While not a pharmacoeconomics example per se, this study underscores that before RWD is used to validate assumptions (like risk profiles), researchers must ensure the RWD are “fit-for-purpose” ([8]) ([37]).
- NICE HTA Submissions (UK) ([5]) ([55]): A recent review of NICE technology appraisals (2016–2023) quantified RWD use in real-world practice. Out of all submissions, 64 (≈11%) incorporated RWD for estimating treatment effects ([56]). These data typically came from registries or EHRs, and were mainly used for external controls or to inform long-term survival extrapolations ([57]). Importantly, the review found that about one-third of submissions still relied on unadjusted RWD comparisons, potentially biasing results ([58]). The authors recommended strict adherence to guidelines (like NICE’s RWE framework) for transparency. This example highlights both the adoption of RWD in economic submissions and the methodological pitfalls to watch out for when using RWD to validate assumptions.
These cases illustrate the spectrum of RWD validation: they can confirm model inputs (e.g. cardiovascular results), reveal hidden practice patterns (the COX-2 example), or show where RCT assumptions may fail (population differences). A summary of such cases is given in Table 2, showing the concrete impact on economic outcomes.
Table 2. Case Studies: Real-World Data Validating Trial-Based Assumptions. Each row describes how RWD were used to test a specific assumption.
| Context / Intervention | Trial Assumption or Model Input | RWD Findings (Validation) | Impact on Economic Assessment / Decision | Reference |
|---|---|---|---|---|
| COX-2 inhibitors (arthritis pain) | COX-2 users do not require gastroprotective agents (GPAs) – i.e., GPA use = 0% | Insurance claims data showed 20–22% of new COX-2 users did use GPAs (vs ~15% for other NSAIDs) ([3]). | Re-estimating the cost-effectiveness model with real GPA rates raised the COX-2 ICER from ~$18,600 to >$100,000 per life-year ([3]); showed the model must use RWD-based GPA rates and prompted calls for re-evaluation. | Cox et al. (2003) ([3]) ([2]) |
| Dabigatran vs Warfarin (AFib) | Trial event rates (RELY trial) and expert cost inputs | RWD (registry studies) showed more favorable outcomes with dabigatran (fewer strokes/bleeds than in trial), and actual cost data. | In an “ex post” analysis, dabigatran became cost-saving (more effective at lower cost) versus warfarin ([4]). RWD inputs improved realized cost-effectiveness. | Ricciardi et al. (2020) ([4]) ([54]) |
| Linagliptin vs Glimepiride (diabetes; CAROLINA emulation) | No difference in CV events; less hypoglycemia with linagliptin (based on trial design) | RWD study (target-trial emulation) predicted exactly no CV difference and substantially lower hypoglycemia for linagliptin ([45]). | Validated that trial findings are reproducible in practice, supporting model assumptions on CV outcomes and safety. | Schneeweiss & Patorno (2021) ([45]) |
| Diabetic Kidney Disease: RCT vs RWD datasets | Assumed trial and RWD populations are comparable | Comparison showed significant differences (age, labs, follow-up) and only partial overlap of subgroups ([8]) ([37]). | Highlighted need to adjust for differences when using RWD (e.g. external controls); suggested RWD can enrich trial data if matched appropriately. | Kurki et al. (2024) ([8]) ([37]) |
| NICE Technology Appraisals (UK HTA) | In some submissions, single-arm trials needed comparators; models used standard extrapolation | 64 submissions (≈11%) used RWD (registries/EHR), mainly for external controls or survival extrapolation ([5]). However, ~33% used unadjusted comparisons ([58]). | Where adjusted, RWD enabled otherwise-impossible comparisons; where naïve, risk of bias was identified. Led to recommendations to follow the RWE framework in future. | Che et al. (2024) ([5]) ([55]) |
Challenges and Limitations
While RWD offer rich opportunities, using them to validate trial assumptions comes with caveats:
- Bias and Confounding. Unlike RCTs, RWD studies lack randomization. Patient selection and unmeasured factors can distort comparisons. Observational studies must address this (see above) but cannot fully eliminate unobserved confounders. Reviewers note that bias from confounding is the chief limitation of RWD in modeling ([53]). Immortal time bias and reverse causation are particular risks if treatment timing is not handled properly ([59]).
- Data Quality and Completeness. RWD completeness varies by variable. Claims capture billed services but not over-the-counter meds or lab results. EHR data can have missing values (e.g. lab tests not administered uniformly). Misclassification can occur in coded fields. These issues affect the variables used in model inputs (e.g. disease severity). Analysts often need imputation or sensitivity analysis for missing data ([50]) ([53]).
- Generalizability of RWD Source. A registry or claims database may not represent all practice settings (e.g. academic centers vs community clinics). Therefore, care is needed before extrapolating any finding to the entire population. In Table 1 we listed common sources – each has typical biases (e.g. insured populations for claims data).
- Timing and Recency. For validating current trials, RWD must be contemporary. Using outdated RWD can lead to errors, especially if standards of care have changed. Similarly, linkage delays and data lag can hamper timely validation. On the other hand, prospective RWD collection (as in CED schemes) can mitigate this but requires infrastructure and time.
- Transparency and Reproducibility. Unlike published trials, many RWD analyses are not peer-reviewed or fully documented. The NICE review lamented that many submissions did not fully describe RWD methods ([60]). There is a movement for open-source code and protocols (e.g. through platforms like OHDSI). Adherence to reporting standards (STaRT-RWE, RECORD-PE) is recommended.
- Statistical Uncertainty. RWD cohorts are often large, but stratifying or matching can reduce effective sample size, leading to wide confidence intervals. Economic models based on uncertain RWD inputs should account for this, e.g. with probabilistic sensitivity analysis (a minimal sketch follows this list).
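As noted in the last item above, a probabilistic sensitivity analysis propagates input uncertainty through the model. A minimal sketch follows; all distributions and the willingness-to-pay threshold are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000

# Synthetic distributions standing in for uncertain RWD-derived estimates.
d_qaly = rng.normal(0.30, 0.10, n)            # incremental QALYs
d_cost = rng.gamma(20.0, 250.0, n) - 4_000.0  # incremental cost (can be negative)

wtp = 30_000  # assumed willingness-to-pay per QALY
net_benefit = wtp * d_qaly - d_cost
print(f"P(cost-effective at {wtp:,}/QALY): {(net_benefit > 0).mean():.2f}")
print(f"ICER at mean inputs: {d_cost.mean() / d_qaly.mean():,.0f} per QALY")
```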
Despite these limitations, almost all investigators agree that the value of RWD outweighs the drawbacks when used appropriately. Guidelines explicitly state that RWD should be used “when RCT evidence is incomplete or insufficient” ([1]) ([9]). The key is recognizing RWE as supplementary evidence: it informs assumptions and may confirm trial-based inferences, but rarely supplants the need for randomized data entirely.
Implications for Health Economics and Policy
The integration of RWD into post-market economic assessments has several important implications:
- More Accurate Cost-Effectiveness: As seen in the case studies, using RWD often changes CEA results. Unquestioningly trusting trial inputs can either overstate or understate value. For example, if real-world effectiveness is lower than trial efficacy, an HTA body might renegotiate price or impose usage restrictions. Conversely, better-than-expected outcomes (as with dabigatran) can lead to stronger coverage recommendations.
- Adaptive Reimbursement and Price Negotiation: Payers are recognizing that price and reimbursement decisions should be dynamic. RWD enables reassessment of price/cost-effectiveness over time. In outcome-based contracts, if RWD shows a drug underperforms (or overperforms), payments are adjusted accordingly. This aligns incentives but requires robust RWE tracking.
- Resource Allocation: RWD-driven validation helps ensure healthcare budgets target therapies that truly work in practice. Especially for expensive specialty drugs (oncology, gene therapies), confirming real-world benefit protects public funds and guides rational resource use ([1]) ([14]).
- Encouraging Pragmatic Research: The heightened focus on RWD is influencing how new clinical trials are designed. There is a trend toward hybrid designs (e.g. pragmatic trials embedded in health systems) that transition smoothly to real-world follow-up. Early in drug development, sponsors now often plan for RWE studies post-approval to address HTA concerns, aligning development with a “life-cycle” perspective ([13]) ([61]).
- Global Health Technology Assessment: Different countries vary in how they use RWD. The seven-country review noted variation: some require local registry data, others allow international evidence in certain contexts ([36]). Nevertheless, all face common challenges (data gaps, methodological uncertainty) and are moving toward more explicit RWE guidelines ([36]). Harmonizing standards is an ongoing policy effort (ISPOR, HTA network initiatives).
- Patient Impact: Ultimately, the goal is better patient outcomes. By validating trial assumptions, RWD can uncover subgroups who benefit most (or least), leading to more tailored use. Post-market evidence generation may identify safety signals earlier or confirm long-term benefit persistence. Engaging patients in registry data collection also reflects real patient experiences.
Future Directions
The role of RWD in validating trial assumptions will continue expanding, driven by technological and policy trends:
- Data Innovation: Advances in health informatics (structured EHRs, common data models) will improve RWD quality and linkage. Artificial intelligence and machine learning can handle unstructured data (e.g. extracting outcomes from clinical notes) and identify hidden confounders ([10]). The EU project “Real4Reg” and initiatives like OMOP/OHDSI are unlocking more healthcare data across borders ([62]).
- Learning Health Systems: The concept of a learning healthcare system – where care delivery continually generates data to improve practice – will fuel RWD. Precision oncology, for example, relies on continuous genomic and outcome data to adapt treatments; a “life-cycle HTA” approach has been proposed where reassessments are built into policy ([13]).
- Regulatory Science: Regulatory agencies are exploring formal frameworks for RWD use. FDA guidance stemming from the 21st Century Cures Act and PDUFA VI commitments encourages inclusion of RWD in label expansions and post-market studies. In drug approvals, a mixture of trial and RWE data (especially for rare diseases) is expected to increase. The recently established CIOMS Working Group is set to produce further guidance on RWD in regulation ([63]).
- Global Data Collaboration: To validate assumptions more robustly, international data pooling may become more common, albeit with privacy considerations. Federated learning (analyzing data where it resides) and distributed data networks could allow multi-center RWE studies without sharing raw data.
- Policy Instruments: Policy tools like managed entry agreements, conditional approvals, and re-negotiation clauses will increasingly include RWD clauses. However, to sustain this, methodological standards (how to measure outcomes, handle biases) must be agreed internationally. The seven-country review concluded that guidelines for RWD design and acceptance are urgently needed across HTA processes ([36]).
- Ethical and Equity Considerations: As RWD become integral, issues of patient privacy, data ownership, and representation arise. Ensuring that RWD studies do not amplify healthcare disparities is crucial: for example, if RWD mainly capture insured, urban populations, model validations might not apply to underserved groups. Future work must ensure inclusive real-world cohorts.
Discussion
The evidence indicates that RWD are a powerful tool for validating and refining the assumptions of clinical trials in real-world settings. Real-world cohorts often corroborate — but sometimes diverge from — the highly controlled results of RCTs, and these differences can have material impacts on economic evaluations. When RWD align with RCT findings (as in the CAROLINA case), this bolsters confidence that the trial-based models are valid. When RWD differ, they highlight where models need updating (as with COX-2 GPA use).
Crucially, this process enhances transparency. Rather than accepting model assumptions on faith or expert opinion alone, RWD demand that those assumptions be evidence-based. This scientific rigor is increasingly expected by regulators and payers. Moreover, the existence of multiple perspectives (regulators focusing on safety/effectiveness, HTAs on value, manufacturers on market access) means RWD attract attention from all stakeholders.
However, the literature underscores remaining gaps. Reviews call for better implementation of RWD methods, including target-trial designs and bias correction ([55]) ([53]). Without these, RWD studies risk perpetuating errors. In fact, one key lesson is that validation itself must be validated. The process of using RWD to check a model involves its own study design — which must withstand scrutiny. For example, if an RWD analysis uses the same data to both build and test a model, “double-dipping” errors can occur.
Interdisciplinary collaboration is needed: data scientists, epidemiologists, health economists, and clinicians must jointly design RWD validation studies. Standardization is emerging: NICE, ISPOR, and international bodies offer frameworks for study quality and reporting. Faster adoption of these best practices will benefit all.
Finally, the ongoing collection of RWD means that model validation is no longer a one-time task. With continuous monitoring (e.g. through registries or EHR networks), assumptions can be repeatedly tested as practice evolves. For instance, if a new competing therapy enters the market, RWE can detect changes in baseline risk or effectiveness over time, prompting model updates. This dynamic, iterative process — a “learn and confirm” cycle ([64]) — may become the norm.
Conclusion
Real-world data have transformed how we understand healthcare interventions beyond the trial context. In health economics, RWD serve as a reality check on the assumptions and extrapolations built into early economic models. By illuminating true patient populations, treatment patterns, and long-term outcomes, RWD-based validation can either corroborate or overturn the findings of pre-market analyses. This deeper scrutiny leads to more reliable value assessments and, importantly, more informed decisions on pricing, reimbursement, and clinical use.
We have surveyed a broad range of evidence: regulatory guidelines, systematic reviews, and concrete case examples. The consistent message is that RWD are a necessary complement to RCT data in the post-market setting. Neither data source alone suffices: trials establish efficacy under ideal conditions, while RWE verifies that these conditions and results hold in practice. For stakeholders making coverage decisions, patient access determinations, or therapeutic guidelines, RWD-driven validation mitigates the risk that trial-based optimism or assumptions go unchecked.
Looking forward, the integration of RWD will only deepen. As healthcare becomes more digital and personalized, new forms of RWD will emerge, demanding novel validation strategies (e.g. for genetic therapies or digital therapeutics). Policymakers and researchers must continue refining methodologies and governance to ensure RWD fulfill their promise. Ultimately, the goal is a learning healthcare system where evidence flows bidirectionally: RCTs inform models, and RWD continuously update and validate them. This synergy will yield economic evaluations and healthcare decisions that truly reflect the real-world impact of medical interventions.
References (select): All claims above are supported by peer-reviewed literature. For example, NICE defines RWD and its decision-making uses ([17]) ([39]); regulatory agencies emphasize RWE for lifecycle evaluation ([33]) ([1]); methodological reviews highlight RWD’s role and limitations in economic modeling ([9]) ([31]); and case studies illustrate the concrete impact on cost-effectiveness results ([3]) ([4]) ([45]). (For brevity, a full reference list is omitted here; inline citations point to the relevant sources.)