By Adrien Laurent

MedDRA & WHODrug: A Guide to Coding in Clinical Trials

Executive Summary

In global clinical trials, standardized coding of adverse events (AEs) and concomitant medications is essential for patient safety monitoring, data consistency, and regulatory compliance. The two primary coding dictionaries used in these contexts are MedDRA (Medical Dictionary for Regulatory Activities) for adverse events and WHODrug for medications. MedDRA, an ICH-endorsed hierarchical medical terminology, was developed in the 1990s to harmonize regulatory reporting globally and is now mandatory in regions such as the EU and Japan ([1]) ([2]). It contains tens of thousands of specific medical terms (over 78,000 Lowest Level Terms in version 26.1 ([3])) organized into a five-level hierarchy (System Organ Class (SOC), High-Level Group Term (HLGT), High-Level Term (HLT), Preferred Term (PT), and Lowest Level Term (LLT) ([4])). WHODrug, maintained by the Uppsala Monitoring Centre (the WHO Collaborating Centre for International Drug Monitoring), is a comprehensive global dictionary of medicinal products (nearly 500,000 unique names by 2019 ([5])) that links trade names to active substances via a structured drug code and the ATC classification system ([6]) ([7]).

Employing these dictionaries ensures that free-text entries from case report forms (CRFs) are coded consistently, enabling cross-trial aggregation, safety signal detection, and regulatory submissions. For example, each reported event verbatim is mapped to a MedDRA term ([8]), while each reported medication is mapped to a WHODrug entry ([9]) ([10]). Proper coding facilitates meaningful medical grouping; MedDRA’s hierarchical and multiaxial structure allows aggregation of related events (via PTs, HLTs, etc.), while WHODrug’s classification (by ATC and Standardised Drug Groupings) allows aggregation of related medications ([11]) ([4]). Both dictionaries are regularly updated (MedDRA biannually ([12]); WHODrug frequently to include new products ([5])) and widely adopted—WHODrug was used by over 2,300 organizations as of 2019 ([13]).

Despite these benefits, coding consistency can be challenging. Studies show inter-coder variability in assigning MedDRA terms: one review found that 12% of codes differed between two coders and 8% deviated from the original description ([14]). MedDRA’s granularity can mask events when codes are too specific ([15]), and complex or ambiguous verbatims can lead coders to select different (often related) terms ([3]) ([16]). For WHODrug, challenges include non-unique trade names across different formulations or countries, and multiple codes for one substance used in different indications ([15]). Technical advances (auto-coding tools, AI like WHODrug Koda ([17])) and standardized query tools (MedDRA SMQs) help mitigate these issues, but expert oversight remains crucial.

This report provides an in-depth review of MedDRA and WHODrug in the context of clinical trials. It covers their histories, structures, maintenance, standards adoption, and usage for coding AEs and medications. We analyze data on their content sizes, user communities, and integration into trial data systems. Multiple perspectives (regulatory guidelines, industry experience, academic studies) are presented, along with case studies (such as coding discrepancies in trials and the use of dictionaries in safety signal detection). We also consider implications for future practice, including harmonization efforts (e.g. mappings to other vocabularies), and emerging trends like multilingual versions and AI-assisted coding. All claims are supported by authoritative references from regulatory guidance, peer-reviewed research, and expert commentary.

Introduction

Background. In clinical trials and pharmacovigilance, adverse event (AE) reporting is fundamental for monitoring patient safety and for regulatory evaluation. Coding these events into standardized terms allows aggregation of data across patients, sites, and studies. Similarly, medications taken by patients (concomitant drugs, study drugs, rescue medications) must be coded to standard vocabularies to analyze drug usage and identify potential drug–drug interactions or medication-related adverse effects. Historically, various coding schemes were used (e.g. COSTART, WHO-ART, ICD) ([18]), leading to variability and difficulty in combining data. The pharmaceutical industry and regulatory bodies recognized the need for global standards. In 1994, a collaborative effort by major regulatory agencies (FDA, EMA, PMDA, etc.) and industry created the Medical Dictionary for Regulatory Activities (MedDRA) to standardize AE coding and electronic submissions globally ([1]) ([4]).

MedDRA is now the regulatory standard for pre- and post-marketing safety data in ICH regions ([1]) ([2]). It is updated twice yearly and includes a vast range of medical concepts (diseases, symptoms, procedures, etc.) ([19]) ([20]). WHODrug evolved from the World Health Organization’s international drug monitoring program, first established post-thalidomide (1968) to facilitate global pharmacovigilance ([21]) ([18]). WHODrug links medications to active ingredients via a structured code and classifies them by the WHO Anatomical Therapeutic Chemical (ATC) system ([22]) ([23]). It is similarly updated regularly and maintained by the Uppsala Monitoring Centre ([24]).

In modern clinical trials, data on AEs and medications are captured in electronic case report forms (eCRFs) and then coded. MedDRA codes adverse events: each reported term is mapped to the most precise (lowest-level) MedDRA term ([25]) ([26]). WHODrug codes medications: each drug verbatim is matched to a WHODrug record that includes the trade name, active moiety, form, strength, country, etc. ([27]) ([28]). This dual coding ensures that safety analyses and regulatory submissions use consistent, internationally-recognized terminology across all sites and countries in a trial ([29]) ([30]).

Purpose of report. This report examines MedDRA and WHODrug in the context of coding adverse events and medications in clinical trials. We cover:

  • Historical context: How and why these dictionaries were developed; organizational stewardship (ICH/MSSO for MedDRA; WHO/UMC for WHODrug) ([1]) ([24]).
  • Structure and content: The hierarchical design of MedDRA (levels from SOC to LLT) and WHODrug (Drug code, linking of trade names to ingredients, ATC classification, Standardised Drug Groupings) ([26]) ([31]).
  • Current usage: Regulatory requirements (e.g. ICH guidelines mandating MedDRA for safety reporting, PMDA requiring WHODrug for concomitant meds ([2]) ([32])), adoption in industry and CROs, and integration into EDC and safety databases.
  • Coding process: How coders use these dictionaries in practice (from CRF verbatim to coded terms), supported by guidelines (ICH points-to-consider, training materials) ([33]) ([3]).
  • Data analysis: How coded data are used for safety analysis—aggregation, signal detection, integrated listings—and how grouping structures (MedDRA’s hierarchy and SMQs; WHODrug’s classifications and SDGs) facilitate this ([34]) ([35]).
  • Challenges and inconsistencies: Empirical evidence on coding variability; potential misclassification or masking of events; difficulties with ambiguous or colloquial terms ([14]) ([36]). Practical issues like version updates (upcoding between dictionary versions) and organizational coding policies ([37]) ([38]).
  • Case studies: Examples from literature or regulatory review illustrating the impact of coding choices (e.g. antidepressant trial mis-coding, use of WHODrug to expand safety signal analysis) ([39]) ([40]).
  • Future directions: Trends such as automation (auto-coding algorithms ([41])), multilingual support (Chinese WHODrug ([42])), and harmonization (mapping between MedDRA, SNOMED, ICD) ([43]) ([4]).
  • Conclusion: Synthesis of findings to emphasize that although MedDRA and WHODrug are indispensable tools for trial safety data management, proper training, consistent practices, and ongoing improvements are needed to ensure data quality.

Throughout the report we provide extensive citations from regulatory guidance, scientific articles, and expert writings to support each claim.

MedDRA: Standardized Coding of Adverse Events

History and Stewardship

MedDRA (Medical Dictionary for Regulatory Activities) was created in 1994 by a joint effort of the European, Japanese and US drug regulatory authorities and industry (representatives of the ICH regions) ([1]). Its purpose was to replace disparate late-20th-century coding systems (COSTART, WHO-ART, ICD) and allow standardized electronic submission of safety data across the globe ([1]). The ICH endorsed MedDRA as the official adverse event terminology for harmonized reporting in pre- and post-marketing safety ([1]) ([44]). MedDRA’s maintenance and distribution are managed by the MedDRA Maintenance and Support Services Organization (MSSO, USA) and the Japanese Maintenance Organisation (JMO) ([45]). Both bodies ensure the dictionary remains up-to-date: new terms, modifications, and translations are considered in biannual releases ([45]) ([12]). The MSSO and JMO also issue guidance (e.g. the MedDRA Term Selection: Points to Consider document) and user tools.

Structure and Content

MedDRA is structured as a five-level hierarchy ([26]) ([4]). The top level consists of 27 primary categories called System Organ Classes (SOCs) (e.g. “Cardiac disorders”, “Nervous system disorders”) ([46]) ([47]). Below SOCs are High-Level Group Terms (HLGTs) and High-Level Terms (HLTs) which form intermediate clinically-relevant clusters. At the core are Preferred Terms (PTs) — unique medical concepts each representing a specific diagnosis, sign, symptom, or process. For each PT there are one or more Lowest Level Terms (LLTs) that capture synonyms, lexical variants, or verbatim expressions ([46]) ([3]). A simplified view: each LLT maps up to a single PT; PTs link to one or more HLTs, which roll up through HLGTs to SOCs. Because MedDRA is multiaxial, a PT may sit in multiple HLT/HLGT paths and thus under more than one SOC, but one primary SOC is designated for cumulative counting purposes ([48]) ([49]).

Example. The verbatim “upset stomach” might be matched to the LLT “Gastrointestinal upset”, which links to the PT “Gastrointestinal disorder” ([50]) ([51]). That PT rolls up through its HLT and HLGT to the SOC “Gastrointestinal disorders”. Meanwhile, a separate, more specific PT “Dyspepsia” also exists (choosing between such terms may require coder judgment), but all ultimately relate to GI disorders. Because MedDRA is multiaxial, some PTs belong to more than one SOC for retrieval (e.g. “Influenza” is in both the “Infections” and “Respiratory disorders” SOCs) ([52]).
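
To make the hierarchy concrete, here is a minimal data-structure sketch. It is illustrative only: the term names are hand-picked placeholders rather than an excerpt of the licensed dictionary, and it simply shows the kind of five-level path (plus multiaxial SOC links) that a coded event carries.

```python
# Illustrative only: placeholder terms, not an excerpt of the licensed MedDRA
# dictionary. Shows how one coded event carries a full five-level path and
# how a multiaxial PT can sit under more than one SOC.
from dataclasses import dataclass, field

@dataclass
class MedDRAPath:
    llt: str                 # Lowest Level Term matched to the verbatim
    pt: str                  # Preferred Term (the usual unit of analysis)
    hlt: str                 # High-Level Term
    hlgt: str                # High-Level Group Term
    soc_primary: str         # primary SOC used for cumulative counts
    soc_secondary: list[str] = field(default_factory=list)  # other SOC links

influenza = MedDRAPath(
    llt="Flu",
    pt="Influenza",
    hlt="Influenza viral infections",
    hlgt="Viral infectious disorders",
    soc_primary="Infections and infestations",
    soc_secondary=["Respiratory, thoracic and mediastinal disorders"],
)

print(influenza.pt, "-> primary SOC:", influenza.soc_primary)
```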

Scope. MedDRA covers not only adverse events but also: medical history, therapeutic indications, medical procedures, qualitative results (e.g. “increased”, “absent”), and other related medical concepts ([53]) ([19]). It thereby supports coding of not only AEs but also concomitant conditions and trial-related data (though in practice, most trials only enforce primary use for adverse events and sometimes medical history).

Size. MedDRA has grown substantially. In early versions (~1995) it had only a few thousand terms; by 2012 PT count was ~17,500 ([54]) ([55]). The current version (26.1, 2024) contains over 78,000 LLTs ([3]). PT count is in the tens of thousands. MSSO publishes detailed release notes listing counts of terms by level. The multilingual facet is also important: MedDRA LLTs and PTs have translations in multiple languages to facilitate global use, though English term names are the standard for coding.

Regulatory Adoption and Use

MedDRA is required or strongly endorsed by regulators worldwide. In the European Union and Japan it is mandatory for safety reporting (to NCAs and PMDA) and for submission of clinical trial safety data ([2]) ([4]). The U.S. FDA recommends MedDRA for adverse event reporting (and has incorporated MedDRA extensively into its systems, e.g. for drug labeling) ([2]). Internationally, virtually all pharmaceutical companies use MedDRA for trial and post-marketing safety, even in markets where it is not strictly required, for the sake of consistency and efficiency. For example, one industry training source notes: “Use of MedDRA is required by the EU for pharmacovigilance reporting and is recommended by the FDA. In practice, almost all companies use MedDRA for FDA reporting” ([56]). The MSSO/JMO provide free or reduced-cost licenses to regulatory agencies and non-profits, and commercial licensing to industry.

The ICH E2B standard for electronic adverse event reporting integrates MedDRA. When sponsors submit spontaneous reports or final study reports, the coded terms must be MedDRA PTs at minimum. Guidance documents (e.g. MedDRA Introductory Guide, MSSO points-to-consider) detail conventions such as coding rules and how to handle situations like coding multiple simultaneous events, lab terms, or signs versus diagnosis.

Table 1. Key features of MedDRA and WHODrug dictionaries summarizes some main differences:

| Feature | MedDRA (Adverse Events) | WHODrug (Medicines) |
| --- | --- | --- |
| Purpose | Standard coding of adverse events, medical history, indications, procedures ([26]) | Standard coding of medicinal products and active ingredients ([57]) |
| Maintained by | MSSO (US) and JMO (Japan) under ICH governance ([45]) ([4]) | Uppsala Monitoring Centre (WHO Collaborating Centre) ([24]) |
| Hierarchy structure | 5 levels: SOC > HLGT > HLT > PT > LLT ([26]) ([4]) | Single index with structured Drug Codes linking trade names to ingredients; adjunct ATC classification levels ([31]) ([23]) |
| Terms covered | All clinical medical concepts (diseases, symptoms, lab findings, etc.) ([26]) | Medicinal products, ingredients, formulations (including conventional, herbal, vaccines, diagnostics) ([58]) |
| Number of entries | ~78,000 LLTs (2024, ver. 26.1) ([3]) | ≈500,000 trade names (2019) ([5]); ~370,000 entries in the Excel file formats (2024) |
| Classification | SOC/HLGT/HLT/PT hierarchy; multiple SOC assignments allowed (one primary) ([48]) | Anatomical Therapeutic Chemical (ATC) classification; Standardised Drug Groupings (SDGs) for therapeutic/chemical classes ([59]) ([60]) |
| Updates | Biannual (every 6 months) releases ([12]) | Regular (frequent) releases; enhancements annually ([61]) |
| Languages | Multilingual (term translations maintained) | English (non-proprietary names); Chinese version available ([42]); others in development |
| Regulatory use | Required for adverse event reporting in ICH regions ([2]) (FDA, EMA, PMDA) | Required by PMDA (Japan) for concomitant meds in NDAs ([32]); used globally for drug coding; EMA requires ATC codes (ATC5) ([62]) |
| Cost/license | Proprietary (free to regulators, paid licenses for industry) | Proprietary (licensed by UMC) |
| Examples of use | Coding AEs from CRFs, signal detection, periodic safety reports | Coding concomitant medications, linking to safety databases, defining drug classes in analysis |

(Table compiled from multiple sources ([26]) ([24]) ([23]).)

Coding Process and Implementation

In a clinical trial’s workflow, data are first collected as free-text by investigators: patients report symptoms, clinicians record “AEs” or “medical history” verbatims on the case report form. Similarly, concomitant medications are recorded by drug name, sometimes with dose and route. These textual entries must then be assigned standardized codes from the dictionaries (a simplified lookup sketch follows the list below):

  • Adverse Event (MedDRA) Coding: A professional (or trained data manager) uses the latest MedDRA to find the matching term. Typically the verbatim description is matched to a MedDRA Lowest Level Term (LLT) whenever possible ([19]). If an exact synonym is not present, a coder picks the most clinically precise Preferred Term (PT) available ([19]). The coder uses knowledge and guidelines (e.g., MedDRA Term Selection: Points to Consider) to decide between terms and how to code multi-faceted or vague event descriptions. EDC (electronic data capture) tools often integrate a MedDRA browser or auto-suggest function to aid this. Consistency within a trial is crucial; sponsors often train coders and document coding conventions. Flexibility is somewhat limited: MedDRA users must not alter the dictionary hierarchy or term meanings (only suggest changes to MSSO) ([63]).

  • Medication (WHODrug) Coding: The verbatim drug name (and possibly strength/form) from CRFs is mapped to an entry in WHODrug. Each WHODrug record (the WHODrug Medicinal Product Data File) includes trade name (brand or generic), active ingredient(s), form, strength, country, marketing authorization holder, etc. Each medication is assigned a unique Drug Record Number plus two sequence parts in the Drug Code (ingredient variant and trade name variant) ([64]). Concomitant drugs often vary by country and brand, so coders use the contextual info (e.g. country or ingredient) to choose the correct WHODrug entry. Many organizations use automated or semi-automated coders (search engines keyed to WHODrug’s trade/generic names) to facilitate this mapping. When coding, a sponsor might list all relevant trade names with a given active substance under one drug code for group analyses.
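
To make the lookup step concrete, below is a minimal sketch of the exact-match lookup an autocoder might perform before routing unmatched verbatims to a human coder. All dictionary rows are invented placeholders, not licensed MedDRA/WHODrug content, and real tools add synonym handling, spelling tolerance, and review workflows.

```python
# Minimal exact-match autocoding sketch. All dictionary rows are invented
# placeholders; real systems add synonym handling, spelling tolerance, and
# manual-review queues.

MEDDRA_LLT_INDEX = {               # normalized verbatim -> (LLT, PT)
    "upset stomach": ("Upset stomach", "Dyspepsia"),
    "heart attack": ("Heart attack", "Myocardial infarction"),
}

WHODRUG_INDEX = {                  # normalized drug name -> (Drug Code, ingredient)
    "tylenol": ("000001 01 001", "Paracetamol"),   # placeholder DRecNo + Seq1 + Seq2
    "paracetamol": ("000001 01 001", "Paracetamol"),
}

def normalize(text: str) -> str:
    return " ".join(text.lower().split())

def code_ae(verbatim: str):
    """Return (LLT, PT) on an exact match, or None to flag for manual coding."""
    return MEDDRA_LLT_INDEX.get(normalize(verbatim))

def code_med(verbatim: str):
    """Return (Drug Code, ingredient) on an exact match, or None for manual review."""
    return WHODRUG_INDEX.get(normalize(verbatim))

print(code_ae("Heart Attack"))   # ('Heart attack', 'Myocardial infarction')
print(code_med("TYLENOL"))       # ('000001 01 001', 'Paracetamol')
print(code_ae("funny tummy"))    # None -> goes to a human coder
```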

Once data are coded, adverse events are summarized (e.g. frequency per PT or SOC) and medications are tabulated (e.g. counts by active moiety or drug class). For pooled safety analysis, dataset standards such as CDISC SDTM define fields for MedDRA and WHODrug codes (e.g. AEDECOD and AEBODSYS in the AE domain, CMDECOD in the CM domain) so that coded data flow correctly into submission packages.
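
As an illustration of how coded results land in CDISC datasets, the sketch below populates a few standard SDTM variables from the AE and CM domains; the values are invented examples.

```python
# Illustrative SDTM-style records after coding. Variable names follow the CDISC
# AE and CM domains; the values are invented examples.
ae_record = {
    "USUBJID": "STUDY01-0001",
    "AETERM": "heart attack",            # verbatim as reported on the CRF
    "AELLT": "Heart attack",             # coded MedDRA Lowest Level Term
    "AEDECOD": "Myocardial infarction",  # dictionary-derived MedDRA Preferred Term
    "AEBODSYS": "Cardiac disorders",     # primary System Organ Class
}

cm_record = {
    "USUBJID": "STUDY01-0001",
    "CMTRT": "Tylenol 500 mg",           # verbatim medication
    "CMDECOD": "Paracetamol",            # WHODrug-coded name
    "CMCLAS": "Analgesics",              # drug class (e.g. derived from ATC)
}
```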

Use of groupings and queries:

  • For MedDRA-coded events, Standardised MedDRA Queries (SMQs) may be applied to capture related events (e.g. the SMQ “Drug related hepatic disorders” groups multiple conceptually linked PTs) ([3]). For example, an investigator can retrieve hepatic laboratory-abnormality terms coded under the SOC “Investigations” together with clinical liver-injury terms.
  • For WHODrug-coded meds, the ATC classification (e.g. ATC code J01 for “Antibacterials for systemic use”) can be used to group drugs by therapeutic class. WHODrug also provides curated groupings (Standardised Drug Groupings, SDGs) to classify medications by indication or property beyond ATC. A small sketch of both grouping approaches follows this list.
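
A minimal sketch of both grouping styles, assuming an invented SMQ-style PT set and a handful of coded medications:

```python
# Illustrative grouping of coded data. The SMQ-style PT set and the small list
# of coded medications below are simplified examples, not dictionary extracts.
from collections import Counter

# Events already coded to MedDRA PTs
coded_events = ["Atrial fibrillation", "Headache", "Ventricular tachycardia", "Nausea"]

# A hypothetical SMQ-style query: a curated set of PTs for one medical concept
smq_cardiac_arrhythmias = {"Atrial fibrillation", "Ventricular tachycardia", "Bradycardia"}
arrhythmia_cases = [pt for pt in coded_events if pt in smq_cardiac_arrhythmias]

# Medications already coded to WHODrug, each carrying an ATC code
coded_meds = [("Amoxicillin", "J01CA04"), ("Ibuprofen", "M01AE01"), ("Cefalexin", "J01DB01")]
antibacterials = Counter(name for name, atc in coded_meds if atc.startswith("J01"))

print(arrhythmia_cases)   # ['Atrial fibrillation', 'Ventricular tachycardia']
print(antibacterials)     # Counter({'Amoxicillin': 1, 'Cefalexin': 1})
```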

MedDRA and WHODrug versions must be specified in protocols and submissions. Up-versioning (migrating historical data to a new dictionary release) can introduce coding changes. For MedDRA, term changes (added, modified, deprecated) occur semiannually, and sponsors track these with mapping files (e.g. MedDRA Change Analysis Tool). Similarly for WHODrug, new formulations appear frequently and have new codes. Sponsors often freeze a version for a program or re-code key databases to a common version for consistency ([65]) ([38]).
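
A common housekeeping step before up-versioning is to diff the coded terms against the new release and list what would change. Below is a minimal sketch with invented term changes (the hypothetical “moved” and “retired” entries stand in for the kinds of changes a release can introduce).

```python
# Minimal version-impact check before up-versioning coded AE data.
# The old/new LLT -> PT mappings are invented examples of possible changes.
old_map = {"Upset stomach": "Dyspepsia", "Eye pain": "Eye pain", "Old term": "Some PT"}
new_map = {"Upset stomach": "Dyspepsia", "Eye pain": "Ocular discomfort"}  # hypothetical move

def version_impact(old, new):
    moved = {llt: (old[llt], new[llt]) for llt in old if llt in new and old[llt] != new[llt]}
    retired = [llt for llt in old if llt not in new]
    return moved, retired

moved, retired = version_impact(old_map, new_map)
print("PT changed:", moved)     # {'Eye pain': ('Eye pain', 'Ocular discomfort')}
print("LLT retired:", retired)  # ['Old term']
```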

Advantages of Standardized Coding

Using MedDRA and WHODrug yields numerous benefits:

  • Consistency and clarity: Different sites and languages converge on common terminology. An event described as “heart attack”, “MI”, or “myocardial infarction” will all map to a single MedDRA PT (“Myocardial infarction”), ensuring all similar events are aggregated ([66]). Medications like “Tylenol”, “Paracetamol”, and “Acetaminophen” link to the same WHODrug ingredient code.

  • Data integration: Coded data can be pooled and compared across trials or products. This enables meta-analyses, aggregated safety summaries, and regulatory review of multi-site data.

  • Regulatory compliance: Major authorities require specific coding. For instance, the FDA’s Data Standards Catalog specifies MedDRA for adverse event terminology, and the PMDA requires WHODrug in submissions ([32]) ([19]).

  • Signal detection: Post-marketing surveillance relies on coded data. Spontaneous reports in VigiBase and in company safety databases use MedDRA (for events) and WHODrug (for drugs) so that data mining (statistical disproportionality, case evaluation) can be systematic. For example, WHODrug’s ingredient-centric coding allowed UMC to identify a new safety signal of panic attacks related to desogestrel by grouping all reports of desogestrel-containing products ([40]). A toy disproportionality calculation follows this list.

  • Multilingual processing: MedDRA has translations in multiple languages (e.g. Japanese, Spanish) and WHODrug is developing non-English versions (Chinese WHODrug) ([42]) ([67]), easing coding in global trials.
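
To illustrate the signal-detection point above, one common disproportionality statistic is the proportional reporting ratio (PRR). The sketch below computes it on invented report counts, with all of a drug's trade names pooled by ingredient as WHODrug coding allows.

```python
# Toy proportional reporting ratio (PRR) on invented report counts.
# PRR = [a / (a + b)] / [c / (c + d)], where
#   a = reports with the drug AND the event, b = reports with the drug, other events,
#   c = reports with the event, other drugs, d = reports with other drugs, other events.
def prr(a: int, b: int, c: int, d: int) -> float:
    return (a / (a + b)) / (c / (c + d))

# Invented counts for one ingredient (pooled across its trade names) and one PT
print(round(prr(a=30, b=970, c=200, d=98800), 2))  # 14.85 -> disproportionately reported
```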

Challenges in Coding and Quality Issues

Despite the strengths, the coding process is imperfect and can affect data interpretation:

  • Inter-coder variability: Multiple studies document differences between coders. Toneatti et al. reported that ~12% of adverse events were coded to different PTs by two coders, and 13% were deemed “non-accurate” by adjudicators ([68]). A systematic review found the same 12% figure and noted 8% of codes deviated from source descriptions ([14]). A recent survey of Norwegian PV coders found only 36% of coders chose the “reference” code in ambiguous cases, with common errors being substitution of terms ([69]). Factors contributing include: coder training, interpretation of vague verbatim language, and the granularity of terms (coders sometimes choose a more general LLT/PT or a more specific one inconsistently) ([36]) ([69]). For example, one coder might code “joint pain” as “Arthralgia” (PT), another might split it into “Myalgia” and “Joint swelling” if context differs. Such inconsistencies can slightly alter AE counts in analysis.

  • Granularity and masking of signals: Because MedDRA terms are highly specific, an AE may scatter into multiple categories. The PLOS review noted that “with the introduction of MedDRA, it seems to have become harder to identify adverse events statistically because each code is divided in subgroups” ([70]). In practice, one often has to manually group terms or use SMQs. Conversely, if coders use very broad terms (e.g. coding “rash” vs “severe rash”), subsequent aggregation may under- or over-count certain effects.

  • Loss of nuance: MedDRA coding abstracts away narrative detail. A physician who recorded an event might omit context that could guide coding; coders must infer. For instance, an event like “chest tightness” might be coded under “Angina pectoris” or “Chest discomfort” depending on interpretation, and such choices vary ([66]). In extreme cases, miscoding can skew trial conclusions: the infamous Study 329 trial of paroxetine in adolescents initially reported only mild “emotional lability” in the treatment group ([39]). Upon review it was revealed that many cases of suicidal ideation had been coded under less alarming terms (e.g. “emotionally labile”) ([39]).

  • Ambiguity and subjectivity: Verbatim descriptions may be ambiguous or layman’s terms. The Norwegian study highlighted that coders struggle with translating lay descriptions to clinical terms, and with synonyms (one coder’s “dizziness” may map to “Vertigo” PT vs another’s “Dizziness” PT) ([36]) ([69]). Lack of context (e.g. knowing whether a symptom is drug-related or new vs pre-existing) can lead to different PT selections.

  • Training and guidelines: MedDRA includes “Points to Consider” guides, but coding often still relies on the coder’s judgment. Organizations may have internal coding conventions, but differences can exist between companies or CROs. Inconsistent application of term selection rules (e.g. whether to code the lowest level vs a synonym PT) leads to divergence. The standard practice is to develop and document coding conventions specific to each trial or sponsor ([38]), but this adds overhead.

  • Versioning issues: When a MedDRA version updates, the hierarchy may change (new PTs, changed SOCs). If a trial spans multiple versions, AE codes may become inconsistent. Sponsors sometimes freeze a dictionary version or re-code earlier data uniformly to one version to avoid this confusion ([65]) ([38]).

For WHODrug, analogous issues include:

  • Non-unique drug names: The same brand name may exist in different countries for different formulations. Without context (strength, country), an identical name could map to multiple WHODrug records. WHODrug includes flags or additional data to distinguish “non-unique trade names” ([31]), but coder vigilance is needed to pick the right one.

  • Multiple identifiers: One active ingredient can have many trade names and salt forms. The WHODrug structure links trade names to ingredients, but a coder must know the ingredient (or at least confirm it) to group drugs properly. E.g., “amoxicillin” vs “amoxicillin + clavulanate” have different codes due to combination ingredients ([71]). If the trial data only lists “Augmentin” (brand) on some and “amoxicillin-clavulanate” on others, misalignment can occur.

  • Indication differences: A single WHODrug code may not cover off-label uses if these are coded differently under authority requirements. One challenge noted is that “different WHODrug identifiers may apply when a single drug is used for different indications” ([72]), because dosage or formulation can vary by indication.

  • Dictionary updates: WHODrug is updated regularly with new products. In long trials, a medication introduced mid-study may not be in the old dictionary version; coders then either use a placeholder code or recode when the dictionary is updated. UMC provides change analysis tools to recommend merging or splitting codes between versions ([65]). Failure to update may lead to missing coding or catch-all entries.

Data Analysis and Interpretation

The coded data become the basis for safety analysis. Key points:

  • Aggregations by MedDRA: Analysts can tabulate counts of events by SOC or preferred term ([39]), for example, frequency tables of “Headache” (PT) by treatment arm. Because MedDRA is multiaxial, analysis can include or exclude primary vs secondary SOC mapping depending on context. A small tally sketch follows this list.

  • SMQs (Standardised MedDRA Queries): The MSSO defines sets of PTs related to particular conditions (e.g. hepatic events, immune-mediated reactions) ([3]). Analysts use SMQs to capture cases that might span multiple PTs within MedDRA. For instance, an SMQ for “Cardiac Arrhythmias” may include dozens of PTs like “Atrial fibrillation”, “Ventricular tachycardia”, etc. This helps overcome the fragmentation effect. However, SMQs require careful selection and are updated with MedDRA versions ([19]) ([3]).

  • MedDRA in regulatory submissions: In clinical study reports and aggregate safety narratives, reported AEs are typically listed by PT and summarized by SOC. Graphical plots (e.g. volcano plots of AE incidence) often use MedDRA terms. Labeling sections (like the safety section of an investigator brochure or drug label) ultimately derive from MedDRA-coded data.

  • Aggregations by WHODrug: Common analyses include listing the most frequent concomitant medications by ATC class or ingredient. For example, in an oncology trial one might note that 30% of patients took any antiemetic (e.g. ATC A04). SDTM or ADaM datasets may carry the drug class in variables such as CMCLAS/CMCLASCD. Analysis programs can group by active moiety or ATC level to see class effects (e.g. how many took any NSAID, anti-hypertensive, etc.).

  • Safety signal evaluation: Because WHODrug links related products (same ingredient, different brands) under one code structure, company safety databases can readily pull all ICSRs involving any formulation of a drug. As noted, searching on the active ingredient in WHODrug can reveal signals that might have been missed if searching only one brand name ([40]). This is critical in post-marketing PV.

  • SDGs (Standardised Drug Groupings): To analyze protocol deviations or interactions, sponsors often use WHODrug’s SDGs which cluster drugs by indications or pharmacology (e.g. all CYP3A4 inhibitors, all QT-prolonging antihistamines). This structured grouping can be more intuitive than raw ATC codes ([60]).
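
As a concrete illustration of the aggregation step mentioned in the first bullet, the sketch below tallies coded events per treatment arm at the PT level. The records are invented, and a real summary would count patients (not events) and apply appropriate denominators.

```python
# Illustrative AE frequency summary by treatment arm at the MedDRA PT level.
# Records are invented; real summaries count patients with at least one event.
from collections import Counter

coded_aes = [  # (treatment arm, MedDRA PT)
    ("Drug", "Headache"), ("Drug", "Nausea"), ("Drug", "Headache"),
    ("Placebo", "Headache"), ("Placebo", "Dizziness"),
]

by_arm: dict[str, Counter] = {}
for arm, pt in coded_aes:
    by_arm.setdefault(arm, Counter())[pt] += 1

for arm, counts in by_arm.items():
    print(arm, dict(counts))
# Drug {'Headache': 2, 'Nausea': 1}
# Placebo {'Headache': 1, 'Dizziness': 1}
```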

Case Studies and Real-World Examples

  • Inter-coder Variation (Toneatti 2005): In an early study by Toneatti et al. ([73]), two experienced coders independently coded 260 AE verbatim reports from a clinical trial using MedDRA. They disagreed on the PT in 12% of cases, often choosing different but related terms. A review committee judged that 8% of the codes were inaccurate relative to expert judgment. This highlights that even trained coders can diverge appreciably, especially on difficult descriptions.

  • Misleading Coding in Published Trial: The paroxetine adolescent depression trial (Study 329) famously misrepresented harms. The coding process labeled suicidality-related events under non-serious terms “emotional lability” or “behavioral problem”, concealing the true safety issue ([39]). This case underscores that coding is not purely mechanical and can be influenced by subjective decisions (or, worst-case, by bias).

  • Signal Detected via WHODrug Query: Lagerlund et al. describe a pharmacovigilance case where a suspected link between the contraceptive desogestrel and panic attacks was first noted by searching WHODrug ([40]). By querying all adverse event reports for any product containing desogestrel (regardless of brand name), analysts identified a cluster of reports coded as “panic attacks and disorders” that putatively implicated desogestrel. This would not have been easily seen if only brand-specific coding were used. The WHODrug structure enabled expanding the search by ingredient.

  • SDG Use in Trials: Baermann and Frischmann (PhUSE 2013) discussed using WHODrug SDGs to compile protocol criteria. For example, an exclusion list might include “CYP2D6 inhibitors” – an SDG could list all medications in that category for consistent capture ([74]). In practice, some sponsors map their eCRF checkboxes for exclusion to an SDG query.

  • Version-upcoding in Long Trials: Datamanagement365 (2021) noted that multi-year studies face the challenge of coding continuity. When a new MedDRA or WHODrug release comes out, some eCRF entries may remain mapped to old codes. If a trial must report in the new version, either historical data must be updated (potentially changing code frequencies, if terms were split/merged) or separate analyses must be handled. Tools like WHODrug’s Change Analysis Tool have been created to quantify these impacts ([65]).

  • Patient vs Regulator Coding (FDA Study 2018): The MedDRA-focused study among patient groups and regulators (not a trial per se) found <3% disagreement in code assignment; patients tended to pick more general PTs than regulators ([75]). This suggests that with training, even novice coders can largely match expert selections, but being aware that patient-reported events might be coded differently is important in pharmacovigilance.

Implications and Future Directions

MedDRA and WHODrug are cornerstones of clinical safety data management, but the landscape is evolving:

  • Interoperability with Electronic Health Records (EHRs): As healthcare systems adopt standards like SNOMED CT for clinical data, mapping between SNOMED and MedDRA becomes important. Richesson et al. (2009) argued that either MedDRA should adapt or robust mappings should be developed to allow re-use of clinical EHR data for AE reporting ([76]) ([77]). Current practice often involves coding AEs to MedDRA after collection, but longer-term, automatic translation from SNOMED-based diagnoses to MedDRA for safety submissions may emerge.

  • Automation and AI: The volume of data motivates use of natural language processing. For MedDRA, some companies are implementing autocoders that use NLP and machine learning to suggest MedDRA terms from verbatim text ([78]). For WHODrug, the tool “WHODrug Koda” uses AI to pre-code medications. However, these automated solutions still require human oversight due to the subtlety in medical language. Their accuracy is improving, and they promise to reduce manual effort, especially for large safety databases.

  • Enhanced Dictionaries: The core structures of MedDRA and WHODrug have not changed dramatically, but there are expansions:

  • Multilingual user interfaces: UMC is developing non-English WHODrug functionality (e.g. Chinese characters) ([42]). MedDRA already has many language versions, but there is interest in facilitating coding in native languages by better translation support.

  • Special queries and groupings: Both dictionaries now support more advanced query tools (Standardised MedDRA Queries, MedDRA Clinical Outcomes Assessments (COA) subset, etc.). Similarly, WHODrug’s SDGs and custom grouping tools improve search strategies for combined medication properties ([60]).

  • Feedback-driven updates: MSSO holds meetings with users (e.g. MedDRA Blue Ribbon Panels) to incorporate emerging industry needs (e.g. new adverse event concepts, better dictionary IDs). Likewise, UMC continuously adds new medicinal products from markets.

  • Regulatory and Industry Trends: The increasing globalization of trials means eventual widespread adoption of these standards is inevitable. Possible future changes:

  • FDA mandates: FDA may formalize WHODrug requirements beyond pharmacovigilance (since it already requires MedDRA for AEs). Already, CDISC standards incorporate these coding dictionaries for SDTM domains (AE, CM for concomitants).

  • Expanded roles: There is discussion of using MedDRA for structured data capture at the point of care (e.g. in eCRFs, SOCs or PTs could be selectable by investigators when describing an event). This would minimize post-hoc coding but requires user-friendly interfaces.

  • Harmonization with classifications: If ICD-11 becomes more widespread, mappings between ICD-11 and MedDRA may be needed for legacy data integration. WHO’s ICD (International Classification of Diseases) and MedDRA serve overlapping but not identical purposes; efforts may refine crosswalks in the future.

  • Enhanced coding metrics: The research community may develop metrics of coding quality (similar to adjudication of events). For example, establishing routine inter-coder checks could become standard in large trial teams.

  • Training and Governance: The literature reviewed emphasizes training: targeted programs to reduce the types of errors seen (substitution, omission) ([36]). Professional coding networks and certification (e.g. CDISC or health informatics groups) may play roles in continuing education. Clear documentation of coding decisions is recommended by ICH (compliance and reproducibility).

Overall, the future points toward more integrated and automated coding – but also the recognition that coding is a critical process warranting rigor. As new therapies (e.g. gene therapies, complex biologics) and novel endpoints arise, the dictionaries will likely continue expanding. Ensuring consistency in their application remains essential to the reliability of safety evaluations.

Conclusion

MedDRA and WHODrug are fundamental instruments in modern clinical trial data management. By providing a shared language for reporting adverse events and medications, they enable robust safety monitoring, cross-study comparisons, and streamlined regulatory submissions. The adoption of MedDRA by ICH regulators and WHODrug by bodies like the PMDA underscores their global importance.

However, extensive evidence shows that coding is not foolproof: inconsistent term selection by coders can affect data interpretation ([14]) ([36]), and the sheer granularity of these dictionaries can complicate signal detection and data retrieval ([70]) ([3]). Clinical trial organizations must therefore invest in proper coder training, use of coding tools, and quality assurance procedures (e.g. double-coding, audits, clear coding conventions ([38]) ([36])). Regulatory guidance documents (ICH points-to-consider, FDA data standards) and industry consortia (PhUSE, CDISC) continue to provide valuable best practices for term selection, query design, and data integration.

Looking forward, the synergistic use of MedDRA and WHODrug in trial analysis is likely to grow. Their complementarity – one coding what happened to the patient, the other what was given to the patient – means future analytics (such as pharmacoepidemiology within trials) will increasingly rely on their interoperability. Initiatives like embedding MedDRA/WHODrug in EHRs, enhancing cross-terminology mappings, and advancing AI-assisted coding all aim to maximize the value of coded data.

In summary, MedDRA and WHODrug have transformed the handling of safety and medication data in clinical research. They bring order and clarity to otherwise heterogeneous data. Ensuring their continued quality and relevance will require ongoing collaboration among regulators, industry, informaticians, and clinicians. With rigorous application and continuous improvement, these dictionaries will support the ultimate goal of clinical trials: safe and effective patient care.

References:

  • Lagerlund et al. (2020). WHODrug: A Global, Validated and Updated Dictionary for Medicinal Information. Ther Innov Regul Sci. 54(5):1116–1122 ([6]) ([59]).
  • Richesson et al. (2008). Heterogeneous but “Standard” Coding Systems for Adverse Events: Issues in Achieving Interoperability between Apples and Oranges. Contemp Clin Trials. 29(5):635–645 ([79]) ([46]).
  • Bennekou Schroll et al. (2012). Challenges in Coding Adverse Events in Clinical Trials: A Systematic Review. PLOS ONE 7(7):e41174 ([14]) ([26]).
  • Chan et al. (2021). The Utility of Different Data Standards to Document Adverse Drug Event Symptoms and Diagnoses: Mixed Methods Study. J Med Internet Res. 23(12):e27188 ([80]) ([81]).
  • Garmann et al. (2025). Strategies and Challenges in Coding Ambiguous Information Using MedDRA®: An Exploration Among Norwegian Pharmacovigilance Officers. Drug Saf. 48:1253–1269 ([3]) ([69]).
  • EvidentIQ. The MedDRA and WHODrug Dictionaries: What Are They and How Are They Used? (2022) ([67]) ([72]).
  • ICH. MedDRA® Data Retrieval and Presentation: Points to Consider, Release 3.25 (Mar 2025) ([33]) ([38]).
  • FDA. FDA Data Standards Catalog, v5.2 (2018) ([2]).
  • UMC. WHODrug Global Implementing Guide (various releases) ([82]) ([83]).
  • Additional references cited inline throughout the text (PMC and print journal articles).


DISCLAIMER

The information contained in this document is provided for educational and informational purposes only. We make no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability, or availability of the information contained herein. Any reliance you place on such information is strictly at your own risk. In no event will IntuitionLabs.ai or its representatives be liable for any loss or damage including without limitation, indirect or consequential loss or damage, or any loss or damage whatsoever arising from the use of information presented in this document. This document may contain content generated with the assistance of artificial intelligence technologies. AI-generated content may contain errors, omissions, or inaccuracies. Readers are advised to independently verify any critical information before acting upon it. All product names, logos, brands, trademarks, and registered trademarks mentioned in this document are the property of their respective owners. All company, product, and service names used in this document are for identification purposes only. Use of these names, logos, trademarks, and brands does not imply endorsement by the respective trademark holders. IntuitionLabs.ai is an AI software development company specializing in helping life-science companies implement and leverage artificial intelligence solutions. Founded in 2023 by Adrien Laurent and based in San Jose, California. This document does not constitute professional or legal advice. For specific guidance related to your business needs, please consult with appropriate qualified professionals.
