By Adrien Laurent

AI Governance in Pharmacovigilance Signal Detection

Executive Summary

The advent of artificial intelligence (AI) and machine learning (ML) has ushered in a transformative era for pharmacovigilance (PV), particularly in the realm of safety signal detection. Global pharmaceutical consortia and regulatory bodies alike recognize the potential of AI to enhance drug safety monitoring, but also the need for robust governance to manage risks. In late 2025 the Council for International Organizations of Medical Sciences (CIOMS) formally released its Working Group XIV report on Artificial Intelligence in Pharmacovigilance ([1]). This landmark report establishes a consensus “safety net” of principles—ranging from risk-based oversight and human participation, through transparency and data quality, to accountability—to guide the responsible deployment of AI in PV ([2]) ([3]). Simultaneously, TransCelerate BioPharma, an industry coalition, has produced extensive resources (surveys, case studies, frameworks) on emerging digital tools in PV ([4]) ([5]).

These parallel initiatives underscore a convergence: achieving the promise of AI (greater speed, coverage and insight in signal detection) requires equally rigorous governance structures. AI-augmented PV can process immense volumes of Individual Case Safety Reports (ICSRs), integrate diverse data sources (electronic health records, literature, social media) and identify complex patterns days or weeks faster than traditional methods ([6]) ([7]). Industry pilots suggest striking gains – for example, AI-assisted ICSR triage has been shown to cut manual processing time by ≈67%, shaving 20 minutes per case ([8]) ([9]). However, these potent tools also raise concerns: algorithmic bias, data privacy, “black-box” uncertainty, and the risk of undetected errors with patient-safety implications. Thus, both CIOMS and TransCelerate urge that governance be built in – through risk-based strategies, validation protocols, continuous monitoring, and clear accountability – to ensure that AI serves as an enabler of better signal detection without compromising patient safety ([10]) ([3]).

This report provides a comprehensive analysis of the current state and future direction of AI in PV signal detection, focusing on governance frameworks. We begin with background on pharmacovigilance and historic motivations for AI, then review TransCelerate’s and CIOMS’s roles in shaping best practices. We examine how AI/ML techniques (including NLP and large language models) are applied to signal detection and the evidence of their performance ([11]) ([6]). A central section dissects AI governance: key principles (risk-proportionate oversight, explainability, data ethics) and practical models (validation strategies, oversight committees, phased implementation) from industry and regulatory sources ([2]) ([12]). Case studies illustrate how organizations pilot and scale AI-based signal processes under these guidelines (e.g., parallel human review during initial triage) ([13]). Lastly, we discuss implications: regulatory trends (FDA’s Emerging Drug Safety program, EMA draft reflections ([14]) ([15])), workforce readiness, and emerging innovations like federated learning and predictive safety modeling ([16]) ([17]).

In summary, translating AI’s promise into patient benefit mandates disciplined governance. The guidance of CIOMS WG XIV and TransCelerate, backed by academic and industry research, provides a practical roadmap: embed human expertise at critical junctions, demand transparency and continuous validation, and align AI use with fundamental PV objectives. With these frameworks, AI-enabled signal detection can evolve PV from a reactive process to a proactive, data-driven guardian of drug safety.

Introduction and Background

Pharmacovigilance (PV) is the discipline of monitoring the safety of medical products post-launch, primarily through the collection and analysis of adverse event reports. At its core is signal detection – the process of identifying new or unexpected adverse drug reactions (ADRs) from routine data sources ([18]). Conventionally, sponsors and regulators rely on statistical disproportionality methods (e.g. Proportional Reporting Ratio, Reporting Odds Ratio) applied to individual case safety report (ICSR) databases, as defined in ICH guidances and CIOMS frameworks ([18]). These techniques have been effective but suffer inherent limitations: they work in periodic batches (leading to time lags of weeks), depend on coded structured data (ignoring narrative details), and often treat data sources (e.g. national ICSR systems) in isolation ([19]) ([20]). Meanwhile, the volume and heterogeneity of PV data are skyrocketing. The FDA alone receives over 4.3 million ICSRs annually ([9]). New unstructured sources – electronic health record (EHR) data, medical literature, social media posts – offer vast safety information but evade classical methods without intense manual curation.

In parallel, artificial intelligence (AI) and machine learning (ML) have matured dramatically. Modern ML models (decision trees, ensemble learners, neural networks) and NLP techniques can process large, complex datasets and adapt autonomously. In PV, this means algorithms can sift through millions of reports, extract meaning from narratives, and flag subtle patterns that humans or simple statistics might miss ([6]) ([20]). For example, deep learning has detected early signs of drug-induced retinopathy years before clinicians did, and NLP models can parse patient complaints on forums for emergent safety signals ([21]) ([16]). These capabilities promise to shift PV from retrospective, after-the-fact analysis toward predictive surveillance, potentially identifying risks even in early clinical stages ([17]).

TransCelerate BioPharma Inc., a non-profit consortium of major pharmaceutical companies, has been proactive in harnessing AI for PV. Its Intelligent Automation Opportunities in PV initiative (completed in 2025) surveyed member companies and developed practical tools on automation (including AI, robotic process automation, NLP) across PV processes ([22]) ([4]). TransCelerate’s research, published in Drug Safety (2022), documented a clear industry trend: life sciences organizations are rapidly moving from planning to piloting and deploying intelligent automation in case processing ([23]). These surveys found that companies are using multiple AI techniques in tandem (e.g. ML for coding and RPA for data entry) and highlighted top challenges – chiefly the scarcity of high-quality training data and a lack of harmonized regulatory guidance for AI validation ([4]). TransCelerate has since produced guidance assets (e.g. technology matrices, case studies on signal management automation ([5])) to help companies navigate these challenges.

Meanwhile, CIOMS Working Group XIV (WG XIV) was formed under the auspices of the World Health Organization in 2022 specifically to address AI in PV. This interdisciplinary group (regulators, industry, academia, patient reps) met through 2025, culminating in a public-consultation draft (May 2025) and final report (Dec 2025) on Artificial Intelligence in Pharmacovigilance. The CIOMS report explicitly frames AI in PV as a “safety net” challenge: it offers transformative tools but demands a commensurate ethical/operational framework. In its summary, CIOMS WG XIV introduced seven guiding principles – risk-based oversight, human oversight, validity, transparency, privacy, fairness, and governance – all designed to ensure AI enhances rather than undermines drug safety ([2]) ([1]). These principles underscore that patient safety primacy must not be compromised by pursuit of efficiency gains ([24]) ([2]).

Scope of Report: This document examines the intersection of TransCelerate’s industry initiatives and the CIOMS WG XIV framework in the context of PV signal detection (especially by AI). We analyze current technologies and evidence in AI-enabled signal detection, then focus on AI governance – the policies, processes, and oversight needed to implement AI in PV safely. The goal is to present a deep, evidence-based synthesis drawing on academic studies, industry reports, and regulatory publications. We consider multiple perspectives (industry, regulatory, ethical), include quantitative data where available, and discuss real or conceptual case studies. Ultimately, we aim to elucidate how AI can be responsibly governed in PV — turning innovative models into reliable, compliant signal systems.

TransCelerate’s Initiatives in Pharmacovigilance

TransCelerate Overview

TransCelerate BioPharma Inc. is a global non-profit consortium of major pharmaceutical companies founded in 2012. Its mission is collaborative innovation to accelerate the delivery of new medicines. Members share resources and best practices on clinical and regulatory science. In pharmacovigilance, TransCelerate has formed working groups to address common challenges. Notably, the Interpretation of Pharmacovigilance Guidances & Regulations (IGR PV) initiative produces guides to harmonize diverse global PV rules ([25]). Another key initiative has been Intelligent Automation in PV, aiming to harness emerging technologies (AI/ML, robotics, NLP) to improve efficiency and quality in safety reporting ([22]).

TransCelerate’s value lies in combining industry expertise into consensus-driven outputs. This collective approach reduces duplication of effort and provides regulators with unified industry positions. For example, TransCelerate’s IGR PV published manuscripts clarifying contentious safety reporting topics (e.g. definition of anticipated SAEs, reference safety information) ([26]). Similarly, its AI/automation work examines data standards, best practices, and change management considerations across large member companies, ensuring that any tools or frameworks are practical for industry-wide adoption ([4]) ([27]).

Intelligent Automation in PV

The Intelligent Automation Opportunities in Pharmacovigilance initiative (completed April 2025) focused on identifying how “intelligent automation” technologies can streamline PV processes ([28]). The initiative performed an impact assessment and worked with global health authorities to vet risks ([29]). Its deliverables include:

  • Surveys of Company Adoption (2019–2021): As reported in Drug Safety (2022), TransCelerate surveyed member companies on their use of automation in PV. The findings showed rapid adoption: firms have moved from planning to pilot to production for many rule-based automations, and are increasingly applying ML/AI for tasks such as event coding and duplicate detection ([23]).
  • Technology Matrix: An interactive catalog classifying various AI/automation tools and their readiness for PV tasks (available on TransCelerate’s website).
  • Case Study Themes: For example, an “AI-based Validation Case Study Themes” asset describes scenarios of validating ML models in PV ([27]).
  • Signal Management Survey: A report “Intelligent Automation Opportunities for Signal Detection” (November 2023) summarized industry needs and experiences in automating signal management processes ([27]).
  • Implementation Tools: Tools like the Interactive ICSR and Automation Technologies Tool (IATT) help companies match processes to suitable technologies ([30]).
  • Published Literature: TransCelerate members have published peer-reviewed articles drawing on these efforts (e.g. Kassekert et al. in Drug Safety, 2022) ([23]).

From these resources, some key industry perspectives emerge:

  • Broad Interest: Nearly all large pharma companies are exploring AI/ML in PV. They see it as delivering “human-like interpretation of data” to augment decision-making in case processing ([4]).
  • Combined Technologies: Solutions often integrate multiple tools (e.g. ML + robotic process automation) for the same task ([4]).
  • Primary Use Cases: Initial focus areas include duplicate report detection, automated MedDRA coding of events, and extracting data from narrative case descriptions ([31]) ([32]).
  • Challenges: The two biggest hurdles are training data (acquiring sufficient, high-quality annotated data for ML) and regulatory uncertainty. Companies reported difficulty in obtaining representative datasets to train and validate models ([33]). They also expressed a need for clearer regulatory guidance to know how to qualify their AI systems ([33]).

TransCelerate’s outputs are therefore geared toward addressing these hurdles: by educating PV teams on technology capabilities, benchmarking which steps can be automated, and providing draft protocols for model validation. For example, TransCelerate’s repository includes an outline for validating intelligent automation systems consistent with quality frameworks.

Overall, TransCelerate plays a dual role: (1) Innovator & Educator, by pooling lessons from member pilots to advance PV technology (as evidenced in conference talks and publications), and (2) Facilitator of Harmonization, by engaging regulators (as it has done for other guidances) to ensure new tools fit into existing safety frameworks. In the AI era, such industry-wide collaboration helps translate the fast-moving AI frontier into robust industry standards.

The Role of AI in Signal Detection – Technology Perspective

AI in pharmacovigilance has most tangibly impacted signal detection, where complex data challenges align with AI strengths. This section delves into how AI/ML techniques augment traditional methods, with evidence from recent studies and practical reports.

Traditional Signal Detection vs AI-Enabled Detection

Traditionally, PV signal detection is performed by periodically analyzing the incidence of drug-event combinations in ICSR databases. Methods like the Proportional Reporting Ratio (PRR) or Bayesian Confidence Propagation Neural Network (BCPNN) flag combinations that occur disproportionately more often than expected ([19]); a minimal worked example of the PRR computation appears after the list below. These methods operate on structured data and typically run in batch (e.g. monthly or quarterly) or ad-hoc on demand ([34]). Their limitations include:

  • Time Lag: Batch processing means signals can only be detected periodically. A new safety issue might emerge weeks or months before the next analysis arrives ([19]).
  • Structured Data Dependency: If events are buried in free-text (e.g. narrative descriptions, forum posts, literature), traditional disproportionality misses them unless a human has coded them into structure ([19]) ([35]).
  • Single-Source Focus: Conventional algorithms typically analyze one database at a time; signals that only emerge by correlating multiple sources (e.g. sales data plus EHR plus spontaneous reports) can be missed.
  • Bias towards Known Patterns: Disproportionality assumes known event categories. It may under-detect novel or complex interactions (e.g. idiosyncratic ADRs triggered by genetic factors) ([19]).
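
To ground the comparison, here is a minimal Python sketch of the PRR computation referenced above. The counts form the standard 2×2 contingency table for a drug-event pair; the numbers and the PRR ≥ 2 screening rule are conventional illustrations, not data from the cited sources.

```python
# Minimal sketch of classical disproportionality: the Proportional Reporting Ratio.
# a, b, c, d are the cells of the 2x2 contingency table for one drug-event pair.
def prr(a: int, b: int, c: int, d: int) -> float:
    """PRR = [a / (a + b)] / [c / (c + d)]
    a: reports with the drug and the event of interest
    b: reports with the drug and any other event
    c: reports with other drugs and the event
    d: reports with other drugs and other events
    """
    return (a / (a + b)) / (c / (c + d))

# Illustrative counts only. A common screening rule flags a potential signal
# when PRR >= 2 with at least 3 co-reported cases.
value = prr(a=12, b=988, c=40, d=99_960)
print(f"PRR = {value:.1f}")  # -> PRR = 30.0
```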

AI-powered detection addresses these gaps through several capabilities ([6]):

  • Continuous Surveillance: ML models can analyze data as it arrives in real time. As Yong et al. describe, rather than quarterly snapshots, AI enables continuous monitoring that can spot emergent patterns days or weeks ahead of schedule ([6]).
  • Multi-Source Integration: Advanced algorithms can simultaneously ingest heterogeneous data. An AI model might fuse ICSRs with electronic health record signals, literature-derived insights, and patient-generated data (social media, registries) ([16]). For example, it can correlate a spike in EHR-recorded chest pain with related posts on patient forums, triangulating a signal that neither source alone would reveal.
  • Unstructured Data Mining: Natural Language Processing (NLP) extracts information from free text. Crucial details in an ICSR narrative (e.g. “patient developed severe neuropathy 4 weeks after dosing”) can be parsed and structured for analysis ([35]). Similarly, NLP can monitor unstructured sources: one pilot successfully screened social media for COVID-19 vaccine side effects ([36]).
  • Capturing Complex Relationships: Deep learning models can learn non-linear patterns and interactions beyond raw counts. As SakaraDigital notes, they can detect subtle shifts (temporal clustering, co-medication interactions) that standard ratios miss ([20]). In practice, ensemble ML methods (random forest, gradient boosting) have shown markedly higher accuracy than PRR/ROR on benchmark tasks ([37]).

Evidence of AI Effectiveness

Academic reviews and pilot studies collectively indicate that AI can indeed enhance signal detection, though careful design is needed. A recent open-access review found that “machine learning algorithms generally outperformed traditional frequentist or Bayesian measures of disproportionality” in retrospective analyses ([11]). In several studies, supervised ML classifiers (gradient boosting, random forests) achieved very high ROC-AUC scores (up to ~0.97) for identifying known safety signals while controlling false alarms ([37]). For example, one model trained on a curated positive/negative control dataset achieved an area under the ROC curve of 0.973, versus 0.458 for standard PRR on the same set ([37]). Although these are idealized figures (often using enriched datasets), they demonstrate the potential of modern AI to dramatically boost sensitivity without sacrificing specificity.
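
For orientation, the sketch below shows the general shape of such an evaluation: train a gradient-boosting classifier on labeled drug-event pairs, then score it with ROC-AUC on held-out data. It uses synthetic features and scikit-learn as an assumed toolset; the cited studies do not disclose their implementations, so treat this as illustrative only.

```python
# Hedged sketch: ROC-AUC evaluation of an ML signal classifier on synthetic data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000
# Hypothetical per-pair features: report volume, disproportionality score,
# recency, and a crude completeness score (all synthetic here).
X = rng.normal(size=(n, 4))
# Synthetic "known signal" labels, loosely dependent on the first two features.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=n) > 1.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"hold-out ROC-AUC = {auc:.3f}")
```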

NLP tools have also proven effective. MedLEE (an older NLP system) achieved ~97% sensitivity in extracting clinical concepts from emergency department notes compared to human review ([38]). More recent transformer-based LLMs (e.g. GPT-3.5) have shown robust abilities to comprehend medical text and classify adverse event reporting forms ([39]). In one head-to-head comparison, GPT-3.5-assisted triage correctly identified 78% of COVID-19 vaccine signal outbreaks (versus 65% for a baseline model) ([39]), effectively dismissing many irrelevant signals.

Practical Pilot Results: Industry case reports further illustrate gains. One whitepaper cites an example where automating the coding and triage of non-serious ICSRs yielded a 67% reduction in processing time ([9]). AI-based duplicate detection reduced redundant cases by ~23%, and LLM-assisted data extraction saved ~20 minutes per case ([8]). In a cited clinical use-case, a model flagged hydroxychloroquine-induced retinopathy years ahead of the typical clinical diagnosis, enabling early intervention ([21]). While such dramatic cases await broader validation, they underscore AI’s ability to surface signals from subtle or early patterns that manual review might overlook.

In addition, innovative approaches are emerging. Federated learning – where multiple organizations train a shared model on their private data – is being explored for signal detection. By allowing companies and health systems to collaboratively build AI models without exchanging raw data, federated methods could dramatically improve statistical power without violating privacy ([40]). If realized, federated signal surveillance could draw on global datasets in a way unachievable by any single entity, surfacing rare ADRs in smaller markets or patient subgroups.
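
The sketch below illustrates the core of federated averaging (FedAvg), the canonical federated-learning aggregation scheme, using a toy logistic-regression “signal model” on synthetic data. It is a conceptual sketch only: a real deployment would use a dedicated framework (e.g. Flower or TensorFlow Federated) and add secure aggregation, neither of which is shown.

```python
# Conceptual FedAvg sketch: each organization trains locally and shares only
# model parameters, never raw case data.
import numpy as np

def local_update(weights: np.ndarray, X: np.ndarray, y: np.ndarray,
                 lr: float = 0.1, epochs: int = 5) -> np.ndarray:
    """A few steps of logistic-regression gradient descent on private data."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))    # predicted signal probability
        w -= lr * X.T @ (p - y) / len(y)    # gradient step on local data only
    return w

def fed_avg(updates: list[np.ndarray], sizes: list[int]) -> np.ndarray:
    """Aggregate local models, weighted by each site's dataset size."""
    total = sum(sizes)
    return sum(w * (n / total) for w, n in zip(updates, sizes))

rng = np.random.default_rng(1)
global_w = np.zeros(3)
for round_ in range(10):                    # federation rounds
    updates, sizes = [], []
    for site in range(4):                   # four participating organizations
        X = rng.normal(size=(200, 3))       # private local data (synthetic here)
        y = (X[:, 0] > 0).astype(float)
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    global_w = fed_avg(updates, sizes)
print("global model weights:", np.round(global_w, 2))
```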

Overall, current evidence suggests AI can be a powerful supplement to traditional PV analytics ([11]) ([6]). Machine learning excels at high-volume, repetitive tasks (triage, coding) and at pattern recognition in big data ([6]) ([20]). It is already being applied in practice for these purposes. However, signal detection is safety-critical. The same studies also repeatedly emphasize that performance must be interpretable and validated in context. AI’s ability to spot a signal should not substitute for human medical judgment; rather, it should augment analysts by presenting prioritized hypotheses for further evaluation ([41]) ([42]).

Implementing AI Governance in Signal Detection

The previous section highlighted AI’s capabilities in PV. Equally important is how to deploy AI so that its benefits are realized safely. This calls for a comprehensive AI governance framework – policies, processes, and oversight to manage AI systems in PV. Both CIOMS WG XIV and industry experts agree that governance must address multiple dimensions: risk management, human oversight, validation, transparency, data ethics, and accountability. Here we analyze these aspects and present structured approaches drawn from recent literature and guidelines.

Core Governance Principles

CIOMS WG XIV distilled several core governance principles – essentially guardrails – for AI in PV ([2]) ([3]). We summarize and align these with broader AI best practices:

  • Patient Safety Primacy (Risk-Proportionate Oversight): The overarching rule is that AI must improve or at least not diminish patient safety. Oversight and rigor should scale with risk: a simple triage bot on non-serious ICSRs can be validated with lighter controls than an AI that directly generates safety signals or label changes. SakaraDigital highlights this “risk-proportionate” approach, noting that controls for automating non-serious case intake differ from those for autonomous signal evaluation ([43]). CIOMS similarly emphasizes risk-based oversight and a phasing-in of AI according to criticality ([2]) ([44]).
  • Human-in-the-Loop / Human Oversight: Qualified PV professionals must remain responsible for decisions. Even when AI suggests actions, a human must review and concur before regulatory submissions or patient-impacting actions ([45]) ([46]). WG XIV defines “Human-in-the-Loop” (HITL; humans train or review each output) and “Human-on-the-Loop” (HOTL; humans periodically audit that AI is working correctly) and calls for implementing these as appropriate ([45]). For example, a pilot might begin with all AI-flagged signals confirmed by safety scientists before acting. (A minimal routing sketch illustrating this kind of gate follows this list.)
  • Validity and Robustness: AI models must be fit for purpose, performing reliably on real-world data. This means rigorous validation before deployment and continuous monitoring for drift ([47]) ([48]). Validation should cover the full range of use cases and inputs, ensuring models properly handle data variability and edge cases ([47]). CIOMS expects organizations to demonstrate that their AI consistently meets predefined performance metrics even as data evolves.
  • Transparency and Explainability: Users must be able to understand or at least audit AI outputs. “Black box” models have limited acceptability in a regulatory context. Whenever possible, organizations should use interpretable models or provide explainability tools that clarify AI reasoning ([49]) ([50]). Even when complex neural networks are used, their decision logic (and any uncertainty) should be documented so that safety experts can challenge or contextualize results ([47]). CIOMS and others stress disclosing both when AI is used and why a given output (e.g. "this case is high-risk") was generated, to build trust and facilitate inspection.
  • Data Privacy and Security: AI in PV often uses sensitive personal health data ([51]). Governance must ensure compliance with privacy laws. Special care is needed with large language models: e.g. personally identifiable or HIPAA-protected data must not leak through an LLM’s training or outputs. De-identification, encryption, and privacy-preserving architectures (e.g. on-premises models) are necessary to protect patients. CIOMS explicitly warns that generative AI could re-identify patients from “anonymized” PV datasets if misused ([51]).
  • Fairness and Bias Mitigation: ML models learn from historical data, which can embed biases (e.g. underreporting in certain countries or demographic groups). Governance must include bias audits and corrective measures ([51]) ([52]). For instance, if social media data skews younger, an AI signal system might miss ADRs prevalent in older patients. Companies should test models on diverse subpopulations and ensure equitable signal sensitivity. The CIOMS report underscores that AI should not amplify inequities; fairness checks must be documented.
  • Accountability and Governance Structure: Ultimately, humans (and organizations) are responsible for AI, not the software itself ([53]) ([12]). Clear roles and accountability frameworks must be defined: Who approves an AI tool? Who signs off on its outputs? Who monitors its compliance? CIOMS makes “governance and accountability” a core principle (not a soft suggestion) ([53]). Practically, this means high-level oversight committees, model risk teams, and formal procedures for change management.
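
To make the human-oversight principle concrete, here is a minimal sketch of a confidence-gated routing rule of the kind described in the Human-in-the-Loop bullet above. The dataclass, field names, and 0.90 threshold are illustrative assumptions, not prescriptions from CIOMS.

```python
# Hedged sketch of a human-in-the-loop gate for AI-assisted triage.
from dataclasses import dataclass

@dataclass
class AiAssessment:
    case_id: str
    predicted_serious: bool
    confidence: float  # model's score in [0, 1]

CONFIDENCE_FLOOR = 0.90  # below this, a human must review (a policy choice)

def route(assessment: AiAssessment) -> str:
    """Return the workflow queue for a case. Humans always confirm serious or
    low-confidence calls; only confident non-serious calls are auto-routed."""
    if assessment.predicted_serious:
        return "human_review"        # HITL: an expert confirms before reporting
    if assessment.confidence < CONFIDENCE_FLOOR:
        return "human_review"        # uncertain output escalates to a human
    return "standard_processing"     # still subject to HOTL periodic audits

print(route(AiAssessment("CASE-001", predicted_serious=False, confidence=0.97)))
# -> standard_processing
```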

These principles, synthesized from CIOMS, SakaraDigital and other sources, form the backbone of an AI governance policy for PV. Next, we detail operationalizing these concepts in the lifecycle of signal detection AI.

Organizational Governance and Oversight

Translating principles into practice requires specific structures. Expert recommendations typically propose layered governance, as illustrated below:

  • AI PV Steering Committee: A cross-disciplinary body (PV, data science, IT, quality, regulatory) that provides strategic direction. It approves major deployments, allocates resources, and resolves conflicts. The SakaraDigital blog suggests constituting such a committee to oversee AI strategy ([54]).
  • Data Governance Board: Responsible for the datasets driving AI. It sets standards for data quality, lineage, and bias assessments. For example, this board would review which ICSR fields are used, how missing data is handled, and how training sets are curated ([55]) ([56]).
  • Model Risk Management (MRM) Function: An independent review team (akin to financial institutions’ MRM) that challenges each model’s development and controls. MRM ensures no single PV or IT group has unchecked authority. It replicates the risk assessment and validation that the developer did, providing a second opinion ([57]).
  • Change Advisory Board: For computer systems, any significant change (e.g. to AI model version or config) typically undergoes CAB review. The governance model extends this concept: any retraining, parameter tweak or data pipeline change passes through a formal review for regulatory impact ([55]).
  • Quality Assurance (QA)/Validation Group: Works closely with MRM to audit the validation process. In a regulated context, QA ensures the CSV documentation meets standards. For AI, QA must adapt validation documentation (see Table 1 below).

By mapping out these roles and committees, an organization establishes who makes AI-related decisions and how. Documentation of this governance – charters, meeting minutes, decision logs – is itself vital evidence for regulators that AI use is controlled.

AI Model Lifecycle and Validation

A characteristic theme is that AI model deployment is not “set-and-forget.” Models degrade as data shifts or as new drug patterns emerge. Governance requires a lifecycle approach ([58]):

  1. Foundation Phase (Months 1–6): Before any model coding, ensure prerequisites are in place. Key activities include:
  • Data Assessment: Evaluate the quality and completeness of PV data sources. Clean, normalize, and, if needed, link disparate systems (e.g. unify legacy ICSR, EHR and literature data) ([59]).
  • Use-case Selection: Perform risk-benefit analysis to pick candidate use cases. As SakaraDigital advises, prioritize high-volume, routine tasks where error can be tightly controlled, such as coding non-serious ICSRs ([42]).
  • Governance Setup: Formulate the committees and procedures above. Draft AI governance policies (validation plan template, roles/responsibilities matrix).
  • Infrastructure: Deploy the computing stack (data lakes, model development tools) and define technical standards (e.g. MedDRA/E2B for inputs ([58])).
  • Risk Assessment: For each planned AI use, identify failure modes (misclassification, bias). Document risk mitigation (e.g. thresholds to trigger human review).

Phase 1 deliverable example: A validated dataset of past ICSRs with coded outcomes that the pilot AI will learn from.

  2. Pilot Phase (Months 4–12): Begin controlled AI deployment in non-critical settings ([13]). SakaraDigital outlines a typical pilot plan:
  • Scope Definition: E.g. “Automate triage of non-severe, non-expedited ICSRs.” This is an example of a low-risk use-case recommended by AI governance guides ([60]).
  • Parallel Operation: Run the AI model in shadow mode: it scores cases, but humans still make final decisions. This allows measurable comparison (the “ground truth” from human processing vs AI’s suggestions) ([13]).
  • Validation and Testing: Collect performance data (accuracy, recall, time saved). Adjust model hyperparameters and retrain if necessary, under the oversight of QA and MRM.
  • Human Oversight: Ensure that at defined points in the workflow, humans must verify AI outputs. For instance, a working procedure might say “If the AI flags a serious event, a senior PV physician must review the case.”
  • Documentation: Record all changes to model parameters, validation results, and any issues. This audit trail is critical for inspection readiness ([61]).

Phase 2 deliverable example: Proof-of-concept report showing AI maintained ≥95% concordance with human triage on a test set, along with recommendations for going live.

  3. Expansion Phase (Months 10–24+): Once pilots demonstrate acceptable performance, scale up.
  • Broaden Use Cases: Introduce AI to more complex processes (e.g. flagging ICSRs for expedited reporting, supporting signal generation).
  • Integration: Embed AI systems into existing workflows and databases. For example, an AI module might feed its high-confidence signal alerts directly into the signal management dashboard.
  • Continuous Monitoring: Implement automated checks for model drift (e.g. input data distribution monitoring) and key performance indicators (are false alarms creeping up over time?). A minimal drift-check sketch follows this list.
  • Retraining Cycles: Regularly refresh models with new data. Governance protocols should stipulate when retraining is triggered and require re-validation before redeployment.
  • Audit and Inspection Preparedness: Maintain up-to-date records (model cards, training data snapshots, change logs) in anticipation of regulatory audits.
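
As one concrete form of the continuous-monitoring step above, the following sketch computes the Population Stability Index (PSI) for a single model input. PSI is one common drift statistic among several, and the 0.2 alert threshold is an industry rule of thumb, not a regulatory requirement.

```python
# Hedged sketch of input-drift monitoring with the Population Stability Index.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Compare the deployment-time distribution of a model input ('actual')
    against its distribution in the training data ('expected')."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf            # catch out-of-range values
    e = np.histogram(expected, bins=edges)[0] / len(expected)
    a = np.histogram(actual, bins=edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(2)
train_feature = rng.normal(0.0, 1.0, 10_000)  # e.g. a feature's training distribution
live_feature = rng.normal(0.4, 1.2, 2_000)    # the same feature in live data
score = psi(train_feature, live_feature)
print(f"PSI = {score:.3f}", "-> investigate drift" if score > 0.2 else "-> stable")
```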

Table 1 below contrasts traditional computer system validation (CSV) with the AI-adapted approach needed for PV systems. Traditional CSV assumes fixed logic, whereas AI demands performance-based validation:

| Validation Aspect | Traditional CSV Approach | AI/ML-Adapted Approach |
| --- | --- | --- |
| Requirements Definition | Fixed functional specifications (program does exactly X) | Performance-based specifications with defined accuracy/recall thresholds (e.g. model must achieve ROC-AUC ≥ 0.90 on benchmark tasks) |
| Testing Methodology | Scripted test cases for specific inputs | Statistical testing on representative datasets; metrics (e.g. precision, recall) evaluated quantitatively |
| Change Management | Versioned releases with full regression testing | Model versioning with automated drift alerts; re-validation triggered by significant data shifts or architecture changes |
| Ongoing Compliance | Periodic configuration review | Continuous performance monitoring with automated alerts; routine performance audits (e.g. annual retraining) |
| Documentation | Detailed test scripts/execution records | Model cards (detailing architecture and training data), bias assessment reports, performance benchmarks, audit trails |

Table 1. Adapting Computer System Validation for AI in Pharmacovigilance.

In practice, a PV organization might adapt its quality system as follows: instead of “functional specification documents”, it relies on model performance specifications (e.g. “The ML classifier must identify duplicated reports with ≥98% sensitivity and ≤5% false positive rate” ([48])). During testing, rather than running a few edge-case scenarios, the team evaluates the model on large hold-out datasets and statistical metrics (e.g. F1-score, ROC-AUC). All changes to an AI system are managed like software, but with the added step that any model update undergoes re-validation: e.g. a monthly retraining on new ICSRs would require confirming that accuracy and bias metrics are still acceptable before it goes live.
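
A minimal sketch of such a pre-deployment release gate appears below; the metric names and thresholds echo the examples in the text and Table 1 but are otherwise illustrative assumptions.

```python
# Hedged sketch of a release gate enforcing performance-based specifications.
REQUIREMENTS = {
    "sensitivity": 0.98,          # e.g. duplicate detection must reach >= 98%
    "false_positive_rate": 0.05,  # and stay at or below 5% false positives
    "roc_auc": 0.90,
}

def release_gate(measured: dict[str, float]) -> list[str]:
    """Return a list of failed requirements; an empty list means the model may
    proceed to deployment (still subject to QA/MRM sign-off)."""
    failures = []
    for metric, threshold in REQUIREMENTS.items():
        value = measured[metric]
        # false_positive_rate is a ceiling; the other metrics are floors.
        ok = value <= threshold if metric == "false_positive_rate" else value >= threshold
        if not ok:
            failures.append(f"{metric}: measured {value:.3f} vs required {threshold:.2f}")
    return failures

result = release_gate({"sensitivity": 0.985, "false_positive_rate": 0.041, "roc_auc": 0.93})
print("PASS" if not result else f"BLOCKED: {result}")
```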

Data Governance and Standards

AI’s effectiveness and trustworthiness depend on data. Governing the data lifecycle is as important as the models themselves ([56]). Key considerations include:

  • Data Quality and Representativeness: Ensure training and testing datasets are comprehensive and unbiased. For example, if historical PV data has gaps (some countries under-report, or certain age groups underrepresented), these gaps must be corrected or handled (by weighting or additional data collection) to prevent the model from learning spurious patterns.
  • Data Lineage and Versioning: Record where input data came from (e.g. which ICSR database versions, literature databases) and how it was preprocessed. This is crucial for traceability if later a signal is questioned. A data governance board should set policies for data version control and archiving.
  • Privacy and De-identification Controls: Raw PV data often contains personal health identifiers. Data used for model training must be de-identified according to regulations (HIPAA, GDPR, etc.). Any residual re-identification risk must be evaluated. For instance, a generative AI chatbot used on PV data should not reveal patient details inadvertently.
  • Standards Alignment: Use and enforce recognized coding standards (MedDRA for terms, E2B(R3)/XML for transmissions) to ensure AI systems can interoperate with existing PV tools ([58]). The CIOMS/Sakara guidance notes that AI designs should assume interoperability – e.g. models ingesting AE data must expect MedDRA-coded facets to align across sources.
  • Bias Audits: Regularly test models for performance across subgroups (age, gender, ethnicity, geography). If the model underperforms systematically for a subgroup, address it via data augmentation or algorithm adjustments ([51]) ([52]).
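
The following sketch shows one simple form of the subgroup bias audit described in the last bullet: comparing model recall across age strata with pandas. The column names, toy data, and five-point tolerance are illustrative assumptions.

```python
# Hedged sketch of a subgroup bias audit on a labeled evaluation set.
import pandas as pd

# Synthetic evaluation data: true label vs model prediction, with an age stratum.
df = pd.DataFrame({
    "age_group": ["<18", "<18", "18-64", "18-64", "65+", "65+", "65+", "65+"],
    "true_signal": [1, 1, 1, 0, 1, 1, 1, 0],
    "predicted":   [1, 0, 1, 0, 1, 0, 0, 0],
})

def recall(g: pd.DataFrame) -> float:
    """Fraction of true signals the model caught within one subgroup."""
    positives = g[g["true_signal"] == 1]
    return float((positives["predicted"] == 1).mean())

by_group = df.groupby("age_group").apply(recall)
print(by_group)
# Flag the audit if recall varies too much across strata (tolerance is a choice).
if by_group.max() - by_group.min() > 0.05:
    print("Bias audit: recall gap exceeds tolerance -> investigate / augment data")
```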

By instituting these data governance measures, organizations minimize the risk that a model will learn “garbage in.” Clean, ethical data are the foundation for any trustworthy AI in PV.

Organizational Action Items

To put principles into practice, experts recommend concrete steps. For example, SakaraDigital lists eight critical action items for life sciences companies deploying PV AI ([42]):

  1. Map Current PV Processes for AI Opportunities: Create a detailed inventory of PV workflows (from case intake to signal review) and identify where AI can fit. Focus on high-volume, routine steps (e.g. triage, duplicate checking) with clear human benchmarks ([42]).
  2. Establish Accountability Frameworks: Clearly assign responsibility for each AI component. Document who “owns” the model outputs, who is accountable for patient safety, and how oversight flows. The principle is that while AI assists, a qualified human must be the final responsible person.
  3. Develop AI-Specific Validation Protocols: As above, build a tailored validation plan that addresses ML peculiarities: data quality checks, performance metrics selection, bias evaluation, etc. This often means writing new SOPs or expanding existing CSV processes to include AI.
  4. Implement Robust Human Oversight: Design workflows with forced human checks at critical points. For example, an SOP might require a pharmacovigilance expert to review any signal flagged by AI before any regulatory report is submitted.
  5. Document AI Behavior Thoroughly: Maintain exhaustive documentation for regulators. This includes algorithm design documents, data used, performance results, and decision logs. Notably, “model cards” – concise descriptions of what a model does and its limitations – are increasingly recommended.
  6. Assess and Mitigate Biases: Incorporate fairness checks into every model’s lifecycle. This might involve algorithmic fairness tests, manual review of flagged cases for subgroup bias, or involving patient groups in model evaluation.
  7. Engage Regulatory Authorities Proactively: When planning to use AI in PV, inform regulators (EMA, FDA, PMDA, etc.) early. Many agencies appreciate transparency and may offer guidance on expectations. Joint workshops and sandboxes (e.g. FDA’s AI in clinical tools initiative) are emerging venues for dialogue.
  8. Plan for Regulatory Evolution: Build flexibility into the AI governance framework. Regulations like the EU AI Act and potential ICH/GMP updates are on the horizon. Organizations should monitor these developments and design their AI processes to adapt (e.g. by modular validation).

These steps, combined with strong leadership support, ensure that AI deployments in signal detection remain constructive. They also help meet CIOMS-specific calls (e.g. CIOMS expects continuous monitoring and regulatory engagement ([58])).

Case Study: Pilot Implementation of AI Triage

While detailed proprietary case studies are rare in public literature, composite scenarios can illustrate best practices. Consider a large pharmaceutical company's pilot of an ML-based ICSR triage system, following the above governance model:

  • Use-Case: Automatically prioritize incoming ICSRs by seriousness and expectedness, routing "low-risk" cases for standard handling and flagging potential expedited cases.
  • Data Preparation: The company consolidates its historical 5-year ICSR database, ensuring fields (e.g. event terms, seriousness criteria) are normalized. Any PHI is removed. SME (Subject Matter Expert) teams label a gold-standard subset of reports as "priority" or "non-priority" to train/validate models.
  • Model Development: A supervised ML classifier is trained to predict the SME labeling. The algorithm selected is a gradient boosting machine, chosen for interpretability and performance. The model explicitly outputs a risk score.
  • Validation: The ML model is evaluated on a hold-out set. It achieves 96% sensitivity and 90% specificity at a chosen threshold. These metrics meet pre-defined efficacy requirements (≥95% sensitivity) set by the AI governance committee.
  • Parallel Operations: For 3 months, new ICSRs are processed both by the old manual triage and the new AI system. The outputs are compared: the model matched the human triage decisions 95% of the time, and in a few instances it caught a serious case that humans had missed (these cases are audited to refine model inputs). A concordance-check sketch follows this list.
  • Human Oversight: During pilot, even if AI marks a case as low priority, a second reviewer spot-checks a random sample. Any mismatches trigger review of the model.
  • Governance Reviews: At pilot initiation, the AI Steering Committee approved the project and its governance framework. Monthly reviews examine model drift (sudden changes in input patterns). All model changes are logged with the Model Risk Management function.
  • Outcomes: The company reports a ~65% reduction in median case processing time for the pilot batch ([9]). Based on consistent performance, the Steering Committee approves full-scale rollout of AI triage. Post-pilot, the system is integrated so that human case processors see the AI’s priority score next to each ICSR, aiding faster review.
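
A sketch of the pilot's parallel-operation analysis might look like the following: measure raw agreement between human and shadow-mode AI decisions, plus Cohen's kappa to correct for chance agreement. The data and the use of scikit-learn are illustrative assumptions.

```python
# Hedged sketch: concordance between human triage and shadow-mode AI decisions.
from sklearn.metrics import cohen_kappa_score

human = ["priority", "routine", "routine", "priority", "routine", "routine"]
ai    = ["priority", "routine", "routine", "priority", "priority", "routine"]

agreement = sum(h == a for h, a in zip(human, ai)) / len(human)
kappa = cohen_kappa_score(human, ai)
print(f"raw concordance = {agreement:.0%}, Cohen's kappa = {kappa:.2f}")
# Disagreements (here, case 5) would be audited to decide whether the model or
# the historical human decision was correct, per the pilot's oversight procedure.
```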

In this (hypothetical) scenario, the company followed recommended steps: careful data governance, performance validation, and defined human checks. Citations support components of this story (e.g. the phased pilot approach ([13]) and efficiency gains ([9]) ([8])). Importantly, any model-flagged signal would undergo further human evaluation as per policy, exemplifying CIOMS’s insistence that humans remain “in control” ([45]).

Regulatory Frameworks and Standards

No discussion of AI governance would be complete without recognizing current regulatory stances. While formal regulations on AI in PV are still emerging, agencies have signaled key expectations through publications, pilots, and international collaborations.

  • CIOMS Working Group XIV (2025 report): As described, CIOMS WG XIV’s report is the first authoritative multi-stakeholder guideline specifically on AI in PV. Its principles (human oversight, explainability, etc.) carry significant weight because CIOMS has longstanding influence on PV practice globally ([1]) ([2]). Companies should align their AI strategies with this framework as an authoritative reference.
  • EMA/HMA: The European Medicines Agency and Heads of Medicines Agencies have been active. In 2025 they initiated a multi-stakeholder workshop on AI ([62]). Earlier, a draft reflection paper from EMA highlighted PV: it stated that marketing authorization holders should “validate, monitor and document” their AI/ML models as part of the PV system, especially for signal detection and adverse event surveillance ([15]). EMA’s approach emphasizes continuous oversight and integration of AI into existing Good Pharmacovigilance Practices (GVP). The EU AI Act (which entered into force in 2024, with obligations phasing in through 2027) will also impose strict controls on “high-risk” AI, a category PV systems are likely to fall under. This means stricter documentation, risk management, and possibly third-party audits will be legally required in the EU.
  • FDA (U.S.): The FDA has been forward-looking but cautious. Building on its earlier emerging-technology efforts, CDER launched the Emerging Drug Safety Technology Program (EDSTP) in 2024 ([14]). The EDSTP collaborates with industry and academia to test AI tools (e.g. applying ML to FAERS for better signal detection). The FDA has not yet issued binding rules on AI in PV, but it observes projects and has indicated openness: it explicitly invites companies to engage on AI use-cases and warns that inspections will check for validation of any AI used in regulated processes (e.g. ensuring algorithms meet GxP expectations, as with other computerized systems).
  • World Health Organization (WHO): While WHO has not yet released PV-specific AI rules, it encourages the concept of “safety-nets” like CIOMS suggests. Also, WHO’s 2021 Special Programme on digital health signals interest in AI ethics. Future WHO tech reviews may cover PV.
  • Other regulators: Japan’s PMDA has an AI Action Plan (2025) focusing on the agency’s own use of AI, which will presumably extend to signal work. Health Canada and Australia’s TGA have published conceptual frameworks for AI in healthcare but no PV-specific guidelines yet. Nevertheless, they participate in global alignment efforts (e.g. through ICH and IMDRF) toward consistent expectations for AI in pharmacovigilance.

Implications for Compliance: While detailed AI rules are forthcoming, sponsors should anticipate that regulatory inspections will scrutinize AI use under existing frameworks. For instance, any AI pipeline used for signal detection will be considered part of the pharmacovigilance system. Under ICH E2E and GCP/GMP principles, a violation (e.g. missed signal due to model error) can lead to enforcement even if the issue arose from AI. Thus, adopting CIOMS/TransCelerate guidance now is prudent. It ensures that when regulators do define requirements (e.g. in an official guideline or law), companies already have the necessary documentation and controls in place.

Discussion and Future Directions

Implementing AI governance in PV signal detection is not a one-time project but an evolving journey. Here we discuss broader implications, challenges, and future trends.

  • Shifting PV Paradigm: As PVpharm and CIOMS envisage, AI could transition PV from a retrospective reporting discipline to a continuous, predictive science ([63]). For example, with robust AI, a safety team might monitor real-world signals in real-time (e.g. using hospital EHR feeds) rather than wait for case reports. Such a paradigm shift requires cultural as well as technical change: PV professionals will need AI literacy, focusing more on interpreting AI findings than on manual case processing ([64]).
  • Workforce and Training: CIOMS emphasized that PV experts must adapt. Training programs must now include data science fundamentals and ethics. Companies may hire data scientists or upskill existing staff. TransCelerate’s approach of combining bio-pharma domain experts with technical teams (e.g. as was done for IGR PV) will likely repeat in AI governance bodies.
  • Bias and Equity: Despite best efforts, AI may reveal biases. For example, if a model was trained mostly on adult clinical trial data, it may under-detect pediatric signals. Governance must include ongoing bias audits. Transparency principles mean that companies should report known model limitations (e.g. “This signal detector is less reliable on drugs with fewer than 100 ICSRs”).
  • Patient Involvement: CIOMS, reflecting WHO values, encourages patient engagement. Future governance could involve patient representatives in reviewing AI tools (analogous to patient reps on safety committees). This aligns with a general shift toward “patients as partners” in PV, ensuring that patient rights and perspectives shape AI use.
  • Inter-company Collaboration: Federated learning (as mentioned) could be a game-changer for signal detection. However, it requires industry collaboration on standards. Consortia (like TransCelerate or new alliances) may sponsor federated PV platforms. Regulatory bodies might also facilitate anonymized data sharing frameworks.
  • Advanced Technologies: Beyond current ML/NLP, generative AI (large language models) will loom large. These tools can summarize vast literature or generate hypotheses from data, but they can also hallucinate. Governance must evolve for generative AI specifically. For example, OpenAI’s GPT might be used to draft PV narratives; oversight would require a new set of tests (e.g. verifying that no hallucinated drugs appear; a minimal sketch of such a check follows this list).
  • Regulatory Harmonization: Expect initiatives like ICH to eventually address AI. ICH expert working groups may expand their scope beyond data harmonization to include AI tools. Harmonized global standards will ease trans-national pharmacovigilance. Until then, organizations will try to satisfy the strictest regime (likely the EU’s) as a baseline.
  • Ethical and Legal Considerations: Beyond technical governance, there are bigger questions. If an AI misses a safety signal and patient harm results, liability falls on the MAH. Thus, legal frameworks may evolve to attribute risk. Ethically, the industry and regulators must balance innovation with the precautionary principle; governance frameworks embody this balance.
  • Continuous Improvement: Feedback loops will refine governance. For instance, after an AI use is audited (say, after an FDA inspection), lessons learned (gaps in documentation, unexpected biases) should revise the corporate policy. Formalizing these lessons (in SOPs or updated CIOMS guides) will make AI governance a living system.
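
As one illustration of the generative-AI oversight tests mentioned above, the sketch below checks a drafted narrative for drug names that do not appear in the source case. The tokenization and dictionary are deliberately naive assumptions; a real control would use a proper drug lexicon (e.g. WHODrug) and entity recognition.

```python
# Hedged sketch of a "hallucinated drug" guard for an AI-drafted PV narrative.
import re

CASE_DRUGS = {"atorvastatin", "metformin"}           # drugs actually in the ICSR
KNOWN_DRUG_DICTIONARY = {"atorvastatin", "metformin", "lisinopril", "warfarin"}

draft = "Patient on atorvastatin and warfarin developed myalgia after two weeks."

# Naive word-level matching against the dictionary (illustrative only).
mentioned = {w for w in re.findall(r"[a-z]+", draft.lower())
             if w in KNOWN_DRUG_DICTIONARY}
hallucinated = mentioned - CASE_DRUGS
if hallucinated:
    print(f"Draft blocked pending human review; unexpected drugs: {hallucinated}")
```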

Conclusion

The integration of AI into pharmacovigilance signal detection holds immense promise: accelerating identification of adverse reactions, expanding data horizons, and ultimately protecting patients more effectively ([6]) ([17]). However, this opportunity comes with responsibility. AI models, if unchecked, can introduce new risks that undermine trust in PV. Achieving the promise of AI requires explicit governance – a combination of principled guidelines and pragmatic execution.

This report has surveyed the landscape as of 2026: the CIOMS WG XIV report and industry programs like TransCelerate’s intelligent automation initiatives collectively provide a roadmap. Key themes emerged: patient safety must remain paramount, with AI serving as a tool under human supervision; transparency and accountability guard against “black box” failures; and continuous validation ensures models do not drift or amplify biases. Organizations must thus institutionalize AI governance: assigning clear roles (steering and QA committees, model risk owners), embedding validation into existing GxP quality systems, and nurturing a culture that scrutinizes rather than blindly trusts AI.

The initial focus will be on “low-hanging fruit” processes (e.g. ICSR triage, coding, literature screening) ([42]) ([3]). But in parallel, innovators envision ambitious futures: predictive safety analytics, real-time population monitoring, even “smart” medical devices that report signals. For each advance, the governance net must adjust: for example, a wearable device generating safety alerts would require IoT security protocols and new privacy rules. The key, as emphasized by CIOMS, is that holistic governance evolves with technology. Regulatory authorities and industry must continue collaborating so that guidelines keep pace with innovation.

In conclusion, implementing AI in PV is not an all-or-nothing gamble but a careful ascent. By following the principles and processes outlined by CIOMS and TransCelerate (and supported by academic research), pharmaceutical companies can steadily incorporate AI into their signal detection pipelines. This journey involves not only technical development but also organizational change and ethical vigilance. When done correctly, AI governance transforms signal detection from a reactive, siloed process into a proactive, integrated system – ultimately enhancing the safety net for all medicines and the patients who rely on them.

Key Takeaways: Pharmacovigilance signal detection is poised for an AI-led revolution. Evidence shows AI can dramatically improve detection speed and insight ([6]) ([37]). However, to reap these benefits, sponsors must enact robust governance: risk-based oversight, human involvement, rigorous validation, and ethical data management are non-negotiable ([2]) ([12]). The confluence of TransCelerate’s industry guidance and CIOMS WG XIV’s global framework offers a comprehensive blueprint. By embracing these guidelines and building the requisite organizational structures, the PV community can deploy powerful AI tools without compromising patient safety – truly advancing drug safety in the digital age ([2]) ([42]).

External Sources (64)