Introducing Generative AI in Regulatory Affairs

Indegene

/@Indegeneinc

Published: June 29, 2023

Insights

This video provides an in-depth exploration of the potential and practical applications of Generative AI within Regulatory Affairs in the life sciences industry. Hosted by Indegene, the webinar features insights from regulatory leaders at AstraZeneca and Pfizer, alongside Indegene's own experts, who discuss how this rapidly evolving technology can challenge conventional practices and shape the future of work in a highly regulated environment. The discussion establishes a balanced perspective on generative AI, moving beyond the hype to cover its definition, the models behind it (GPTs), and its specific applicability in areas like regulatory intelligence, submissions, and content management.

The panel delves into critical considerations for adopting generative AI, emphasizing the need for an agile approach given the technology's rapid evolution. They highlight the importance of starting small, learning, and being prepared to adapt strategies over time. Key business problems that could benefit from generative AI include document and report generation, health authority query responses, strategy and submission planning, generating procedures, and quality control/validation activities. The speakers stress the concept of "augmented intelligence," viewing AI as an assistant to decision-makers rather than a replacement, and underscore the necessity of building trust and explainability (XAI) in AI outputs, drawing parallels to the adoption of electronic signatures.

Several practical use cases are demonstrated, showcasing generative AI's capability to transform unstructured and structured data into actionable insights and compliant documentation. These include the summarization of clinical trial data from tabular to text format, the automated generation of Informed Consent Forms (ICFs) from protocol documents (including multi-language translation), querying document management systems for specific answers and insights, and the precise extraction of entities like Adverse Events from complex medical texts. The demonstrations highlight the role of prompt engineering and additional coding in achieving accurate and contextually relevant outputs, while also addressing challenges such as ensuring quality, consistency, and data privacy, and mitigating data bias and "hallucinations" in a regulated setting.

The discussion also covers the critical challenges and considerations for successful implementation, including ensuring quality and consistency of outputs, building trust and explainability, managing infrastructure and data security (especially patient data), addressing data bias, and maintaining regulatory compliance. The panelists provide actionable advice on organizational readiness, advocating for awareness sessions, controlled experimentation with small groups, and establishing continuous learning cycles where human feedback refines model accuracy. They also emphasize the need for close collaboration between technology, business, and compliance teams to navigate the complexities of deploying AI in a regulated industry, and the potential for industry-wide data pooling to address data scarcity for certain use cases.

Key Takeaways:

  • Generative AI's Role in Regulatory Affairs: Generative AI is poised to transform Regulatory Affairs by optimizing processes like content authoring, regulatory intelligence, submission planning, and compliance tracking, moving beyond traditional NLP capabilities.
  • Agile Implementation is Crucial: Given the rapid evolution of generative AI technology, organizations must adopt an agile methodology, starting with small-scale experiments, continuously learning, and being prepared to pivot strategies as new capabilities emerge.
  • Focus on Augmented Intelligence: Initially, generative AI should be viewed as an "augmented intelligence" tool, assisting human decision-makers to speed up processes and improve decision quality, rather than fully automating critical regulatory functions.
  • Building Trust and Explainability (XAI): A significant challenge is establishing trust in AI-generated outputs, which requires explainable AI (XAI) capabilities to understand how conclusions are reached, and implementing "trust but verify" principles, especially in regulated environments.
  • Strategic Use Case Selection: Prioritize low-risk, medium-value use cases that improve internal operational efficiencies and have a human-in-the-loop for verification, before deploying outputs directly to regulators. Examples include literature surveillance and entity extraction.
  • Importance of Data Quality and Availability: The effectiveness of generative AI models is directly tied to the quality and volume of data they are trained on; access to large, high-quality datasets is paramount for better outcomes.
  • Prompt Engineering and Customization: Achieving accurate and contextually relevant outputs from generative AI models requires extensive prompt engineering and additional coding, allowing professionals to tailor instructions and leverage specific models for fit-for-purpose applications (a minimal extraction sketch follows this list).
  • Demonstrated Use Cases:
    • Clinical Trial Data Summarization: Converting complex tabular data (e.g., from clinicaltrials.gov) into patient-friendly lay summaries and scientifically accurate physician summaries, with readability scores (Flesch Reading Ease Score) indicating clarity.
    • Informed Consent Form (ICF) Generation: Automating the creation of ICFs from protocol documents, accurately extracting study purpose, patient numbers, and drug administration details, with the ability to translate into multiple languages and use cosine similarity to assess translation accuracy.
    • Document Querying and Insight Extraction: Developing in-house tools to summarize long documents (e.g., regulatory guidance) into concise summaries with clickable keywords, and enabling users to query multiple documents to receive sourced answers with page numbers for explainability.
    • Adverse Event (AE) Entity Extraction: Accurately extracting specific Adverse Events, onset dates, medications, indications, and other medical events from unstructured text, and converting them into structured, tabular data for database integration.
  • Addressing Implementation Concerns: Organizations must proactively address concerns around data privacy and security (especially patient data), infrastructure requirements (e.g., cloud environments, firewalls), data bias, and ensuring outputs comply with regulatory requirements (e.g., GxP, 21 CFR Part 11).
  • Organizational Readiness and Continuous Learning: Fostering organizational readiness through awareness sessions, providing guidelines for use, and enabling controlled experimentation are vital. Implementing continuous learning cycles with human feedback will constantly improve model accuracy and maintain performance.
  • Cross-Functional Collaboration: Close collaboration between technology, business, and compliance teams is essential for successful adoption, as generative AI touches all these areas and requires integrated consideration.
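
The prompt-engineering and entity-extraction points above can be made concrete with a short sketch. The following is a minimal illustration, assuming the OpenAI Python SDK (v1+); the prompt wording, the `extract_adverse_events` helper, the model choice, and the sample narrative are all hypothetical, not the implementation demonstrated in the webinar:

```python
import json
from openai import OpenAI  # assumes the openai v1 SDK is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A structured-extraction prompt: pinning down the output schema pushes
# the model toward machine-readable JSON instead of free text, which is
# what makes downstream database integration possible.
PROMPT = """Extract every Adverse Event from the patient narrative below.
Return only a JSON array; each element must have the keys
"adverse_event", "onset_date", "medication", "indication".
Use null for any field not stated in the text. Do not invent values.

Narrative:
{narrative}
"""

def extract_adverse_events(narrative: str) -> list[dict]:
    response = client.chat.completions.create(
        model="gpt-4",   # illustrative model choice
        temperature=0,   # deterministic output simplifies QC/validation
        messages=[{"role": "user",
                   "content": PROMPT.format(narrative=narrative)}],
    )
    return json.loads(response.choices[0].message.content)

if __name__ == "__main__":
    text = ("Patient started lisinopril for hypertension on 2023-03-01 "
            "and reported dizziness and chest pain from 2023-03-04.")
    for event in extract_adverse_events(text):
        print(event)
```

Note how this fits the "trust but verify" principle: a deterministic, schema-constrained output is straightforward for a human reviewer to check against the source narrative before anything enters a safety database.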

Tools/Resources Mentioned:

  • ChatGPT (GPT-3, GPT-4)
  • Google Bard AI
  • LangChain framework (used for querying document chunks; a retrieval sketch follows this list)
  • ClinicalTrials.gov (data source for demo)
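
The document-querying demo follows the now-standard retrieval pattern: split documents into chunks, embed them, and retrieve the most relevant chunks to ground the model's answer. Below is a minimal sketch using the classic LangChain API; the file name, model, chunk sizes, and query are illustrative, not the webinar's in-house tool:

```python
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# Load a guidance PDF and split it into overlapping chunks small enough
# to fit alongside the question in the model's context window.
docs = PyPDFLoader("clinical_drug_interaction_guidance.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# Embed the chunks and index them for similarity search.
store = FAISS.from_documents(chunks, OpenAIEmbeddings())

# A QA chain that stuffs the top-k retrieved chunks into the prompt and
# returns the source chunks; their page metadata provides the
# answer-plus-page-number explainability described in the demo.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-4", temperature=0),
    chain_type="stuff",
    retriever=store.as_retriever(search_kwargs={"k": 4}),
    return_source_documents=True,
)

result = qa({"query": "What in vitro studies does the guidance recommend?"})
print(result["result"])
for doc in result["source_documents"]:
    print("source page:", doc.metadata.get("page"))
```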

Key Concepts:

  • Generative AI: A type of artificial intelligence capable of generating new content, such as text, images, video, or code, from vast amounts of training data.
  • Large Language Models (LLMs): AI models trained on massive text datasets to understand and generate human-like text, forming the foundation for many generative AI applications.
  • Generative Pre-trained Transformer (GPT): A specific type of LLM architecture that uses transformer networks for processing sequential data, known for its ability to generate coherent and contextually relevant text.
  • Prompt Engineering: The art and science of crafting effective input prompts for generative AI models to guide them toward desired outputs.
  • Hallucinations: Instances where generative AI models produce outputs that are factually incorrect, nonsensical, or not grounded in the training data.
  • Flesch Reading Ease Score: A readability formula that measures the difficulty of written text, with scores indicating how easy or difficult a document is to read (e.g., 60+ for day-to-day English, 80+ for conversational); a computation sketch follows this list.
  • Cosine Similarity: A measure of similarity between two non-zero vectors of an inner product space, often used to determine how similar two documents or texts are; also sketched in code after this list.
  • Explainable AI (XAI): AI systems that can explain their reasoning and decision-making processes in an understandable way to humans, crucial for building trust and ensuring compliance in regulated industries.
  • Augmented Intelligence: An approach to AI that focuses on enhancing human capabilities and decision-making rather than replacing them.
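
The Flesch Reading Ease Score cited in the demos is a fixed formula over average sentence length and average syllables per word: 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words). A minimal sketch with a naive vowel-run syllable counter (production readability tools use dictionary-based syllable counts):

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: each run of consecutive vowels counts as one syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    # Higher scores mean easier text: 60+ reads as plain English,
    # 80+ as conversational.
    return (206.835
            - 1.015 * (len(words) / sentences)
            - 84.6 * (syllables / len(words)))

print(round(flesch_reading_ease(
    "The drug is taken once a day. Most people felt fine."), 1))
```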
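
Cosine similarity reduces to a single line over embedding vectors; in the ICF demo it was used to check how closely the translated document tracks the original. A toy sketch, where the hypothetical vectors stand in for real sentence-embedding output:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # cos(theta) = (a . b) / (|a| * |b|); 1.0 means identical direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 3-dimensional embeddings of the English ICF and its Spanish
# translation; real embedding models produce hundreds of dimensions.
original = np.array([0.12, 0.87, 0.33])
translated = np.array([0.10, 0.85, 0.35])
print(f"cosine similarity: {cosine_similarity(original, translated):.3f}")
```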

Examples/Case Studies:

  • Clinical Trial Data Summarization: Demonstrated by converting a detailed clinicaltrials.gov dataset into both a layperson's summary (with a Flesch Reading Ease Score of 75.6) and a scientific physician's summary; a prompting sketch follows this list.
  • Informed Consent Form (ICF) Generation: Showcased the automated creation of an ICF from a protocol document, including accurate extraction of study details and the ability to generate a Spanish version with high cosine similarity to the original.
  • Document Management System (DMS) Querying: An in-house tool was demonstrated, allowing users to upload and query multiple regulatory documents (e.g., "Clinical Drug Interaction Studies Guidance") to extract specific answers, with sources and page numbers provided for explainability.
  • Adverse Event (AE) Entity Extraction: Illustrated the ability to extract specific Adverse Events (e.g., dizziness, shortness of breath, chest pain), onset dates, medications, and other medical details from unstructured patient reports into a structured, tabular format.
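
The tabular-to-text summarization case can likewise be sketched as a data-to-prompt step plus a readability constraint. Everything below (field names, model, and prompt wording) is hypothetical, assuming the OpenAI Python SDK:

```python
from openai import OpenAI  # assumes the openai v1 SDK is installed

client = OpenAI()

# Toy record mimicking a few clinicaltrials.gov fields; the demo worked
# from a full tabular export.
trial = {
    "condition": "Type 2 Diabetes",
    "intervention": "Drug X, 10 mg oral, once daily",
    "enrollment": 250,
    "primary_outcome": "Change in HbA1c at 24 weeks",
}

fields = "\n".join(f"{key}: {value}" for key, value in trial.items())
prompt = (
    "Rewrite the clinical trial record below as a short lay summary for "
    "patients. Aim for a Flesch Reading Ease score above 70: short "
    "sentences, everyday words, no jargon.\n\n" + fields
)

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model choice
    temperature=0,
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

Swapping the instruction for "a concise scientific summary for physicians" yields the second artifact from the demo, and the output can then be scored with a Flesch implementation like the one sketched under Key Concepts.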