QMS for AI/ML Medical Devices: Regulatory & Design Guide

Executive Summary
This report provides a comprehensive analysis of how to build and maintain a robust Quality Management System (QMS) specifically for medical devices incorporating artificial intelligence (AI) and machine learning (ML) technologies. AI/ML-based medical devices (often considered Software as a Medical Device, SaMD) pose unique challenges that require adapting traditional medical device QMS frameworks. We review the historical context of quality management in medical device regulation, identify the current regulatory landscape for AI/ML medical devices, and examine the components of a QMS that must be tailored to AI/ML. Key findings include:
- AI/ML growth in medical devices: The FDA has cleared hundreds of AI/ML-enabled devices (over 878 to date) for marketing ([1]). AI is pervasive in imaging, diagnostics, monitoring, and decision support. However, these systems introduce new risks – for example, recent studies show that ~6.3% of FDA-cleared AI devices had recalls, often due to algorithmic or software issues ([2]) ([3]).
- Regulatory and standards guidance: Traditional medical device QMS requirements (e.g. ISO 13485, FDA’s 21 CFR 820, now the Quality Management System Regulation QMSR) still apply to AI/ML devices, but regulators are issuing additional guidance. The FDA and other agencies have proposed frameworks (e.g. the Predetermined Change Control Plan for updates) and published “Good Machine Learning Practice” principles for AI medical devices ([4]) ([5]). In the EU, the new AI Act explicitly mandates that high-risk AI providers implement a documented QMS (Article 17) ([6]). International efforts (IMDRF, MHRA, Health Canada) stress harmonized approaches to QMS and change control for adaptive AI devices ([4]) ([7]).
- QMS structure and adaptation: At the foundation, manufacturers must maintain a QMS (according to ISO 13485/21 CFR 820) that covers document control, design/development controls, risk management, production controls, CAPA, supplier management, and post-market surveillance. Each of these elements requires AI-specific enhancements. For example, “design control” processes must incorporate data collection and model training as design inputs, and “design outputs” include not only software code but also trained model artifacts and validation results ([8]) ([9]). Risk management (ISO 14971) must explicitly consider AI-driven hazards such as data bias, adversarial inputs, model drift, and incorrect automation. Procedures must be documented for how models are trained, tested, validated, and updated, preserving traceability and reproducibility ([9]) ([10]).
- Data and validation: A critical QMS element is controlling the data lifecycle. High-quality, representative, and well-annotated training data are crucial for safe AI devices. The system must document data sources, preprocessing, labeling protocols, and uses. The QMS should enforce strong verification and validation (V&V) – not just unit and integration tests of software, but independent evaluation of model performance on held-out and real-world data. Recent analyses found that most FDA-cleared AI devices lack comprehensive reporting of their training data, demographics, and trial designs ([11]), underscoring the need for rigorous data management in the QMS. Moreover, continuous monitoring (post-market surveillance) must detect when model performance degrades or when new use-cases arise.
- Change management: A QMS for AI/ML devices must handle frequent and planned changes more flexibly than traditional software. FDA has introduced the concept of Predetermined Change Control Plans (PCCPs) to allow approved pathways for model updates (e.g. model retraining on new data) under pre-specified risk controls ([4]) ([12]). In practice, this means strict version control, change protocols, and documentation for any model update. The QMS must ensure each change is assessed for safety and efficacy before deployment.
- Case studies and data: Real-world evidence illustrates these needs. Several studies highlight problems arising when AI devices lack QMS rigor. One study found that AI-enabled device recalls were disproportionately due to design/software issues, suggesting gaps in development practices ([3]). Another showed that devices without adequate clinical validation experienced more and larger recalls ([2]). These and other analyses reinforce that robust QMS processes (e.g. proactive verification, redundancy, user training) could mitigate such failures.
- Future directions: Regulatory bodies worldwide are responding. In the US, FDA has upgraded 21 CFR 820 to the “Quality Management System Regulation” (QMSR, effective Feb 2026) ([13]), aligning more closely with ISO concepts and emphasizing risk-based approaches. Globally, ISO has released ISO/IEC 42001:2023 for AI management systems ([14]). The EU AI Act and other AI governance efforts will further formalize QMS expectations for AI. Manufacturers must stay agile in updating their QMS, and might also leverage digital QMS tools with AI capabilities to manage the complexity.
In sum, integrating AI/ML into medical devices demands that every aspect of the QMS be re-examined and augmented for the data-driven, adaptive nature of these products. This report provides a detailed roadmap: from understanding regulatory contexts to elaborating on data management, risk control, and validation processes tailored for AI/ML, including case examples and best practices. Each claim is supported by current regulatory documents, scientific studies, and expert analyses to guide device makers, regulators, and healthcare institutions in building or evolving their QMS for AI/ML medical devices.
Introduction
The advent of Artificial Intelligence (AI) and Machine Learning (ML) in healthcare is transforming medical devices across diagnostics, monitoring, and treatment. Traditional medical equipment, such as imaging systems or physiological monitors, is now routinely augmented with AI/ML algorithms that analyze data to assist clinical decisions. Unlike conventional software, AI/ML systems can learn from data and adapt over time (either continuously in-field or via periodic updates). This iterative, data-driven nature creates unique challenges for ensuring device safety and effectiveness.
A Quality Management System (QMS) is the organizational framework that ensures medical devices are designed, produced, and maintained under control, meeting regulatory requirements and performance standards. Established QMS principles (e.g. ISO 13485:2016 and FDA’s 21 CFR 820) emphasize documentation, process controls, risk management, and continuous improvement. These principles apply to all medical devices, including software as a medical device (SaMD). However, AI/ML devices introduce new domains of risk and complexity that traditional QMS processes must explicitly address. For example, a diagnostic AI model’s performance critically depends on the data used to train it; errors in data or drift in clinical practice could lead to patient harm unless proactively managed.
This report examines in depth how to build and operate a QMS tailored for AI/ML medical devices. We first recount the background: the evolution of medical device quality regulation and the emergence of AI in healthcare. We then survey the regulatory and standards landscape, including recent FDA and EU initiatives on AI/ML devices. Next, we dissect QMS components (design controls, risk management, document control, etc.) through the lens of AI/ML, highlighting specific practices for data lifecycle management, algorithm verification/validation, change control, and post-market monitoring. Empirical data and case examples illustrate where gaps have led to device recalls or safety concerns, underscoring the stakes. Finally, we discuss future implications as AI-specific standards and guidance mature, and offer recommendations for integrating AI considerations into quality culture.
This treatise aims to be authoritative and comprehensive, with extensive citations to regulatory documents, scientific literature, and expert guidance. By providing detailed, evidence-based analysis of each aspect of the QMS, this report serves as a guide for medical device manufacturers, regulators, and healthcare leaders to understand and implement quality assurance for AI/ML-enabled products. The ultimate goal is to ensure these innovative devices deliver their intended benefits safely and effectively to patients, within a robust quality framework.
Background and Context
Quality Management Systems in Medical Device Regulation
A Quality Management System (QMS) in the medical device industry is a structured set of policies, processes, and procedures that organizations use to ensure their products consistently meet customer and regulatory requirements. Foundational standards such as ISO 13485:2016 (Medical devices—Quality management systems) explicitly define requirements for such a system in the context of quality assurance, production, and post-market monitoring of medical devices. The FDA’s 21 CFR 820 (Quality System Regulation, or QSR) similarly mandates requirements such as design controls, process controls, and corrective action processes for manufacturers. (Note: as of Feb 2, 2026 the FDA has renamed Part 820 as the “Quality Management System Regulation” or QMSR ([13]) while largely retaining the same elements under a more risk-based approach, aligning with ISO principles.)
The elements of a typical medical device QMS include:
- Management Responsibility: Leadership commitment, quality policy, organizational structure.
- Document and Record Controls: Procedures for creating, approving, updating, and retaining documents (design history files, manufacturing records, etc.).
- Design and Development Controls: Defined procedures for design inputs, outputs, verification, and validation.
- Production and Process Controls: Ensuring manufacturing processes are controlled and validated.
- Purchase/Supplier Controls: Qualified suppliers, incoming quality checks.
- Risk Management: Application of ISO 14971 to identify hazards and implement mitigations.
- CAPA (Corrective and Preventive Action): Identifying and addressing nonconformities and potential issues.
- Internal Audits and Management Review: Ongoing monitoring and improvement of the QMS itself.
This QMS framework has historically ensured that devices—from simple mechanical implants to complex electronics—are safe and effective. It enforces traceability (e.g. material to final product), accountability, and a culture of continuous improvement. Quality records (like design history files and device master records) must be maintained and are frequently audited by regulators.
Evolution of AI/ML in Medical Devices
Artificial Intelligence and Machine Learning refer to software systems that can learn patterns from data and make predictions or decisions. In healthcare, notable examples include image analysis algorithms (for radiology, pathology, or ophthalmology), seizure or arrhythmia detectors from biosignals, predictive models for patient deterioration, and clinical decision support tools. Early adopters like IDx-DR (a diabetic retinopathy screening AI system cleared in 2018) demonstrated that fully autonomous AI devices could attain regulatory approval ([15]). Since then, adoption has accelerated: by mid-2025 the FDA had cleared 878 AI/ML-enabled devices ([1]), spanning fields such as radiology, cardiology, pathology, electrocardiography, and more. These devices leverage data-driven algorithms to perform tasks that traditionally required human interpretation.
However, AI/ML devices differ fundamentally from fixed-function devices:
- Data-Dependence: The performance of an AI system is tied to its training data. Biased or limited data can lead to poor performance in certain populations.
- Adaptive Nature: Some AI systems are designed to learn and improve over time (e.g. by incorporating new patient data), which blurs the line between device and process. Traditional QMS assumed validated software is “locked” during use, but AI challenges that assumption.
- Complexity and Opacity: Many AI models (e.g. deep neural networks) are “black boxes” with limited interpretability. This increases uncertainty about how they make decisions.
- Lifecycle Changes: AI components may require frequent retraining, updates, or calibration, introducing new change-control challenges beyond typical software patches.
These characteristics raise safety and effectiveness issues that a QMS must address explicitly. For example, as one review noted, AI systems must cope with “errors or faulty input” in order to gain clinical trust ([16]). Ensuring robustness to real-world conditions and data shifts is essential. If an AI diagnostic tool performs well initially but degrades as practice patterns evolve, regulators and manufacturers need QMS provisions to catch and manage that drift.
Quality Management for AI/ML: New Considerations
The notion of a “Quality Management System for AI/ML” is gaining attention both in industry and academia. A recent perspective argues that QMS principles should be integrated throughout the AI development life cycle – from initial research through deployment and routine use – to “close the AI translation gap” and facilitate safe, ethical implementation ([17]) ([10]). In practice, this means adapting standard quality processes to cover AI-specific elements:
- Augmented Design Control: Design inputs now include data requirements and algorithm specifications; outputs include trained models and performance metrics.
- Extended Risk Management: In addition to physical harm or software malfunction, risk analysis must consider biases, adversarial attacks, and degradation over time.
- Traceability for Data: Datasets themselves become controlled documents: versioned, vetted, and traceable.
- Validation Beyond Code: Verification must not only check software logic but assess statistical performance (sensitivity, specificity) on validation datasets or clinical trials.
- Change Management: QMS must accommodate model versioning. FDA’s new Predetermined Change Control Plans (PCCPs) require upfront definition of the scope and safeguards for planned model changes ([4]) ([12]).
- Cybersecurity Considerations: AI devices (often cloud-connected or data-intensive) must embed security risk management into the QMS (e.g. following the FDA’s new cybersecurity guidance) ([18]).
- Post-Market Monitoring: The QMS should include continuous monitoring of real-world performance (and adverse events) of AI algorithms, triggering CAPA processes if thresholds fall outside approval claims.
Adopting these AI/QMS practices ensures that new complex threats are managed systematically, maintaining device reliability. As one commentary emphasizes, a QMS “documents processes, procedures, and responsibilities to achieve quality objectives,” and this structure can manage evolving regulatory requirements and ensure adherence to cutting-edge standards over the life cycle of AI/ML medical software ([19]). In short, AI device QMS integrates traditional device quality assurance with data science best practices, bridging technical and clinical domains.
Regulatory Landscape for AI/ML Medical Devices and QMS Requirements
Medical devices with AI/ML components fall under existing regulatory regimes for medical devices, but regulators are actively updating guidance to address AI’s peculiarities. This section summarizes the key frameworks and how they relate to QMS.
United States (FDA)
In the U.S., manufacturers of medical devices (including AI/ML software) must comply with FDA regulations. Traditionally, 21 CFR 820 (the Quality System Regulation, QSR) governed device manufacture; effective February 2, 2026, FDA renamed and overhauled this as the Quality Management System Regulation (QMSR) ([13]). The new QMSR emphasizes a risk-based process approach (similar to ISO 13485) and encourages modern terminology, but largely retains design control, CAPA, and other QMS requirements. Manufacturers certifying to ISO 13485 often align well with QMSR requirements. Importantly, the QMSR and FDA guidance do not exempt AI/ML products; any AI-enabled device still needs a compliant QMS and must fulfill design control and risk management obligations under the new QMSR (21 CFR Part 820) ([8]).
Beyond baseline QMS rules, FDA has issued AI-specific guidance:
- SaMD Guidance and Change Control: In 2019-2021, FDA proposed a framework for modifications to AI/ML-based SaMD, introducing the concept of a Predetermined Change Control Plan (PCCP) ([4]). The idea is that manufacturers pre-specify in their submissions how their AI algorithm may evolve (for example, retraining methods and frequency) along with the risk controls in place. In Oct 2023, FDA (collaborating with Health Canada and MHRA) published Guiding Principles for PCCPs ([4]) ([12]). These principles emphasize that AI devices should have focused, risk-based, evidence-based, and transparent change management across the total product lifecycle ([12]). The principles reinforce FDA’s expectation that PCCPs be integrated into the QMS documentation – e.g. change-control procedures and validation plans for updates.
- Good Machine Learning Practices (GMLP): The FDA and its international partners have published guiding principles on GMLP for medical devices. These articulate best practices (much like “Good Clinical Practice” for trials) for data management, model training, and validation to ensure safe AI device development. GMLP is not legally binding, but FDA encourages adherence to the principles. For example, one principle stresses the need for “reproducibility” (ensuring results are replicable) and thorough documentation of datasets and model versions ([20]). Embedding GMLP concepts into the QMS means formalizing procedures for documenting model development and for continuous verification.
- Clinical Evidence Requirements: The FDA has noted concerns that many AI devices are cleared with limited validation. A 2025 study found that among 691 FDA-cleared AI devices, fewer than half reported key elements of validation (e.g. study design, sample size, demographics) and very few included prospective trial data; only 28% had documented premarket safety assessment in their summaries ([11]). This gap implies that QMS processes should enforce more rigorous clinical/technical evaluation before clearance. FDA reviewers now often scrutinize AI performance statistics and may request additional validation under the QMS’s design verification and validation activities.
- Cybersecurity Guidance: In December 2022, a U.S. law mandated stronger cybersecurity for medical devices. The FDA’s February 2026 final guidance explicitly addresses "Cybersecurity in Medical Devices: QMS Considerations" ([18]). This guidance directs manufacturers to integrate cybersecurity risk management into their QMS (e.g. via procedures for vulnerability management and secure software updates). Because AI/ML systems often use cloud or networked components and handle sensitive data, FDA expects the QMS to document cryptographic protections, network security, and incident response plans alongside traditional quality controls.
- Regulated Device Software Standards: In practice, FDA expects conformance with relevant consensus standards. For software development, IEC 62304 (medical device software lifecycle standard) is expected. IEC 62304:2006/AMD 1:2015 outlines development processes and lifecycle documentation. Similarly, IEC 82304-1:2016 covers general safety of health software products. A reference framework for agile development (AAMI TIR45) is often used to interpret these standards for AI/ML development ([8]). The point is that AI/ML software must still go through a controlled software development lifecycle; agile practices are allowed but must be mapped to regulatory requirements (e.g. user needs defined, design specs established, verification traced) ([9]).
Taken together, the U.S. market demands a QMS where model development and updates are fully documented, risk-managed, and validated. The QMS must integrate new FDA expectations for change management (PCCPs), leverage GMLP, and incorporate cybersecurity controls. Failure to do so can delay approval or result in recalls. Importantly, FDA audits (now using QMSR) will inspect AI/ML device records (design history files, validation reports) similarly to other devices, but with extra attention to systematic training and evaluation evidence.
European Union
In the EU, medical devices must comply with the Medical Device Regulation (MDR 2017/745) (and In Vitro Diagnostic Regulation IVDR). ISO 13485:2016 certification is required for CE marking, so manufacturers of AI/ML devices must run a compliant QMS. MDR places general provisions on QMS (Annex IX details special QMS requirements). Crucially, the forthcoming EU Artificial Intelligence Act (set to apply from August 2, 2026) adds new layers. The AI Act classifies certain medical AI systems as “high-risk AI,” which triggers explicit QMS obligations:
- EU AI Act, Article 17: States that providers of high-risk AI systems “shall put a quality management system in place that ensures compliance with this Regulation.” The QMS must be documented with written policies and procedures, covering regulatory strategy, design controls, development processes, and validation procedures ([6]). In other words, it enshrines in law the need for manufacturers to formalize QMS processes specifically tailored to AI. A MedDeviceOnline analysis notes that Article 17 and related MDR clauses mean that EU regulators will expect AI-based devices to have documented change management, design verification, and risk controls as part of the QMS ([7]).
- MDR Annex IX (EU): The EU MDR’s Annex IX requires manufacturers (or their authorized reps) to maintain a QMS that covers all aspects of design, production, and post-market activities. It has also been clarified that Annex IX quality controls apply explicitly to devices that use AI in their functionality ([21]). This means EU Notified Bodies will audit AI-specific items—such as procedures for dataset management and model validation—as part of the manufacturer’s QMS inspection scope.
- Standards and Guidance: Aligning with the US, health authorities in Europe reference standards like IEC 62304 and ISO 14971 for software risk management. Additionally, frameworks such as the EU’s ALTAI checklist and IEEE initiatives offer guidance on trustworthy AI, and the Assuring Autonomy International Programme (University of York) published AMLAS (Assurance of Machine Learning for Autonomous Systems). AMLAS is not mandatory, but UK health bodies (e.g. via NHS Digital’s DCB0129 clinical safety standard) point to it as best practice in healthcare AI safety. A QMS may adopt AMLAS steps (0–10) within its design control and validation sections to produce a safety case for an AI device ([22]).
- Other Countries: Similarly, jurisdictions like the UK (MHRA), Japan (PMDA), and Canada (Health Canada) are active. The UK’s MHRA requires clinical safety evidence and may reference the DCB0129 standard for ML used within the health system. Health Canada co-authored the PCCP guiding principles with FDA and MHRA. Globally, regulators emphasize adapting existing frameworks (e.g. referencing software standards and ISO 14971) to AI/ML specifics ([23]) ([24]).
Table 1 (below) summarizes key regulations and standards relevant to AI/ML medical device QMS in various regions. It highlights that in all regions, an ISO 13485-aligned QMS is foundational, but AI-specific guidelines (like the EU AI Act and FDA PCCP guidance) must be layered on.
| Region / Regulatory Body | Medical Device QMS Requirement | AI/ML-Specific Guidance | Key Standards / References |
|---|---|---|---|
| USA (FDA) | 21 CFR 820 (QSR), now QMSR (effective 2026) – requires documented design controls, CAPA, etc ([13]). | - FDA SaMD guidance (2019+), Predetermined Change Control Plans (PCCP) guidance (2023) for AI updates ([4]). - Good Machine Learning Practices (Principles) encouraging data and model documentation. - Postmarket surveillance and cybersecurity guidance (2026) for AI devices ([18]). | IEC 62304 (medical software lifecycle) ([8]); ISO 14971 (risk mgmt); IEC 82304-1 (health software safety); AAMI TIR45 (Agile MD software) ([9]). |
| EU (MDR) | EU MDR 2017/745: QMS per Annex IX (ISO 13485-aligned) required. Manufacturers must have documented QMS covering all design and production steps ([7]). | - EU AI Act Article 17: mandates QMS for high-risk AI systems, covering compliance strategy, design/dev controls, validations ([6]). - MDCG guidelines on AI/ML and SaMD under development. - Postmarket & Vigilance rules for software updates. | ISO 13485:2016; IEC 62304; ISO 14971; IEC 82304-1; ISO/IEC 42001:2023 (AI management) ([14]); EN ISO standards for software/MD. |
| UK (MHRA) | UK MDR (similar to EU MDR). ISO 13485 QMS required for CE/UKCA marking. QMS must encompass software design, risk, validation. | - DCB0129 & DCB0160 (NHS clinical risk management standards) applicable to AI/ML in health IT. - MHRA draft guidance on software/hardware (e.g. recognition of AI changes). - Alignment with FDA’s PCCP through MHRA collaboration. | ISO 13485; IEC 62304; ISO 14971; DCB0129 (NHS clinical risk management standard with ML-specific relevance). |
| International (Regulatory Forum) | IMDRF SaMD guidelines, ISO 13485:2016 recommended for all medical device QMS. | - IMDRF Good Machine Learning Principles (in development as of 2024). - WHO AI guidance for ethics (not enforceable, but relevant for 'quality' of AI development). - ISO/IEC 42001:2023 defines an AI Management System standard (guidance for QMS of AI) ([14]). | IMDRF SaMD WG documents; ISO/IEC 42001:2023 (AI management system) ([14]); AAMI/ANSI reports on AI. |
Table 1: Regulatory requirements and standards relevant to AI/ML medical device QMS. (Sources: FDA, EU AI Act, standard documents ([4]) ([6]) ([14])).
As Table 1 shows, all authorities require a QMS (often ISO 13485-based) as the backbone. However, for the AI/ML context, regulators are articulating new recommendations. In the EU, for example, MedDeviceOnline notes that ISO 13485 remains the foundation for quality, but now the EU AI Act and ISO/IEC 42001 set AI-specific expectations ([7]). Any manufacturer of an AI medical device should thus ensure their QMS not only meets the basic regulations, but also incorporates procedures addressing AI aspects (e.g. data governance, model validation protocols, change protocols).
Components of an AI/ML Medical Device QMS
This section details the specific elements of a quality management system that are especially pertinent for AI/ML medical devices. We discuss how each conventional QMS process area must be extended or adapted to handle AI/ML characteristics, with supporting citations.
Quality Policy and Objectives
At the top of the QMS, the quality policy should explicitly recognize the role of AI in the organization’s products. As one expert advises, organizations should define AI-related commitments in their quality policy to guide process development ([25]). For instance, a manufacturer might state a policy to ensure that all AI algorithms are validated on representative clinical data and routinely performance-monitored. Aligning quality objectives (e.g. accuracy, uptime targets) with AI product characteristics helps set measurable targets.
Management must assign responsibilities related to AI quality. This includes deciding who manages data governance, who reviews model audits, and who handles cybersecurity of AI components. (ISO 13485 requires defining roles in the quality manual.) Top management reviews may need to include items like algorithm performance metrics, updates issued, and field data trends.
Document and Data Control
A robust QMS requires strict document control. For AI devices, the scope of controlled documents expands significantly. Standard QMS documents (procedures, work instructions) must be augmented by:
- Data Records and Specifications: Detailed documentation of all datasets used for training, validation, and testing. This includes data origin, inclusion/exclusion criteria, and how data is labeled or annotated. The QMS should control dataset versions (e.g. via a data management plan) and ensure only authorized, validated datasets feed into model training ([11]).
- Model Artifacts: The trained AI models (weights, logic) themselves should be treated as release items. A design history file (DHF) entry could include the model architecture, hyperparameters, and training code. Each model version should be uniquely identified and traceable to the software version and dataset used. This parallels how hardware devices record specification revisions in the device master record.
- Software Code: Source code for the AI algorithm or pipelines must be under version control, with change logs. Standard code management and review procedures (peer review, static analysis) remain vital. Any code used for processing medical data or computing outcomes must be treated as "device software" and meet documentation requirements.
8fold Governance advises that new forms of records may be needed for “transparency and explainability” of AI systems ([26]). For example, manufacturers might generate model cards or datasheets for datasets, summarizing model purpose, training data stats, and performance. These can be QMS records demonstrating what was considered in development and can aid regulators or users in understanding the AI’s basis.
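To make this concrete, below is a minimal sketch (in Python) of how a model card could be captured as a structured, version-controlled QMS record tied to a hashed dataset manifest. The field names, identifiers, and values are illustrative assumptions, not a mandated schema.

```python
# Illustrative sketch only: a model card captured as a structured QMS record.
# Field names, identifiers, and values are hypothetical; adapt to your own
# document-control and design history file (DHF) conventions.
import hashlib
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ModelCard:
    model_id: str
    model_version: str
    intended_use: str
    training_dataset: dict            # name, version, and content hash of the dataset manifest
    performance: dict                 # summary metrics on the held-out test set
    limitations: list = field(default_factory=list)


# A toy dataset manifest; in practice this would enumerate every controlled data file.
manifest_text = "patient_id,image_path,label\nP0001,images/p0001.png,normal\n"
dataset_sha256 = hashlib.sha256(manifest_text.encode()).hexdigest()

card = ModelCard(
    model_id="cxr-triage",            # hypothetical algorithm identifier
    model_version="1.2.0",
    intended_use="Prioritization of adult chest X-rays for radiologist review.",
    training_dataset={"name": "cxr_train", "version": "2024-06", "sha256": dataset_sha256},
    performance={"sensitivity": 0.93, "specificity": 0.88, "auroc": 0.95},
    limitations=["Not validated on pediatric patients", "Portable AP views under-represented"],
)

# Serialized record to attach to the design history file entry for this model version.
print(json.dumps(asdict(card), indent=2))
```

Treating each such record as a controlled document, reviewed, approved, and versioned alongside the model it describes, keeps these transparency artifacts auditable.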
Practically, the QMS should require periodic review of these documents. Given the rapid evolution of AI, one should shorten revision cycles for manuals and procedures related to AI (to avoid outdated guidance) ([27]). For instance, as new AI standards or security practices emerge, the QMS documents must be updated accordingly.
Design and Development Controls
In FDA/ISO terms, design controls cover translating user needs into system requirements, and ultimately into a validated product. For AI/ML medical devices, this requires special attention:
- User Needs & Intended Use: The initial design control inputs must capture not only the clinical function but also the data domain. For example, a chest X-ray AI must make clear which imaging equipment and patient demographics it targets. The intended use statement should specify that AI is involved, and define how its outputs will be used clinically.
- Design Inputs: Along with clinical requirements, include data-related requirements: What types of data should the AI handle? What accuracy, sensitivity, and specificity thresholds are needed? How should the AI behave if encountering out-of-distribution data? Also consider regulatory constraints on the AI (e.g. explainability or security requirements). These should be documented in the design plan.
- Design Outputs: The system architecture should reflect the ML workflow. Outputs include not only software source code but also data preprocessing algorithms, the trained model files, and acceptance test procedures for model performance. All outputs must be reviewed for traceability back to inputs.
Even if development uses agile methods, FDA mandates that core design documentation be finalized and placed under change control before too many sprints occur ([9]). For example, initial system-level requirements and high-level architecture documents should be baselined early, then updated only through formal change control. Lower-level details (e.g. internal model structure) may evolve faster, but the QMS must maintain records linking all changes to higher-level requirements. In practice, companies often maintain a traceability matrix that ties user needs to system requirements, to design specifications, to verification tests – now extended to include AI metrics and data traceability.
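As one way to keep such a matrix honest, the following hypothetical sketch checks that every design input has linked verification evidence; the requirement and test identifiers are invented for the example.

```python
# Illustrative traceability check: each design input must trace to at least one
# executed verification test. Requirement/test IDs are hypothetical examples.
trace_matrix = {
    "SR-01 Sensitivity >= 0.90 on held-out test set": ["VT-05", "VT-06"],
    "SR-02 Flag out-of-distribution inputs": ["VT-07"],
    "SR-03 Inference time < 2 s per study": [],          # gap: no linked test yet
}
executed_tests = {"VT-05", "VT-06", "VT-07"}              # results recorded in the DHF

gaps = [req for req, tests in trace_matrix.items()
        if not any(t in executed_tests for t in tests)]
print("Design inputs lacking verification evidence:", gaps)
```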
The design control process as applied to AI/ML devices has been discussed in literature. For example, Henry and Thiel (2022) describe how FDA design controls (21 CFR 820.30) apply to an “AI/ML lifecycle” and show how traditional inputs/outputs must be expanded for AI systems ([9]). Key points include:
- Even with agile development, initial user needs and high-level requirements should be placed under controlled revision so that changes are deliberate ([28]).
- Iterative sprints can produce intermediate builds and user stories, but each completed increment of code ties back to a design specification in the DHF.
- Design verification for AI includes performance testing on reference data. The “definition of done” for each development sprint may include passing certain unit tests and evaluation metrics ([29]).
- The QMS must capture the entire development pathway, often illustrated by an iterative cycle where model performance is repeatedly measured and improved.
In summary, design controls for AI devices mean tightly integrating software engineering best practices with the QMS. All ML model development must be planned, documented, and verified as part of the design history file, just like any other medical device component.
Risk Management (ISO 14971)
Risk management is central to medical device QMS. ISO 14971:2019 requires identifying hazards, estimating and evaluating associated risks, implementing control measures, and monitoring effectiveness. For AI/ML devices, some unique hazards arise:
- Data Bias and Representativeness: If training data lack diversity (e.g. racial or demographic bias), the model may perform poorly on certain patient groups, leading to misdiagnosis. This hazard (decision bias) should be documented in risk analysis. The QMS risk procedure must include evaluating data sufficiency and demographic coverage. A risk control might be to incorporate a broader training set or to flag referrals on higher-risk populations.
- Algorithmic Errors: Unlike deterministic software bugs, ML models can make unpredictable errors with low-frequency data. For example, an image classification AI might misinterpret rare conditions. Treat these errors as “software fault” hazards and evaluate their severity (e.g. false negative cancer diagnosis could be severe). Implement controls such as redundancy (double-reading by humans) or “unknown alert” outputs.
- Model Drift: Over time, the real-world data distribution may shift (new equipment, patient population changes). This is a hazard of obsolete performance. The risk analysis must consider how changes in environment can degrade the model, and implement monitoring controls (see Post-Market section).
- Adversarial Attack: AI algorithms can be susceptible to deliberate attacks (e.g. subtle image manipulations). This cybersecurity risk must be treated similarly to software hacking. The QMS should ensure threat modeling and incorporate adversarial testing if relevant.
- Opacity/Explainability: A hazard unique to “black-box” AI is that clinicians or operators may not understand why a decision was made, potentially leading to misuse. Labeling and training may serve as risk controls – the QMS should require that user manuals explain AI limitations. The risk of user overtrust can be mitigated by requiring a human-in-the-loop on critical decisions.
The QMS must ensure that risk management is not a one-time checkbox but an ongoing process. As post-market data come in, risk assessments should be revisited. FDA/IMDRF guidance on AI encourages iterative risk evaluation throughout development and prompting users to monitor flagged cases ([30]). If a new potential hazard is discovered (e.g. a drop in sensitivity on a subgroup), the QMS CAPA process should trigger hazard reevaluation under ISO 14971 and possibly label updates or retraining plans.
Strengthening AI risk management in the QMS can draw on emerging guidance. For example, Lopez Clavijo et al. emphasize using structured risk management (AAPM Task Group 100 methodology) to integrate AI tools into clinical risk processes ([31]). The University of York’s AMLAS guidance (aligned with IEC 61508-style safety cases) is a comprehensive procedure that can fit within a QMS risk analysis phase ([32]) ([33]). Incorporating such frameworks ensures that AI-specific risks are systematically identified, documented, and mitigated.
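As a concrete illustration of controlling the data-bias hazard described above, the sketch below computes per-subgroup sensitivity on an evaluation set and flags disparities beyond a pre-specified tolerance. The toy data, grouping variable, and 5-percentage-point threshold are assumptions for illustration only.

```python
# Minimal sketch of a subgroup performance check used as a risk-control
# verification for the data-bias hazard. Data, grouping, and the 0.05
# disparity threshold are illustrative assumptions.
from collections import defaultdict

records = [  # (subgroup, true_label, predicted_label) – toy evaluation data
    ("site_A", 1, 1), ("site_A", 1, 0), ("site_A", 0, 0),
    ("site_B", 1, 1), ("site_B", 1, 1), ("site_B", 0, 1),
]

tp, fn = defaultdict(int), defaultdict(int)
for group, y_true, y_pred in records:
    if y_true == 1:
        if y_pred == 1:
            tp[group] += 1
        else:
            fn[group] += 1

sensitivity = {g: tp[g] / (tp[g] + fn[g]) for g in tp}
if max(sensitivity.values()) - min(sensitivity.values()) > 0.05:
    print("Subgroup sensitivity disparity exceeds threshold -> escalate to risk file:", sensitivity)
```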
Data Lifecycle Management
For AI/ML devices, the data used to develop and evaluate the system are as critical as hardware components. A rigorous QMS must cover the entire data lifecycle. Key aspects include:
- Data Acquisition and Quality: Document where and how training data were obtained (e.g. retrospective clinical image archive, simulation data). The QMS should specify acceptance criteria for data (minimum resolution, annotation quality). Raw data that are incorporated into the training set should be subject to quality checks (accurate labels, completeness) and version control. If synthetic or augmented data are used, document generation methods.
- Data Annotation and Labeling: QMS procedures should define who (role/credentials) annotates data and how consistency is ensured. For instance, pathologists’ labels on histology images should be verified as needed. Any changes (correcting labels) go through a formal data change control process.
- Data Partitioning and Traceability: For verification/validation, the QMS must ensure that training, validation, and test datasets are properly isolated (no data leakage); a minimal splitting sketch is shown after this list. Each dataset partition should be documented and static after splitting. Traceability should link each data sample (or batch) back to its source and label. Some manufacturers use data management tools (MLflow, DVC) which can integrate with a QMS to log these details.
- Privacy and Security: Patient data used in AI/ML must respect privacy regulations (HIPAA/GDPR). The QMS needs procedures for secure data handling and de-identification. Compliance with data governance (consent management, audit logs) should be maintained.
- Data Updates and Continual Learning: If the device design includes retraining on new data (in-field learning), the QMS must control that process. For example, before releasing an updated model trained on new patient data, the QMS should require re-validation. Alternatively, if a device only gets periodic “locked” updates, each update is handled as a new device version in the QMS.
Overall, data in the context of AI become “controlled components” of the device. Just as ISO 14644 for cleanrooms defines cleanliness at all stages, an AI developer’s QMS should track data lifecycle quality: acquisition, annotation, storage, processing, and archival. Inadequate data control can lead to “garbage in, garbage out,” which defeats the purpose of a medical QMS.
Software Development and Implementation
AI/ML devices are implemented as software. Thus, standard device software QMS practices apply, amplified by AI’s needs.
- Software Development Life Cycle: The QMS should enforce use of a structured development lifecycle (e.g. Agile, V-model) documented under IEC 62304 ([8]). As noted, practices like Agile are permissible but must be reconciled with design control requirements ([9]). Each version/release of the algorithm/software should go through verification testing (unit/integration testing) recorded in the QMS.
- Code Management and Security: For custom AI code (typically Python or similar), version control (e.g. Git) is mandatory. The QMS must include procedures for branching, merging, and release of software. Secure coding standards and code review checklists (for vulnerabilities or pretrained library vetting) should be part of the development SOP. Supply chain transparency (e.g. Software Bill of Materials, SBOM) is becoming required by law ([34]), so the QMS should incorporate SBOM generation for all third-party components to mitigate supply-chain risks.
- Model Implementation: Many AI models are deployed as compiled binaries or on hardware accelerators, not just as source code. The QMS must treat model files (e.g. TensorFlow graph, compiled inference engine) as software deliverables. Installation or integration procedures should include model integrity checks (hashing) and configuration management; a minimal integrity-check sketch is shown after this list.
- Hardware/Software Integration: If the AI runs on a device (e.g. an imaging machine), the QMS must ensure compatibility and performance requirements (timing, resource usage) are verified. Interfaces with users, alarms, and logging must meet IEC 62366 usability requirements and, where applicable, IEC 60601 basic safety and essential performance requirements.
Adhering to established software QMS standards ensures that common software defects are controlled. However, QA for AI also includes non-traditional testing: for example, robustness testing with noisy or adversarial inputs, performance on outlier cases, and usability studies to see how clinicians interpret AI outputs. The results of such tests should feed back into the design control and risk management process. In effect, the QMS treats the trained model as “code” that must be rigorously tested in scenarios beyond normal operation.
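As one hedged example of such non-traditional testing, the sketch below perturbs an input with small random noise and checks that a stand-in model's output stays within a pre-specified tolerance; the model, tolerance, and data are placeholders for the device's real inference call and acceptance criteria.

```python
# Minimal sketch of a robustness spot-check: perturb an input with small noise
# and confirm the output stays within a pre-specified tolerance. A real test
# suite would cover many cases and perturbation types; everything here is a
# stand-in for illustration.
import random

def model_score(features):                     # stand-in for the device's inference call
    return sum(f * w for f, w in zip(features, [0.4, 0.3, 0.3]))

random.seed(0)
baseline = [0.62, 0.48, 0.91]                  # toy input
score0 = model_score(baseline)

TOLERANCE = 0.05
failures = 0
for _ in range(100):
    noisy = [f + random.gauss(0, 0.01) for f in baseline]
    if abs(model_score(noisy) - score0) > TOLERANCE:
        failures += 1
print(f"robustness failures: {failures}/100 (record result in the software validation report)")
```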
Verification, Validation, and Testing
A centerpiece of the QMS is the Verification and Validation (V&V) of the device against requirements. For AI/ML devices, this includes both traditional software tests and clinical-performance evaluations:
- Device Verification: Before clinical validation, engineers verify algorithms against software requirements. Unit tests check code correctness, integration tests verify component interactions, and system tests ensure the AI runs correctly in the device environment. Automation (continuous integration) can be leveraged, but each test must be traceable to requirements in the DHF. Verification reports become part of the QMS records just as with any software feature.
- Model Performance Validation: Unique to AI is the need to statistically validate model performance. This is often done using retrospective test datasets (external to training). The QMS should include procedures for defining validation datasets (size, representativeness) and metrics (e.g. ROC curves, confusion matrix); a minimal evaluation sketch against pre-specified acceptance criteria is shown after this list. These validation studies should be planned in the design phase and documented; their results are design verification evidence. For high-risk devices, the validation may require prospective clinical studies (which should be outlined in the QMS’s clinical evaluation plan).
- Usability/Human Factors Testing: If AI outputs are intended for clinicians, the QMS should mandate human factors testing (IEC 62366) to ensure users can understand and apply AI recommendations safely. For example, if an AI flags a CT image finding, a test of clinician interpretation and reliance could be required. QMS documentation should cover user training materials and labeling instructions specific to the AI behavior.
- Robustness and Stress Testing: The QMS should encourage stress testing under unusual conditions: e.g., feeding slightly altered inputs to check for brittle behavior, or simulating sensor noise to see how it affects outputs. Such tests help identify hidden failure modes. Results should be recorded in a “software validation report” or equivalent.
- Explainability and Transparency: Though not strictly mandated for all AI, many regulators and standards bodies emphasize understanding AI decisions. The QMS could include verification of explainability features (e.g. does the model output confidence scores or saliency maps). These become part of the design outputs.
Validation activities should be risk-informed: higher-risk tasks (like false negatives on cancer) demand more rigorous testing (even RCTs) ([11]). In fact, recent analysis showed that only 1.6% of FDA-cleared AI devices had data from randomized trials ([11]). The QMS must push for stronger evidence: for example, requiring a minimum level of retrospective validation before clearance, and robust post-market validation plans.
All V&V results become QMS records (verification test reports, clinical evaluation reports). Any residual issues uncovered require CAPA action (see next section).
Post-Market Surveillance and Maintenance
After an AI device is on the market, its QMS must ensure continued safety and performance. This involves:
- Continuous Monitoring: The manufacturer should collect real-world data on device performance and user feedback. Under post-market surveillance (PMS) regulations, adverse events and malfunctions must be reported. For AI, this means tracking cases where the AI’s output might have contributed to a problem (e.g. a misdiagnosis reported). The QMS should define how field data is gathered (logfiles, registries, user surveys) and how often it is reviewed.
- Model Performance Drift: A key QMS task is monitoring for model drift: subtle degradation if the data distribution changes. For example, a chest X-ray dataset might shift if a hospital upgrades its imaging equipment. The QMS should require periodic re-validation of the model on new data batches; a minimal monitoring sketch is shown after this list. If performance falls below the originally approved threshold, a CAPA is triggered (possibly leading to model retraining and update).
- Software Updates: Normal CAPA-driven updates (bug fixes, non-AI software patches) follow standard QMS change control. But AI-specific updates (model retraining) should follow the Predetermined Change Control Plan if one exists ([4]). The QMS must ensure that any new model version passes verification/validation before replacing the old one in the field. Maintain a log of deployed model versions with date and location to track which patients saw which version.
- User Training and Feedback Loop: The QMS often encompasses activities for user training and documentation updates. For AI, manufacturers should train clinicians on interpreting AI outputs. The QMS can collect clinician feedback (e.g. through training evaluations or error reports) to spot usability issues. Rapid learning from user feedback can then feed back into design improvements or warning communications.
- CAPA for AI Hazards: If surveillance identifies a recurring issue tied to the AI (such as a certain image artifact causing false alarms), the QMS CAPA process must analyze root cause (possibly data-related) and update either the algorithm or the labeling/training. This is similar to bug-fixing but with additional data validation steps.
In essence, post-market actions for AI/ML devices mirror those of other devices but with extra vigilance on data and algorithmic performance. The FDA and other bodies highlight the importance of robust PMS for AI. For instance, Lin et al. (2025) found that only 5.2% of cleared AI devices had any reported serious adverse events, including one reported death; they called for “robust postmarket surveillance” ([35]). A well-designed QMS addresses this by ensuring a feedback loop from real-world use to quality decisions.
Change Management
Change control is a pillar of any QMS. For AI/ML devices, change management is particularly critical and complex:
- Baseline Locking: Before initial commercial release, the device configuration (software and model version) must be locked. The design history file should record the final version that is approved. Any subsequent changes are handled under change control procedures.
- Planned Changes (PCCP): If the manufacturer intends the AI to learn from new data and evolve, a Predetermined Change Control Plan (PCCP) should be pre-approved. The QMS must store this PCCP – essentially, a protocol describing the allowed modifications (e.g. periodic retraining on specified data streams) and the risk mitigation at each step. Future updates following the PCCP (which encompasses what FDA has called the “SaMD Pre-Specification” and the “Algorithm Change Protocol” ([4])) are then treated as normal QMS changes without needing full re-clearance, provided they stay within the plan’s scope.
- Unplanned Changes: If issues arise requiring urgent changes (e.g. a critical vulnerability or a newly discovered bias), the QMS CAPA process will result in an out-of-cycle update. Even in these cases, the changes must be documented, risk-assessed, and regression-tested. For locked (traditional) software, any change typically needs a new submission to regulators. For AI/ML, if no PCCP is in place, significant algorithmic changes might require re-approval. Thus, the QMS must carefully track when changes exceed the PCCP and enforce the regulatory submission.
- Documentation: Every change (code patches, parameter tweaks, retraining events) must be logged in the master record; a minimal change-record sketch is shown after this list. The QMS should ensure that customers or users are notified (via labeling or software update notes) of any change that could affect use. Change control forms within the QMS should capture the reason, verification, validation, and approval of every AI-relevant change.
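As referenced in the Documentation item above, here is a minimal sketch of a structured change record for a model update; every field name and value is a hypothetical placeholder rather than a required format.

```python
# Minimal sketch of a structured change record for a model update, suitable for
# inclusion in the master record / DHF. Field names and values are illustrative
# placeholders, not a mandated format.
change_record = {
    "change_id": "CR-2025-014",
    "device": "cxr-triage",
    "model_version_from": "1.2.0",
    "model_version_to": "1.3.0",
    "reason": "Retraining on additional portable AP views per the approved PCCP",
    "within_pccp_scope": True,                 # if False, a new regulatory submission may be required
    "risk_assessment": "RA-2025-030 (no new hazards; bias risk re-evaluated)",
    "verification_evidence": ["VT-05 rev C", "VT-07 rev B"],
    "validation_evidence": "Validation report VR-2025-011",
    "approved_by": "Quality / Regulatory",
    "deployment_date": "2025-09-15",
}
```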
Regulators are focusing on change control because AI devices theoretically could adapt indefinitely. The QMS must therefore marry agility with safe governance, ensuring that changes only happen through controlled processes. Embedding these controls in the QMS (with evidence of review and approval) is vital for audit and for patient safety.
Case Studies and Evidence
Understanding the practical implications of QMS decisions is aided by examining real examples and data. The following cases and studies illustrate common pitfalls and successes in AI/ML medical device quality management.
Case Study 1: Early Recalls of AI-Enabled Devices – A 2025 JAMA Health Forum study analyzed FDA recall data for AI devices ([2]). Among 950 cleared AI devices from 2015–2023, 60 had one or more recalls (~6.3%). Alarmingly, 43.4% of these recalls occurred within the first 12 months post-clearance – about double the recall rate of all 510(k) devices ([2]). Common causes included diagnostic/measurement errors (often due to the AI component), functionality delays, and physical hazards (e.g. malfunctions secondary to software issues). This early recall rate suggests that initial clinical validation and design verification may have been insufficient. Notably, devices that lacked robust premarket validation (no retrospective or prospective studies) had higher recall rates ([36]). This underscores the QMS imperative to enforce thorough verification: better design verification and clinical evaluation procedures (dictated by QMS design controls and validation activities) might have prevented many of these recalls.
Case Study 2: Inadequate Documentation of Training Data – In a 2025 cross-sectional analysis of 691 FDA AI/ML devices, key elements of performance evaluation were frequently unreported ([11]). For example, over half did not document training sample sizes, and 95.5% failed to report demographic breakdowns. Moreover, only 1.6% had any randomized trial data. Such gaps in documentation are direct targets for QMS improvement: a quality system should mandate complete records of data sources and validation methods. The study authors concluded that this lack of standardized reporting “underscores the need for dedicated regulatory pathways and robust postmarket surveillance” ([35]). A QMS oriented towards data transparency (e.g. requiring a data registry in the design file) would address this deficiency.
Case Study 3: Sleep-Well Baby (SWB) Project – This in-house AI monitoring system tested in a neonatal ICU illustrates a comprehensive approach to QMS integration. SWB was a clinical research project using ML to track newborn breathing. The UMC Utrecht team treated it as a Class IIa device under the European MDR. Their perspective paper describes how they embedded quality control at every step. For instance, they leveraged ISO 15189 (the medical lab QMS standard) as a blueprint for managing AI in the clinical setting ([10]) ([37]). The hospital created roles (ULs and HealtQA experts) to oversee data integrity and model oversight. They maintained data pipelines, performed iterative validation, and defined procedures for integrating SWB outputs into clinical workflows. This “insights learning journey” highlighted that user-level QMS (at healthcare organizations) is also crucial: the paper argues that user responsibilities (like daily checks of model behavior) should be part of the QMS, complementing the manufacturer’s system ([10]). Lessons from SWB reinforce that QMS is not solely a manufacturer function – institutions deploying AI must adopt quality disciplines too.
Case Study 4: International Device Examples – Globally, many manufacturers are incorporating AI into their QMS. For example, the FDA-cleared Caption Guidance ultrasound software uses AI to guide users in acquiring diagnostic-quality cardiac images. The manufacturer’s regulatory filings (not publicly available) likely document extensive validation protocols. Similarly, IDx-DR (the approved diabetic retinopathy screener) underwent a pivotal trial; its company was required to have a documented QMS and a risk plan for patient management if the AI was wrong. Though proprietary, these success stories implicitly followed robust QMS processes: collecting large datasets, cleaning/annotating images, defining software development SOPs, conducting clinical studies, and planning post-market oversight. The manufacturers’ ISO 13485 certificates and FDA approvals attest to their QMS alignment with AI requirements.
These examples show that deficiencies in QMS activities often lead to product failures, whereas thorough QMS practices can support eventual success. Key takeaways include: (1) early and thorough validation is crucial to reduce recalls; (2) complete documentation of data and validation as required by QMS can satisfy regulators and clinicians; and (3) lifecycle QMS (covering both producers and end-users) enhances safe implementation.
Integrating AI into the QMS Framework
Having examined requirements and evidence, we now turn to practical integration of AI/ML considerations into each QMS element. The following analysis connects ISO 13485 (or 21 CFR 820) clauses and typical QMS processes to AI-specific actions, illustrating how companies should adapt their QMS.
Table 2 below outlines major QMS elements and supplemental AI/ML considerations for each.
| QMS Element (ISO 13485 Clause) | Conventional Focus | AI/ML Adaptations and Considerations | Guidance/References |
|---|---|---|---|
| Design Controls (7.3) | Define user needs, design inputs (requirements), design outputs; verify and validate design. | Include data requirements as inputs: Define data types, volumes, sources needed. Algorithm design: Document model architecture, feature set. Outputs: Save trained model artifacts; record performance metrics. Verification: Add ML-specific tests (e.g. validation on hold-out data). | FDA 21 CFR 820.30; Henry & Thiel (2022) ([9]), AAMI TIR45 (agile) |
| Document Control (4.2) | Control documents and records (papers, drawings, software) via versioning, approvals. | Track data & model docs: Control datasets, annotations, and model files with revisions. Model cards & datasheets: Create transparency documents for each algorithm version. Faster updates: Review schedules for QMS docs may be shortened to adapt to AI tech pace ([27]). | ISO 13485:2016 4.2; Overgaard (2023) QMS definition ([19]) |
| Risk Management (7.1) | ISO 14971 hazard analysis, risk evaluation and control. | Include AI hazards: Document risks like bias, drift, adversarial attacks, overfitting. Controls: Design redundancy (e.g. human review), data validation checks, anomaly detection systems. Periodic review: Re-assess risks when software/data change. | ISO 14971:2019; IEC TR 80002-1 guidance ([8]), AMLAS; Chen et al. (2025) recalls ([3]) |
| Design/Dev Planning (7.3.2) | Maintain design plans and development procedures. | AI-specific planning: Include ML lifecycle methods in project plan (data gathering, model training, tuning). Toolchain documentation: List software frameworks, data engineering tools used. GMLP: Adopt good ML practice principles in plans (e.g. data quality steps). | AAMI TIR45; FDA GMLP (whitepaper) |
| Software Verification (7.3.6) | Perform verification of design to ensure outputs meet inputs (requirements). | Performance verification: Beyond code testing, verify model metrics (accuracy, ROC, confusion matrices) meet clinical requirements. Edge cases: Test robustness with noise/outliers. Traceability: Link validation results to design docs. | FDA SaMD guidance; Lee et al. (JAMA HF) recall analysis ([2]) |
| Validation & Clinical Evaluation (7.3.7) | Validate that device meets user needs (intended use environment), including clinical trials if needed. | Clinical validation: Plan and execute studies on end-to-end accuracy and outcomes, not just bench metrics. Real-world testing: Pilot deployments for feedback. User training in validation: Evaluate usability of AI alerts. | IEC 62304; IMDRF SaMD clinical evaluation guidance; Lin et al. (2025) FDA review gaps ([11]) |
| Production and Process Controls (7.5) | Ensure manufacturing processes are controlled and documented. | Data processing controls: If the device retrains in the field, define controls for the data pipeline. Software config control: Detailed build instructions for AI software. Cloud deployment: If using remote models, ensure secure transfer processes are part of production. | 21 CFR 820.70; IEC 62304 artifact control |
| Purchasing & Supplier Control (7.4) | Qualify/integrate suppliers of materials/components. | Third-party AI components: Evaluate off-the-shelf ML libraries or pre-trained models for compliance (e.g. open-source model licensing risk). Data sources: Ensure third-party data sets (if any) meet privacy and quality specs. | FDA 21 CFR 820.50; cybersecurity guidance; Censinet blog on SBOM ([34]) |
| Corrective & Preventive Action (8.5) | Identify and correct nonconformities; take preventive actions. | AI-specific CAPA: If a model fails in use (e.g. systematic misdiagnosis), CAPA triggers model update. Root cause: May involve data re-annotation or algorithm tweaking, documented step-by-step. Preventive: Monitor model drift trends and adjust before failures. | ISO 13485:2016; PCCP concepts (risk-based changes) ([12]) |
| Post-Market Surveillance (8.2.1–8.2.3) | Collect feedback, handle complaints and adverse events, and monitor the product in the market. | Performance monitoring: Systematically gather outcome data; update AI performance dashboards. Complaint handling: Include AI-specific symptom categories (e.g. "algorithm error"). Reporting obligations: Promptly report adverse events that involve algorithm faults. | EU MDR Annex III (post-market surveillance documentation); Overgaard et al. (2023) QMS lifecycle ([17]) |
| Internal Audit & Management Review (8.2.4, 5.6) | Ensure the QMS is functioning via internal audits and management reviews. | AI audit topics: Audit data management practices, model update logs, and compliance with the PCCP. Training: Confirm staff have AI/ML competency. Management review inputs: Include KPIs for AI product performance and quality findings from real-world data. | ISO 13485:2016 §8.2.4, §5.6; FDA QMSR FAQs ([13]) ([19]) |
Table 2: Mapping of traditional QMS elements to AI/ML-specific actions. Manufacturers should augment each process with these considerations to ensure robust quality for AI devices. (Sources: regulatory standards and expert sources ([9]) ([8]) ([19]) ([12]).)
In Table 2, the left column lists standard QMS process areas (clause numbers refer to ISO 13485). The table shows how each area can be expanded for AI/ML devices. For example, under “Risk Management,” manufacturers must add new hazard categories like “algorithmic bias” or “model drift,” and define control measures. Under “Design Controls,” one must include datasets and model versions as part of the design history. The rightmost column references guidance such as FDA rules or standards relevant to each point.
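To make the Design Controls and Document Control adaptations concrete, the short Python sketch below shows one way a trained model could be captured as a controlled design output, tied to the dataset snapshot, training-code commit, and verification metrics it must trace to. This is a minimal illustration only; the class name, fields, and helper function are hypothetical and are not terms defined by ISO 13485 or FDA guidance.

```python
# Illustrative design-history record for a trained model (hypothetical names).
from dataclasses import dataclass
import hashlib
import json


def sha256_of_file(path: str) -> str:
    """Hash a dataset or model snapshot so that any later change is detectable."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


@dataclass
class ModelDesignRecord:
    model_version: str          # controlled like any other design output
    dataset_version: str        # identifier of the frozen training dataset
    dataset_sha256: str         # hash of the dataset snapshot (traceability)
    training_code_commit: str   # VCS commit of the training pipeline
    metrics: dict               # verification results on the held-out test set
    acceptance_criteria: dict   # thresholds derived from design inputs

    def verification_passed(self) -> bool:
        """Check every pre-specified metric against its acceptance threshold."""
        return all(self.metrics.get(name, 0.0) >= threshold
                   for name, threshold in self.acceptance_criteria.items())

    def to_design_history_entry(self) -> str:
        """Serialize for attachment to the design history file or eQMS record."""
        return json.dumps(self.__dict__, indent=2, sort_keys=True)
```

In practice such a record would be generated automatically by the training pipeline so that every released model version carries its own traceable evidence; for example, a record with metrics of {"sensitivity": 0.93} against acceptance criteria of {"sensitivity": 0.90} passes verification and is filed with the corresponding design documentation.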
This mapping should guide companies in performing a gap analysis of their existing QMS. For instance, if an internal audit finds that “data versioning” is not part of document control, that is a gap to address. Many of these AI-specific actions are currently recommended practices (from sources like the FDA’s PCCP guidance or ISO 42001). However, they will increasingly be enforced as regulators become more specific.
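As a rough illustration of how such a gap analysis could be operationalized, the toy sketch below compares AI-specific expectations drawn from Table 2 against what current procedures cover and prints the gaps. The element names and coverage sets are hypothetical placeholders, not a prescribed checklist.

```python
# Hypothetical QMS gap-analysis sketch: expected AI-specific controls (from
# Table 2) versus what the current procedures actually address.
EXPECTED_AI_CONTROLS = {
    "Document Control": {"dataset versioning", "model cards", "annotation protocols"},
    "Design Controls": {"data requirements as design inputs", "model artifacts as outputs"},
    "Risk Management": {"bias", "model drift", "adversarial inputs"},
    "Post-Market Surveillance": {"performance dashboards", "algorithm-error complaint codes"},
}

CURRENT_QMS_COVERAGE = {
    "Document Control": {"model cards"},
    "Design Controls": {"model artifacts as outputs"},
    "Risk Management": {"bias"},
    "Post-Market Surveillance": set(),
}

for element, expected in EXPECTED_AI_CONTROLS.items():
    gaps = expected - CURRENT_QMS_COVERAGE.get(element, set())
    if gaps:
        print(f"{element}: missing {sorted(gaps)}")  # items for the improvement/CAPA plan
```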
AI Explainability, Transparency, and Ethics
While not always a formal QMS requirement, explainability and ethics are crucial for trust. The QMS can foster transparency by enforcing documentation of the AI’s logic and decision boundaries. Labeling requirements should be treated as part of the QMS: e.g. device labeling must include information on the AI’s intended use limits and failure modes. Ethical considerations (e.g. fairness, equity) can be incorporated into risk management and quality objectives. For instance, a QMS preventive action could be conducting an equity audit (testing AI across demographics) and documenting the outcomes.
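A minimal sketch of such an equity audit appears below, assuming a tabular validation set with model scores, ground-truth labels, and a demographic column; the column names, the AUROC metric, and the 0.05 disparity tolerance are illustrative assumptions rather than values from any standard.

```python
# Illustrative equity-audit sketch: per-subgroup performance with a disparity flag.
import pandas as pd
from sklearn.metrics import roc_auc_score


def equity_audit(df: pd.DataFrame, group_col: str, tolerance: float = 0.05) -> pd.DataFrame:
    """Compute AUROC per demographic subgroup and flag large shortfalls."""
    overall = roc_auc_score(df["label"], df["score"])
    rows = []
    for group, sub in df.groupby(group_col):
        if sub["label"].nunique() < 2:  # AUROC is undefined when only one class is present
            continue
        auc = roc_auc_score(sub["label"], sub["score"])
        rows.append({"group": group, "n": len(sub), "auroc": auc,
                     "overall_auroc": overall,
                     "flagged": (overall - auc) > tolerance})
    return pd.DataFrame(rows)


# Example: report = equity_audit(validation_df, group_col="sex")
```

The resulting per-group report, together with any mitigation decisions, could then be filed as objective evidence under risk management or as a preventive-action record.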
Industry groups and regulators emphasize “trustworthy AI.” A QMS approach to ethics might involve an “AI ethics review” step in design control where biases are assessed. While not strictly mandated under medical device QMS standards yet, building this into the QMS (perhaps under “management responsibility” or “design inputs”) is forward-looking.
Tools, Automation, and a Digital QMS
Manufacturers are increasingly using digital QMS platforms (often called eQMS) to manage records, workflows, and audits. Many such platforms now, somewhat ironically, incorporate AI assistants for tasks such as document review or spotting patterns in nonconformities. When building an AI device QMS, leveraging software tools can improve consistency. For example, version control systems for code and data can be linked to the QMS, ensuring traceability. Quality management software can alert when a dataset or software component has been updated, triggering the needed verification workflows.
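A simple sketch of that trigger mechanism is shown below: it hashes tracked artifacts (datasets, model weights) and flags any that differ from the last approved manifest, at which point a re-verification task would be opened in the eQMS. The manifest path, file names, and the trigger_verification_workflow hook are hypothetical stand-ins for a real eQMS integration.

```python
# Illustrative sketch: detect changes to controlled AI artifacts and trigger re-verification.
import hashlib
import json
import pathlib

MANIFEST = pathlib.Path("qms_artifact_manifest.json")  # last approved hashes (hypothetical)


def file_hash(path: pathlib.Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_artifacts(tracked: dict[str, pathlib.Path]) -> list[str]:
    """Return names of artifacts whose content differs from the approved manifest."""
    approved = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    return [name for name, path in tracked.items()
            if approved.get(name) != file_hash(path)]


def trigger_verification_workflow(changed: list[str]) -> None:
    # Placeholder: in practice this would open a change record in the eQMS
    # and block release until the required verification is complete.
    print(f"Re-verification required for: {', '.join(changed)}")


if __name__ == "__main__":
    artifacts = {
        "training_dataset": pathlib.Path("data/train_v3.parquet"),
        "model_weights": pathlib.Path("models/classifier_v2.onnx"),
    }
    changed = changed_artifacts(artifacts)
    if changed:
        trigger_verification_workflow(changed)
```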
The key, however, is not to over-automate without oversight: AI-based tools that assist in QMS activities must themselves be validated if they produce decisions affecting compliance. The QMS should include a "software tools qualification" procedure for any AI used in quality processes.
Data Analysis and Evidence
To reinforce the analysis, here we present aggregated data and findings relevant to QMS for AI/ML devices:
-
AI in Devices is Growing Rapidly. As of mid-2025, the FDA’s public database lists over 878 AI/ML-enabled devices cleared for marketing ([1]). The distribution spans many specialties: radiology and cardiology are the largest categories, but oncology, neurology, and endocrinology (e.g. diabetic retinopathy screening) are well represented. The number of devices cleared each year has climbed steeply, year over year. This adoption underscores the urgent need for QMS frameworks that keep pace with technological integration.
-
Recalls Are Notable for AI Devices. The two recent studies cited earlier provide quantification. One reported that 6.3% of AI devices had been recalled ([2]). Another, spanning 27 years of US recall data, found that AI device recalls often stem from design deficiencies: “device design and software design accounted for 50% of recalls” ([3]), significantly higher than for non-AI devices. These recall data highlight that design and development (under the QMS) are the most critical areas for AI devices, since software issues dominate.
-
Clinical Evidence Gaps. In addition to recall statistics, the findings on evidence are sobering. The JAMA Health Forum study showed that nearly half of AI devices lacked disclosed validation study details ([11]), making it difficult to assess their readiness. It also found that publicly traded companies had more recalls and less clinical evidence than startups. Although company ownership is beyond the QMS’s reach, these gaps underscore the QMS’s role in documenting and, ideally, performing thorough validation.
-
International Guidance Trends. Our review of regulations found a consistent trend: wherever AI/ML devices are in play, quality system requirements are being updated. The FDA’s 2026 cybersecurity-QMS guidance ([18]), the EU AI Act Article 17 ([6]), and industry reports all confirm that QMS provisions for AI must be explicit. For example, the FDA’s QMSR FAQ recognizes the shift: “the revised part 820 is now titled the Quality Management System Regulation (QMSR)” ([13]), acknowledging broader responsibilities than the former “Quality System” regulation.
-
Standards and Framework Maturity. ISO/IEC 42001:2023 was approved in late 2023; it provides auditable requirements for an “AI Management System” analogous to ISO 13485 but focusing on AI governance. This standard includes controls on data quality, model fairness, and lifecycle monitoring. Early adopters (e.g. medical AI startups) are beginning to align with ISO 42001 in parallel with ISO 13485, aiming for double certification. Another upcoming piece is the IMDRF’s Good ML Practice guidance, expected to formalize many best practices. The migration of these high-level frameworks into the QMS documentation will accelerate (e.g. incorporating ISO 42001’s requirements into the quality manual for AI).
Future Directions and Implications
As AI/ML technologies continue to advance, the future of QMS for medical devices will involve even tighter integration of data science with quality assurance. Important trends include:
-
Regulatory Harmonization: Expect further global alignment. FDA, EU, Canada, and others are collaborating on PCCP and GMLP. International bodies (IMDRF, WHO) may produce consensus standards on AI risk management. Manufacturers should design their QMS to satisfy multiple jurisdictions’ requirements, potentially through modular documentation (e.g. one QMS chapter covers EU AI Act, another FDA, etc.).
-
Continuous Learning Systems: Currently, only “locked” AI systems (whose models do not change in the field) are fully approved. In the future, regulators may allow controlled, real-time learning systems. This will require formalizing continuous quality assurance processes: e.g. automated performance checks embedded in the device, with corrective retraining triggers captured in the QMS loop (a minimal monitoring sketch appears after this list).
-
AI-Assisted QMS: Interestingly, AI itself may help manage QMS complexity. Some quality software now uses machine learning to predict nonconformity trends or to recommend audit foci. For instance, anomaly detection algorithms could flag unusual incident reports. In the long term, a mature QMS for AI devices might include AI-driven analytics of quality data to proactively identify risks (a “Quality 4.0” vision).
-
Standards Evolution: Standards will adapt. Already, work is underway on extending ISO/IEC 62304 to better address machine learning, and on creating sector-specific AI standards (e.g. for anesthesia monitoring). Advocacy from manufacturers and trade groups will shape these updates. Component standards like IEC 60601 (safety of medical electrical equipment) may get annexes on AI considerations (some proposals exist).
-
Training and Workforce: A QMS is only as effective as the people implementing it. The rise of AI devices drives demand for “quality engineers” with data science knowledge. Organizations may establish cross-functional teams (data scientists, clinicians, quality engineers) within the QMS structure. Training programs will need to cover both quality principles and AI technical skills.
-
Ethical and Societal Impact: Beyond regulation, society’s concern with AI ethics will influence QMS. We may see formal “AI ethics audits” integrated into quality reviews, checking for bias mitigation, transparency, and accountability. Transparency obligations (e.g. the EU AI Act’s data governance requirements) will lead companies to make parts of their QMS publicly accessible (e.g. publishing conformity assessments).
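As noted under “Continuous Learning Systems” above, the minimal sketch below illustrates what an embedded post-market performance check might look like: a rolling accuracy window compared against a pre-specified acceptance limit, escalating into the QMS rather than silently updating the model. The threshold, window size, and escalation behavior are illustrative assumptions, not values taken from any guidance.

```python
# Illustrative post-market performance monitor with a QMS escalation hook.
from collections import deque


class PerformanceMonitor:
    def __init__(self, threshold: float = 0.85, window: int = 500):
        self.threshold = threshold            # pre-specified acceptance limit (assumed)
        self.outcomes = deque(maxlen=window)  # 1 = model output confirmed correct, 0 = incorrect

    def record_case(self, model_correct: bool) -> None:
        self.outcomes.append(1 if model_correct else 0)
        if len(self.outcomes) == self.outcomes.maxlen:
            accuracy = sum(self.outcomes) / len(self.outcomes)
            if accuracy < self.threshold:
                self.escalate(accuracy)

    def escalate(self, accuracy: float) -> None:
        # Placeholder: open a CAPA and invoke the pre-approved (PCCP-style)
        # retraining and re-verification protocol rather than a silent update.
        print(f"Rolling accuracy {accuracy:.3f} below {self.threshold}; "
              f"opening quality record and pausing automated updates.")
```

Under a PCCP, the retraining protocol and the re-verification steps triggered by such an escalation would themselves be pre-documented change-control procedures.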
In conclusion, building a QMS for AI/ML medical devices is not a one-time task but an evolving discipline. Companies must continuously update their QMS to reflect new knowledge, data, and regulatory expectations. Institutions deploying AI devices must also develop organizational QMS-like safeguards to ensure these tools benefit patients safely ([10]). The future promises both stricter oversight and new tools to help manage complexity – but the core QMS principles of documentation, review, and improvement will remain essential in safeguarding the quality of AI-driven care.
Conclusion
As AI and machine learning become integral to medical technology, the Quality Management System underpinning medical device manufacturing must adapt comprehensively. This report has explored the foundations, adaptations, and evidence for constructing an AI/ML-focused QMS. The key points are:
-
Integration with existing QMS standards: AI devices are still medical devices; thus ISO 13485/21 CFR 820 (QMSR) remain the basis. However, the content of procedures under these standards must explicitly cover data and algorithmic elements (as detailed in Tables 1 and 2).
-
Regulatory alignment: Active engagement from the FDA, EU regulators, and international bodies shows that robust QMS controls (especially on change management and post-market monitoring) are now formal requirements for AI devices; examples include FDA’s new Quality Management System Regulation (Part 820 renamed the QMSR) and the EU AI Act’s QMS mandate ([13]) ([6]). Compliance with emerging standards such as ISO/IEC 42001:2023 further ensures readiness.
-
Evidence-driven quality: Empirical studies of FDA-cleared AI devices reveal gaps in validation and early failures ([2]) ([11]). These underscore the necessity of stringent QMS practices. Manufacturers should treat data curation, model testing, and validation as core components of quality, not afterthoughts.
-
Lifecycle management: AI/ML systems require continuous lifecycle oversight. The QMS must track model versions, manage updates via predetermined change protocols, and enforce ongoing surveillance of performance, ensuring that any drift or novel hazard is caught. The PCCP concept and similar proposals provide a framework for this within regulated change control.
-
Collaboration and transparency: Both manufacturers and healthcare users share responsibility. Vendors’ QMS must be complemented by healthcare organizations’ quality measures (e.g. the ISO 15189-inspired approach for clinical AI) ([10]). Open reporting (of data and adverse events) is critical; one study found that nearly all AI devices lacked demographic transparency in submissions ([11]), a shortfall future QMS policies should rectify.
-
Looking ahead: The QMS for AI/ML medical devices will mature alongside AI itself. We foresee machine learning being used within QMS tools (predictive analytics for quality), and QMS concepts feeding into new AI guidelines and ethical frameworks. As regulators worldwide implement AI-specific rules, a strong QMS will be a competitive advantage, not just a compliance checkbox.
In summary, building a Quality Management System for AI/ML medical devices means extending quality assurance into the data and algorithm realms. It requires cross-disciplinary expertise—combining regulatory knowledge, clinical insight, and data science. This report lays out the landscape and concrete steps, with extensive references to current regulations, studies, and expert guidance, to inform stakeholders in this critical endeavor. By following these principles, manufacturers can better ensure that AI-driven medical devices achieve their promise safely and reliably, now and in the future.
References
- US FDA. Quality Management System Regulation – Frequently Asked Questions. Updated Feb 2, 2026. https://www.fda.gov/medical-devices/quality-system-qs-regulationmedical-device-current-good-manufacturing-practices-cgmp/quality-management-system-regulation-final-rule-amending-quality-system-regulation-frequently-asked ([13]).
- Fordyce A. Under the Hood: Part 820 to be Replaced by QMSR. Greenlight Guru blog. Dec 2025. (citing new requirements in QMSR).
- Margalit P, et al. Early Recalls and Clinical Validation Gaps in Artificial Intelligence–Enabled Medical Devices. JAMA Health Forum 2025;6(8):e253172 ([2]).
- Chen W-P, et al. Regulatory Insights From 27 Years of AI/ML-Enabled Medical Device Recalls in the US. J Med Internet Res. 2025;27:e67552 (published July 11, 2025) ([3]).
- Lin JC, et al. Benefit-Risk Reporting for FDA-Cleared AI-Enabled Medical Devices. JAMA Health Forum 2025;6(9):e253351 ([11]) ([35]).
- Overgaard SH, et al. Implementing quality management systems to close the AI translation gap and facilitate safe, ethical, and effective health AI solutions. npj Digit Med 6, 218 (2023) ([19]) ([17]).
- Bartels RG, et al. A perspective on a quality management system for AI/ML-based clinical decision support in hospital care. Front Digit Health 2022;4:942588 ([10]) ([37]).
- Henry E, Thiel S. Using Existing Regulatory Frameworks to Apply Effective Design Controls to AI/ML in Medical Devices. Regul Rapporteur 2022;56(4):114–118 ([9]) ([5]).
- FDA CDRH News Release. Guiding Principles for Predetermined Change Control Plans for Machine Learning-Enabled Medical Devices. Oct 24, 2023 ([4]) ([12]).
- FDA. Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices. (device listings) https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices ([1]).
- FDA, Health Canada, MHRA. Good Machine Learning Practice for Medical Device Development: Guiding Principles. 2021. Available from the FDA website.
- ISO/IEC. ISO/IEC 42001:2023 – Information technology – Artificial intelligence – Management system. (2023) ([14]).
- FDA. Cybersecurity in Medical Devices: QMS Considerations and Content of Premarket Submissions. Final Guidance Feb 2026 ([18]).
- FDA. Predetermined Change Control Plans for Machine Learning-Enabled Medical Devices: Guiding Principles (Draft, 2019; finalized 2023) ([4]) ([12]).
- 8foldGovernance. Expert Insights: How AI will impact your ISO 13485 Quality Management System. July 18, 2025 ([26]) ([25]) ([27]).
- 8foldGovernance. The Latest Guidance for AI and Machine Learning in Medical Devices (V2). June 17, 2024. (See sections on AMLAS and standards) ([32]).
- MedDeviceOnline. Narayanan S, Mhatre A. Building the AI-Enabled Medical Device QMS for European Compliance. Feb 6, 2026 ([7]) ([21]).
- IEC/ISO standards: IEC 62304:2006/Amd 1:2015 (Medical device software – Software life cycle processes), IEC 82304-1:2016 (Health software – Product safety), ISO 14971:2019 (Application of risk management to medical devices).
- AAMI. TIR45:2012 – Guidance on the use of AGILE practices in the development of medical device software.
- NHS DCB0129, DCB0160. Clinical risk management – NHS code of practice for automated clinical systems. (UK)
- Chen L, et al. Benefit-risk of AI/ML-enabled devices (JAMA Network). (Referenced for safety risk categories)
- Additional regulatory references: FDA QMSR FAQ; IMDRF SaMD Working Group publications; MHRA AI guidance (UK).
(This reference list is illustrative and not exhaustive. URLs for government and standard documents should be consulted directly where needed, as indicated.)
DISCLAIMER
The information contained in this document is provided for educational and informational purposes only. We make no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability, or availability of the information contained herein. Any reliance you place on such information is strictly at your own risk. In no event will IntuitionLabs.ai or its representatives be liable for any loss or damage including without limitation, indirect or consequential loss or damage, or any loss or damage whatsoever arising from the use of information presented in this document. This document may contain content generated with the assistance of artificial intelligence technologies. AI-generated content may contain errors, omissions, or inaccuracies. Readers are advised to independently verify any critical information before acting upon it. All product names, logos, brands, trademarks, and registered trademarks mentioned in this document are the property of their respective owners. All company, product, and service names used in this document are for identification purposes only. Use of these names, logos, trademarks, and brands does not imply endorsement by the respective trademark holders. IntuitionLabs.ai is an AI software development company specializing in helping life-science companies implement and leverage artificial intelligence solutions. Founded in 2023 by Adrien Laurent and based in San Jose, California. This document does not constitute professional or legal advice. For specific guidance related to your business needs, please consult with appropriate qualified professionals.